Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Andrew Garrett
On 6/02/10 6:44 AM, Tei wrote:
> On 5 February 2010 20:17, Aryeh Gregor  wrote:
>
>> On Fri, Feb 5, 2010 at 3:57 AM, Daniel Kinzler  wrote:
>>  
>>> Or, to put it differently: let people use "flat tagging", but let's keep
>>> the notion of one tag implying another, i.e. math implying science and
>>> texas implying america.
>>>
>> And as for [[Category:People executed for heresy]] ->  [[Category:Joan
>> of Arc]] ->  [[English claims to the French throne]]?  That's only two
>> steps, and it already doesn't make sense.  You could argue that
>> [[Category:Joan of Arc]] really means [[Category:Stuff related to Joan
>> of Arc]] and shouldn't be in [[Category:People executed for heresy]],
>> but that sounds like it would take as much recategorization work as
>> just using atomic categories -- and much subtler.
>>  
>
> off-topic
>

Not at all, it's entirely reasonable to discuss the problems associated 
with the current categorisation system, and what methods we'd like to 
use to improve it.

-- 
Andrew Garrett
agarr...@wikimedia.org
http://werdn.us


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Version control

2010-02-07 Thread Peter Gervai
On Sun, Feb 7, 2010 at 00:38, Ævar Arnfjörð Bjarmason  wrote:

> It's interesting that the #1 con against Git in that document is "Lots
> of annoying Git/Linux fanboys".

No, it's the "screaming 'hell yeah!' but have no idea what they're
talking about" part. :-)

g



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread David Gerard
On 7 February 2010 08:45, Andrew Garrett  wrote:

> Not at all, it's entirely reasonable to discuss the problems associated
> with the current categorisation system, and what methods we'd like to
> use to improve it.


The current categorization system is per-wiki-specific. It's done
differently in different places. So it's not clear that you won't
require 750 different discussions.

To get back to the topic of category intersections on Commons:

Could the developers please outline, point by point, the precise hoops
we need to jump through to get category intersections on Commons? New
hoops seem to have been introduced during the current discussion.

Please make an unambiguous list of the hoops Commons will be required
to jump through before this feature can happen, so it's actually clear
to all and we're all working from the same page, rather than trying to
guess what shrubbery you'll be demanding next.

Thanks!


- d.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Aryeh Gregor
On Sun, Feb 7, 2010 at 7:01 AM, David Gerard  wrote:
> Could the developers please outline, point by point, the precise hoops
> we need to jump through to get category intersections on Commons? New
> hoops seem to have been introduced during the current discussion.

Right now, I'd try just waiting.  As Daniel pointed out in this
thread, Neil Harris is already being paid to work on it.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Roan Kattouw
2010/2/7 David Gerard :
> Could the developers please outline, point by point, the precise hoops
> we need to jump through to get category intersections on Commons? New
> hoops seem to have been introduced during the current discussion.
>
> Please make an unambiguous list of the hoops Commons will be required
> to jump through before this feature can happen, so it's actually clear
> to all and we're all working from the same page, rather than trying to
> guess what shrubbery you'll be demanding next.
>
Different implementations have different hoops associated with them.
As long as there's no concrete implementation, there's no definitive
list of these hoops, only vague generic hoops that apply to any kind
of category intersection and hypothetical hoops based on hypothetical
implementations.

Like Aryeh said, Neil is currently working on a concrete
implementation of category intersections. Only when that
implementation is complete (or at least close) will it be possible to
provide the definitive, specific and unambiguous list of requirements
you asked for.

Roan Kattouw (Catrope)



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Daniel Schwen
>> Please make an unambiguous list of the hoops Commons will be required
>> to jump through before this feature can happen, so it's actually clear
Way to be acerbic...

> list of these hoops, only vague generic hoops that apply to any kind
[..]
> Like Aryeh said, Neil is currently working on a concrete
> implementation of category intersections. Only when that
> implementation is complete (or at least close) will it be possible to
> provide the definitive, specific and unambiguous list of requirements
> you asked for.
Not really. There are two main points:
1) category deep tree traversal (flattening / deep indexing) at
runtime is technically infeasible.
2) automatic flattening produces nonsense results

OK, let's say Neil found a way to deal with 1). I give you that this
is implementation specific. Number 2) however is independent of any
implementation. Here you have your "hoop" (to stick with your
pejorative lingo): Get rid of the crazy category system and go atomic.
What is vague about this, what part of this is unclear to you?
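
Point 2) can be illustrated with a small sketch. It is not code from
MediaWiki or any existing tool: the `MEMBERS` map and the `flatten`
function are hypothetical names, and the only data is the chain quoted
earlier in the thread. Assuming subcategory membership is followed
transitively, a breadth-first walk looks roughly like this:

```python
from collections import deque

# Hypothetical toy data: category -> direct members (subcategories or pages).
# The chain is the example quoted earlier in this thread.
MEMBERS = {
    "Category:People executed for heresy": ["Category:Joan of Arc"],
    "Category:Joan of Arc": ["English claims to the French throne"],
}


def flatten(root, members):
    """Return every page reachable from root by following subcategories."""
    seen = {root}          # categories already queued, guards against loops
    pages = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in members.get(current, []):
            if child.startswith("Category:"):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
            else:
                pages.add(child)
    return pages


flat = flatten("Category:People executed for heresy", MEMBERS)
print(sorted(flat))
```

Under these assumptions the walk files [[English claims to the French
throne]] under [[Category:People executed for heresy]], which is the
kind of nonsense result point 2) refers to. The `seen` set only guards
against category loops; it does nothing for the runtime cost raised in
point 1).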



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread David Gerard
On 7 February 2010 13:09, Daniel Schwen  wrote:

> OK, let's say Neil found a way to deal with 1). I give you that this
> is implementation specific. Number 2) however is independent of any
> implementation. Here you have your "hoop" (to stick with your
> pejorative lingo): Get rid of the crazy category system and go atomic.
> What is vague about this, what part of this is unclear to you?


The problem is that doing this before the feature that uses it is in
place renders categorisation on Commons even more useless. What this
will mean is that you will be requiring a direct reduction in the
usability of the wiki content before *possibly* implementing a
feature.

In practice, the difference between this and saying "No, never" is
telling people to do work that you know can't happen.

Please leave commons-l in the cc: this time, thanks.


- d.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Roan Kattouw
2010/2/7 David Gerard :
> The problem is that doing this before the feature that uses it is in
> place renders categorisation on Commons even more useless. What this
> will mean is that you will be requiring a direct reduction in the
> usability of the wiki content before *possibly* implementing a
> feature.
>
> In practice, the difference between this and saying "No, never" is
> telling people to do work that you know can't happen.
>
There's no reason why it couldn't be the other way around: an
intersection feature could be written and deployed *first*, *then* the
category trees on Commons would be gradually migrated to the new
system. Issues like nonsense results for automatic flattening could be
mitigated by disabling features or making them less visible.

> Please leave commons-l in the cc: this time, thanks.
>
I did on my earlier reply, and I got a bounce from commons-l-owner
saying my message was rejected because I'm not subscribed to
commons-l. I'm not going to subscribe to that list, so I left the Cc:
commons-l out this time.

Roan Kattouw (Catrope)



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread David Gerard
On 7 February 2010 13:27, Roan Kattouw  wrote:

> There's no reason why it couldn't be the other way around: an
> intersection feature could be written and deployed *first*, *then* the
> category trees on Commons would be gradually migrated to the new
> system. Issues like nonsense results for automatic flattening could be
> mitigated by disabling features or making them less visible.


*Precisely*. This is why the new (and it is new) demand to trash the
present category tree before *possibly* implementing a category
intersection feature is, in practical terms, indistinguishable from
sheer contemptuous obstructionism. Daniel may be terribly offended
that I dare to be acerbic about his expression of contempt, but I find
his expression of contempt rather more offensive.


- d.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Daniel Schwen
> In practice, the difference between this and saying "No, never" is
> telling people to do work that you know can't happen.

Wow, this is rich. We already had this conversation. A reminder:

> Demanding that all six million files be de-categorised before you'll
> even allow a category intersection tool to *possibly* be deployed is
> backward.
I never demanded that. Geez. What I want is for the Commons community
to pledge support for a change of the categorization system. Putting
intersection in the interface before they do is a _waste of time_.
I'm asking for them to show the _tiniest_ sign of support. The
programmers have already bent over backwards (including me with my own
intersection tool)



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Aryeh Gregor
On Sun, Feb 7, 2010 at 8:42 AM, David Gerard  wrote:
> *Precisely*. This is why the new (and it is new) demand to trash the
> present category tree before *possibly* implementing a category
> intersection feature is, in practical terms, indistinguishable from
> sheer contemptuous obstructionism.

Nobody "demanded" this except possibly Daniel Schwen, who has never
even committed anything to SVN outside of WikiMiniAtlas, let alone
representing the opinion of The Developers.  If you are incapable of
distinguishing the personal opinion of a random person on this list
from "demands" by "the developers", maybe you should unsubscribe and
save everyone the trouble.  That way you'll avoid annoying the
developers by posting uninformed and obnoxious things like this, and
avoid confusing non-developers by forwarding them irrelevant or
incomprehensible wikitech-l posts out of context (and this is not the
first time you've done that).

On Sun, Feb 7, 2010 at 9:14 AM, Daniel Schwen  wrote:
> I never demanded that. Geez. What I want is for the Commons community
> to pledge support for a change of the categorization system. Putting
> intersection in the interface before they do is a _waste of time_.
> I'm asking for them to show the _tiniest_ sign of support. The
> programmers have already bent over backwards (including me with my own
> intersection tool)

Since when do we write features only for Commons?  Some wikis already
have atomic categories -- e.g.,
.  It would be a useful
feature to any number of users regardless of what Commons does or does
not do.  In fact, it would be useful to Commons too even without
atomic categorization, just not as useful as it could be.

On the other hand, it's thoroughly unreasonable to expect any wiki to
change how they do things based on technologies that have been talked
about for years and may or may not materialize in the foreseeable
future.  No, stuff on the toolserver that isn't integrated into the
interface (and doesn't have a very nice interface itself) doesn't
count.
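
With atomic categories, intersection itself is conceptually simple. A
minimal sketch, assuming an in-memory map from category name to the set
of files tagged with it (the map, the file titles and the `intersect`
function are invented for illustration and say nothing about how a
MediaWiki implementation would store or query this):

```python
# Hypothetical data: atomic category -> set of file titles tagged with it.
CATEGORY_MEMBERS = {
    "Bridges": {"File:Tower Bridge.jpg", "File:Golden Gate.jpg"},
    "London": {"File:Tower Bridge.jpg", "File:Big Ben.jpg"},
    "Night": {"File:Tower Bridge.jpg", "File:Golden Gate.jpg"},
}


def intersect(categories, members):
    """Return the files that carry every one of the given categories."""
    sets = [members.get(name, set()) for name in categories]
    if not sets:
        return set()
    # Start from the smallest set so each later step has less to compare.
    sets.sort(key=len)
    result = set(sets[0])
    for other in sets[1:]:
        result &= other
    return result


print(sorted(intersect(["Bridges", "London"], CATEGORY_MEMBERS)))
```

The same lookup over a deep category tree would first need the
flattening step that the thread argues about, which is why atomic
categories make the feature more useful.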



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread David Gerard
On 7 February 2010 16:31, Aryeh Gregor  wrote:
> On Sun, Feb 7, 2010 at 8:42 AM, David Gerard  wrote:

>> *Precisely*. This is why the new (and it is new) demand to trash the
>> present category tree before *possibly* implementing a category
>> intersection feature is, in practical terms, indistinguishable from
>> sheer contemptuous obstructionism.

> Nobody "demanded" this except possibly Daniel Schwen, who has never
> even committed anything to SVN outside of WikiMiniAtlas, let alone
> representing the opinion of The Developers.


Thank you for clarifying that.


> If you are incapable of
> distinguishing the personal opinion of a random person on this list
> from "demands" by "the developers", maybe you should unsubscribe and
> save everyone the trouble.  That way you'll avoid annoying the
> developers by posting uninformed and obnoxious things like this,


It would be nice if you'd bothered to make it clear that Daniel's
opinion didn't count. Else your tacit acceptance is the only message
being sent. Thank you for finally doing so.


> and
> avoid confusing non-developers by forwarding them irrelevant or
> incomprehensible wikitech-l posts out of context (and this is not the
> first time you've done that).


Of course I'm going to cc the post to the list for the project the
matter concerns. It's ridiculous not to when a demand is being made of
said project.

Since you're speaking with more authority - what are the actual next
steps to make this a happener?


- d.


Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Aryeh Gregor
On Sun, Feb 7, 2010 at 11:50 AM, David Gerard  wrote:
> It would be nice if you'd bothered to make it clear that Daniel's
> opinion didn't count. Else your tacit acceptance is the only message
> being sent.

*Nothing* said here is a "message being sent" to anyone other than
developers and sysadmins.  This is not a medium for official dev
announcements to the larger world.  If something actually needs to be
announced to the projects, it will be announced to the projects, such
as by global site notices or such.  Taking it upon yourself to
broadcast what's said here to the larger world is not useful, because
what's said here is not targeted to the larger world.

If you were actually a developer or followed MediaWiki development
closely, you would understand perfectly well that Daniel Schwen's
opinion doesn't count for anything in this particular case.  He is not
in a position to say "we won't do X", because he can't stop anyone
else from doing it (and nor can most developers).  CCing it to third
parties as though it were an authoritative statement by The Developers
is confusing and irresponsible.

> Of course I'm going to cc the post to the list for the project the
> matter concerns. It's ridiculous not to when a demand is being made of
> said project.

No demand *was* being made of the project, not by anyone who had any
say.  (No offense to Daniel -- if I said the same thing, my opinion
wouldn't have any weight either.)  If you don't understand that, then
you don't know what you're reading here, and you should stop acting
like you do.

> Since you're speaking with more authority - what are the actual next
> steps to make this a happener?

The same steps as for every other possible feature that anyone would
want.  Someone needs to implement the feature, commit it or get it
committed, and address any objections that arise to avoid having it
reverted or disabled.  This includes jumping through any hoops that
come up along the way -- and *nobody* knows what those are in advance.
It looks like Neil Harris will be doing this over the next few
months, with any luck, for this particular feature.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Daniel Schwen
> say.  (No offense to Daniel -- if I said the same thing, my opinion
None taken, especially since these "demands" are purely a product of
David's fantasy in the first place.

My intent was to point out the convoluted situation in the commons
case. Multiple requests have been made for intersection on commons.
But the existing category system is unsuitable for any technical
solution. It is far from my mind to dictate what the developers should
do with their time, and what not. I was offering an opinion. If that
is not welcome, then make this a closed/moderated list or just kick me
out.



Re: [Wikitech-l] Flattening a wikimedia category

2010-02-07 Thread Andrew Garrett
I think this thread is more heated than it needs to be.

Perhaps we could keep the on-list discussion on-topic, and perhaps 
appropriate disclaimers and more judicious phrasing could be used to 
prevent people from getting the wrong idea. I also think that clarifying 
misunderstandings can be done without calling people ignorant (or 
implying it in the way a response is phrased).

-- 
Andrew Garrett
agarr...@wikimedia.org
http://werdn.us




Re: [Wikitech-l] importing enwiki into local database

2010-02-07 Thread Eric Sun
I stripped out the <redirect />'s and imported enwiki using xml2sql,
but none of the templates rendered correctly--for example, navigating
to /The_Matrix results in a page with lots of mediawiki source like

{{#if: |This {{#ifeq:||article|page}} is about . }}For {{#if:the
series|the series|other uses}}, see {{#if:The Matrix (franchise)|The
Matrix (franchise){{#ifeq:the setting|and| and {{#if:Matrix (fictional
universe)|Matrix (fictional

Any ideas if this is a known problem with xml2sql, or did something
get corrupted during my import?
I haven't yet tried importDump.php because it seems to be extremely
slow (can only import a few pages per second)

Eric
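
For reference, the stripping step described above can be sketched in
Python. This is an illustration only, not the script from the xml2sql
talk page: it assumes the element in question is the empty
`<redirect />` tag sitting on a line of its own, and the function names
and file paths are placeholders.

```python
import bz2
import re
import sys

# Assumption: the tag appears alone on its line, e.g. "    <redirect />".
REDIRECT_LINE = re.compile(r"^\s*<redirect\s*/>\s*$")


def strip_redirects(lines):
    """Yield every line except those holding only an empty redirect tag."""
    for line in lines:
        if not REDIRECT_LINE.match(line):
            yield line


def main(src_path, dst_path):
    # Stream the compressed dump line by line so it never sits in memory.
    with bz2.open(src_path, "rt", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(strip_redirects(src))


if __name__ == "__main__":
    # Hypothetical usage: python strip_redirects.py dump.xml.bz2 dump.xml
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

Article text inside the dump is XML-escaped, so a literal `<` at the
start of a tag should only occur in real markup; even so, treat the
line-based match as a shortcut rather than a proper XML transformation.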

On Fri, Feb 5, 2010 at 1:13 AM, Andrew Krizhanovsky
 wrote:
> Yes, it was safe in my case (import of Russian and English Wiktionary).
> See http://meta.wikimedia.org/wiki/Talk:Xml2sql
> and example of script or shell command to strip out the <redirect />
>
> -- Andrew.
>
> On Fri, Feb 5, 2010 at 6:38 AM, Eric Sun  wrote:
>> Would it be safe to strip out the <redirect /> tags from the xml and
>> reimport, or will that cause other problems?
>>
>> Thanks,
>> Eric
>>
>> On Thu, Feb 4, 2010 at 6:24 PM, Chad  wrote:
>>
>>> On Thu, Feb 4, 2010 at 9:12 PM, Eric Sun  wrote:
>>> > Hi,
>>> >
>>> > I saw this thread back in October where someone was having trouble
>>> > importing the English Wikipedia XML dump:
>>> > http://lists.wikimedia.org/pipermail/wikitech-l/2009-October/045594.html
>>> > The thread back in October seemed to end without resolution, and the
>>> > tools still seem to be broken, so has anyone found a solution in the
>>> > meantime?
>>> >
>>> > I'm using mediawiki-1.15.1 and attempting to import
>>> > enwiki-20100130-pages-articles.xml.bz2.
>>> >
>>> > None of these options seem to work:
>>> > 1) importDump.php
>>> > fails by spewing "Warning: xml_parse(): Unable to call handler in_()
>>> > in ./includes/Import.php on line 437" repeatedly
>>> >
>>> > 2) xml2sql (http://meta.wikimedia.org/wiki/Xml2sql):
>>> > Fails with error:
>>> > xml2sql: parsing aborted at line 33 pos 16.
>>> > due to the new <redirect /> tag introduced in the new dumps?
>>> >
>>> > 3) mwdumper (http://www.mediawiki.org/wiki/MWDumper):
>>> > Current XML is schema v0.4, but the documentation says that it's for 0.3
>>> >
>>> > 4) mwimport (http://meta.wikimedia.org/wiki/Data_dumps/mwimport):
>>> > Fails immediately:
>>> > siteinfo: untested generator 'MediaWiki 1.16alpha-wmf', expect trouble ahead
>>> > page: expected closing tag in line 35
>>> >
>>> > Any tips?
>>> > Thanks!
>>> > Eric
>>> >
>>>
>>> Most of these errors are caused by the new(ish) <redirect /> tag
>>> within <page> elements. 0.4 is the correct version of the schema,
>>> but unfortunately the schema was updated and dumps were
>>> produced using them before the changes made it into a release.
>>>
>>> 1.15.1 cannot import pages with <redirect />, we should probably
>>> backport that. That, and we should rewrite the importers to not barf
>>> terribly when they encounter an unknown element.
>>>
>>> -Chad
>>>
>
>
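
Chad's suggestion that importers should not fail on an unknown element
can be sketched as follows. This is a hypothetical reader built on
Python's standard `xml.etree.ElementTree`, not any of the importers
named in this thread; `KNOWN_PAGE_FIELDS`, `iter_pages` and the sample
document are made up for the example, and `<redirect />` stands in for
whatever new element a future schema might add.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical subset of <page> children this reader understands.
KNOWN_PAGE_FIELDS = {"title", "id"}


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield one dict per <page>, silently skipping unknown child elements."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        page = {}
        for child in elem:
            name = local_name(child.tag)
            if name in KNOWN_PAGE_FIELDS:
                page[name] = child.text
            # Anything else is ignored instead of aborting the import.
        yield page
        elem.clear()  # release the page's children once it is handled


SAMPLE = b"""<mediawiki>
  <page>
    <title>Example</title>
    <id>1</id>
    <redirect />
    <revision><id>7</id><text>hello</text></revision>
  </page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
print(pages)
```

A real importer would also read the revision data; the point here is
only that an allow-list of known elements lets a newer dump load on an
older reader.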
