[Wikitech-l] Making local mirror of one of the Wikipedia subdomains.

2009-07-08 Thread Artyom Sokolov
Hello.

I'm not sure if this is the appropriate list for this kind of question.
I'd appreciate it if someone could direct me to the proper one.

I have the task of making a local, fully functional mirror of a Wikipedia
subdomain (articles, images, etc. must be located on the local server).
Currently there are not that many articles, so downloading a dump once a
day may be an option. But there is a problem: how do I synchronize
changes made to the local copy back to Wikipedia? Is there any
piece of software that could help?

I would appreciate any help.

Sincerely,
Artyom.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Making local mirror of one of the Wikipedia subdomains.

2009-07-08 Thread Jens Frank
On Thu, Jul 09, 2009 at 12:01:32AM +0500, Artyom Sokolov wrote:
> 
> I have the task of making a local, fully functional mirror of a Wikipedia
> subdomain (articles, images, etc. must be located on the local server).
> Currently there are not that many articles, so downloading a dump once a
> day may be an option. But there is a problem: how do I synchronize
> changes made to the local copy back to Wikipedia? Is there any
> piece of software that could help?

Don't do that. Synchronizing back is a very difficult task, and you will
find yourself in deep trouble very soon. If you don't do proper
replication conflict resolution, you'll end up with junk either on your
side or on the Wikipedia side. In the latter case, you'll probably get
blocked rather soon; in the former case, your users will get frustrated
because their edits don't get through.

Regards,

jens


Re: [Wikitech-l] Making local mirror of one of the Wikipedia subdomains.

2009-07-08 Thread Chad
On Wed, Jul 8, 2009 at 3:40 PM, Jens Frank wrote:
> On Thu, Jul 09, 2009 at 12:01:32AM +0500, Artyom Sokolov wrote:
>>
>> I have the task of making a local, fully functional mirror of a Wikipedia
>> subdomain (articles, images, etc. must be located on the local server).
>> Currently there are not that many articles, so downloading a dump once a
>> day may be an option. But there is a problem: how do I synchronize
>> changes made to the local copy back to Wikipedia? Is there any
>> piece of software that could help?
>
> Don't do that. Synchronizing back is a very difficult task, and you will
> find yourself in deep trouble very soon. If you don't do proper
> replication conflict resolution, you'll end up with junk either on your
> side or on the Wikipedia side. In the latter case, you'll probably get
> blocked rather soon; in the former case, your users will get frustrated
> because their edits don't get through.
>
> Regards,
>
> jens
>

Using dumps locally for an offline version is pretty easy to set up
(download the right dump, import with either importDump or mwdumper).
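
Roughly, that looks something like this (the dump name, database name and
paths below are only examples; adjust them to the actual subdomain and
your setup):

  # 1. Fetch the current pages-articles dump for the target subdomain.
  wget https://dumps.wikimedia.org/xxwiki/latest/xxwiki-latest-pages-articles.xml.bz2

  # 2a. Import through MediaWiki itself (simpler, but slower).
  bunzip2 -c xxwiki-latest-pages-articles.xml.bz2 | php maintenance/importDump.php

  # 2b. Or convert the dump straight to SQL with mwdumper (faster for big dumps).
  java -jar mwdumper.jar --format=sql:1.5 xxwiki-latest-pages-articles.xml.bz2 \
    | mysql -u wikiuser -p wikidb

  # 3. Rebuild the derived tables afterwards.
  php maintenance/rebuildrecentchanges.php
  php maintenance/rebuildall.php

Keep in mind the XML dumps only contain the wikitext; if the images really
have to live on the local server, they need to be fetched separately.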

Syncing changes back to the live site is a Very Bad Idea, like Jens
said. There is absolutely no supported mechanism to do this (see bugs
2054 and 15468).

-Chad


Re: [Wikitech-l] Making local mirror of one of the Wikipedia subdomains.

2009-07-08 Thread Dmitriy Sintsov
> Don't do that. Synchronizing back is a very difficult task, and you will
> find yourself in deep trouble very soon. If you don't do proper
> replication conflict resolution, you'll end up with junk either on your
> side or on the Wikipedia side. In the latter case, you'll probably get
> blocked rather soon; in the former case, your users will get frustrated
> because their edits don't get through.
>
> Regards,
>
> jens
>
At one wiki site (not a Wikipedia), an improved version of
Special:Ancientpages is used to manually synchronize between localhost
and webhost clone wiki sites. This version of Special:Ancientpages
allows selecting other exportable namespaces besides NS_MAIN and
optionally submits them to Special:Export. By default, the
functionality of Special:Export is very limited (at least it was the
last time I checked).
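
For what it's worth, the stock Special:Export can also be driven over plain
HTTP, which is enough for small manual syncs; a minimal sketch (the host
and page title are placeholders):

  # Export a page as XML via the GET form of Special:Export.
  curl 'https://xx.wikipedia.org/wiki/Special:Export/Some_page' > Some_page.xml

  # Import it into the other wiki (or use Special:Import in the browser).
  php maintenance/importDump.php < Some_page.xml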

A better approach would probably be to introduce a --date option to
maintenance/dumpBackup.php.
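
Until something like that exists, one workaround is to ask the API which
pages changed after a given date and export only those. Just a sketch (the
host, timestamp and limit are placeholders, and continuation beyond 500
changes is ignored):

  # Titles changed since 2009-07-01; rcend is the older bound, newest first.
  curl 'https://xx.wikipedia.org/w/api.php?action=query&list=recentchanges&rcend=2009-07-01T00:00:00Z&rclimit=500&rcprop=title&format=xml' \
    | grep -o 'title="[^"]*"' | sed 's/^title="//;s/"$//' | sort -u > changed.txt

  # Then feed changed.txt to Special:Export and import the result locally.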

The important problem, though, is that during XML import not all hooks
are called, so extension-specific data is not saved/restored correctly
:-( I think it's a major problem that these hooks don't allow extensions
to pass their own XML tags into the dumps.
Dmitriy


Re: [Wikitech-l] Making local mirror of one of the Wikipedia subdomains.

2009-07-09 Thread Chad
On Thu, Jul 9, 2009 at 1:14 AM, Dmitriy Sintsov wrote:
>> Don't do that. Synchronizing back is a very difficult task, and you will
>> find yourself in deep trouble very soon. If you don't do proper
>> replication conflict resolution, you'll end up with junk either on your
>> side or on the Wikipedia side. In the latter case, you'll probably get
>> blocked rather soon; in the former case, your users will get frustrated
>> because their edits don't get through.
>>
>> Regards,
>>
>> jens
>>
> At one wiki site (not a Wikipedia), an improved version of
> Special:Ancientpages is used to manually synchronize between localhost
> and webhost clone wiki sites. This version of Special:Ancientpages
> allows selecting other exportable namespaces besides NS_MAIN and
> optionally submits them to Special:Export. By default, the
> functionality of Special:Export is very limited (at least it was the
> last time I checked).
>
> A better approach would probably be to introduce a --date option to
> maintenance/dumpBackup.php.
>
> The important problem, though, is that during XML import not all hooks
> are called, so extension-specific data is not saved/restored correctly
> :-( I think it's a major problem that these hooks don't allow extensions
> to pass their own XML tags into the dumps.
> Dmitriy
>

Tags like <ref>, which do not appear in core, are dumped. You just have
to have the extension installed for them to do anything after you've
imported the content. If you don't have Cite and ParserFunctions (at a
minimum) installed, don't expect much of enwiki to work after import.

That being said: I've found that having Cite and/or ParserFunctions activated
*while* importing slows down the process (and occasionally causes it to
halt). It's better to import first and then activate the extensions.
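
In practice that just means temporarily commenting the extensions out of
LocalSettings.php for the duration of the import, roughly like this (the
exact require_once lines depend on your LocalSettings.php):

  # In LocalSettings.php, comment out the extension includes first:
  #   require_once( "$IP/extensions/Cite/Cite.php" );
  #   require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );

  # Run the import without them...
  bunzip2 -c enwiki-latest-pages-articles.xml.bz2 | php maintenance/importDump.php

  # ...then put the require_once lines back and rebuild the link tables.
  php maintenance/rebuildall.php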

-Chad
