[Wikitech-l] HTML wikipedia dumps: Could you please provide them, or make public the code for interpreting templates?

2012-09-09 Thread Roberto Flores
Greetings,

I have developed an offline Wikipedia, Wikibooks, Wiktionary, etc. app for
the iPhone, which does a somewhat decent job at interpreting the wiki
markup into HTML.
However, there are too many templates for me to program (not to mention,
it's a moving target).
Without converting these templates, many articles are simply unreadable and
useless.

Could you please provide HTML dumps (I mean, with the templates
pre-processed into HTML, everything else the same as now) every 3 or 4
months?
Or alternatively, could you make the template API available so I could
import it in my program?

Kind regards,
Roberto Flores
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] HTML wikipedia dumps: Could you please provide them, or make public the code for interpreting templates?

2012-09-09 Thread Roberto Flores
Allow me to reply to each point:

(By the way, my offline app is called WikiGear Offline:)
http://itunes.apple.com/us/app/wikigear-offline/id453614487?mt=8

> Templates are dumped just like all other pages are...

Yes, but a dumped template is only a text description of what the template
does. Code must still be written to process each one into HTML.
There are tens of thousands of them, and some I can't even implement
myself (e.g., Wiktionary's conjugation templates).
If they were already pre-processed into HTML inside the articles' contents,
that would solve all of my problems.

> what purpose would the dump serve? you dont want to keep the full dump
> on the device.

I made an indexing program that selects only content articles (namespaces
included) and compresses them all to a reasonable size (e.g., about 7 GB for
the English Wikipedia).
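
As a rough illustration of the selection step described above, here is a
minimal Python sketch that streams a pages-articles style XML dump and keeps
only pages in chosen namespaces. It is not the indexing program from this
message: the function names and the tiny SAMPLE document are made up for the
example, and the compression step is left out.

```python
import io
import xml.etree.ElementTree as ET

def local(tag):
    """Drop the '{namespace-uri}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def iter_pages(xml_stream, keep_namespaces=frozenset({"0"})):
    """Yield (title, wikitext) for pages whose <ns> value is wanted.

    Namespace "0" is the main (article) namespace; "10" is Template.
    """
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        fields = {local(child.tag): child for child in elem}
        ns = fields["ns"].text if "ns" in fields else "0"
        if ns in keep_namespaces:
            text_elem = next(
                (e for e in elem.iter() if local(e.tag) == "text"), None
            )
            wikitext = ""
            if text_elem is not None and text_elem.text:
                wikitext = text_elem.text
            yield fields["title"].text, wikitext
        elem.clear()  # release the finished page so memory stays bounded

# Hypothetical two-page sample, shaped like a MediaWiki XML export.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Apple</title><ns>0</ns>
    <revision><text>An '''apple''' is a fruit.</text></revision></page>
  <page><title>Template:Infobox</title><ns>10</ns>
    <revision><text>{{{1}}}</text></revision></page>
</mediawiki>"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
```

For a real dump, the BytesIO object would be replaced by an open (and
possibly decompressed) dump file; the streaming parse avoids loading the
whole file at once.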

> How would this template API function? What does import mean?

By this I mean a set of functions, written in some programming language, to
which I could send wiki markup containing template calls and from which I
would receive HTML to display.

Wikipedia does this whenever a page is requested, but I don't know the exact
mechanism through which it's performed.
Maybe you just need to make that code publicly available, and I'll try to
make it work with my application somehow.
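
To make the request concrete, here is a deliberately naive Python sketch of
such a function. It is not MediaWiki's parser and is nowhere near complete:
it handles only positional parameters such as {{{1}}}, ignores named
parameters, defaults, and parser functions, and the template definitions in
the example are invented. It mainly shows why a template's dumped text needs
an interpreter before it becomes displayable output.

```python
import re

# An innermost call such as {{name|arg1|arg2}} (no braces inside it).
CALL = re.compile(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}")
# A positional parameter reference such as {{{1}}} in a template body.
PARAM = re.compile(r"\{\{\{(\d+)\}\}\}")

def expand(text, templates, max_passes=20):
    """Expand template calls in text using a {name: body} dictionary."""
    def replace(match):
        name = match.group(1).strip()
        args = match.group(2).split("|")[1:]  # positional arguments only
        body = templates.get(name)
        if body is None:
            return f"[missing template: {name}]"

        def fill(param):
            index = int(param.group(1)) - 1
            return args[index] if index < len(args) else ""

        return PARAM.sub(fill, body)

    # Expand innermost calls first; repeat until nothing changes.
    # max_passes guards against templates that call themselves.
    for _ in range(max_passes):
        expanded = CALL.sub(replace, text)
        if expanded == text:
            break
        text = expanded
    return text

# Hypothetical template definitions, as they might be read from a dump.
TEMPLATES = {
    "greet": "Hello, {{{1}}}!",
    "bold": "'''{{{1}}}'''",
}

result = expand("{{bold|{{greet|World}}}}", TEMPLATES)
```

The output is still wiki markup (here, bold quotes), so a real pipeline would
pass it on to the wiki-markup-to-HTML step the app already has.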


2012/9/9 Jeremy Baron 

> On Sun, Sep 9, 2012 at 6:34 PM, Roberto Flores wrote:
> > I have developed an offline Wikipedia, Wikibooks, Wiktionary, etc. app
> > for the iPhone, which does a somewhat decent job at interpreting the
> > wiki markup into HTML.
> > However, there are too many templates for me to program (not to mention,
> > it's a moving target).
> > Without converting these templates, many articles are simply unreadable
> > and useless.
>
> Templates are dumped just like all other pages are. Have you found
> them in the dumps? which dump are you looking at right now?
>
> > Could you please provide HTML dumps (I mean, with the templates
> > pre-processed into HTML, everything else the same as now) every 3 or 4
> > months?
>
> 3 or 4 month frequency seems unlikely to be useful to many people.
> Otherwise no comment.
>
> > Or alternatively, could you make the template API available so I could
> > import it in my program?
>
> How would this template API function? What does import mean?
>
> -Jeremy
>


Re: [Wikitech-l] HTML wikipedia dumps: Could you please provide them, or make public the code for interpreting templates?

2012-09-09 Thread Roberto Flores
I think there is a slight misunderstanding about what my app is and does:

It is an offline Wikipedia (et al) viewer that contains all content
articles in the dump.
Everything must be contained within the app's code and the processed dump
files downloadable from my own site (gearapps.com)

> Take a look at http://en.wikipedia.org/w/api.php?action=parse...
My app is supposed to be fully offline. It does not make any network
connections, so I can't use the online API.
I need to have either the template-processing code within the app or the
templates pre-processed in the dump.

> Also a 7GB app is something you want
> to CLEARLY state as eating up that much device space/ download
> bandwidth is probably a problem for most users

The files are provided on my own site, so it doesn't add any load to
Wikipedia's servers.
The file sizes are viewable upon trying to download them.
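
For reference, the action=parse request suggested in this exchange could be
built roughly as in the Python sketch below. It only constructs the URL and
makes no network call. It would not help inside a fully offline app; at most
it could serve a build-time step that pre-renders pages before packaging,
which is an assumption and not something proposed in the thread. The function
name and the default title are illustrative.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the thread.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"

def parse_request_url(wikitext, title="Sandbox"):
    """Build an action=parse URL asking the wiki to render wikitext as HTML."""
    params = {
        "action": "parse",
        "format": "json",
        "prop": "text",          # ask only for the rendered HTML
        "contentmodel": "wikitext",
        "title": title,          # page context used while rendering
        "text": wikitext,
    }
    return API_ENDPOINT + "?" + urlencode(params)

url = parse_request_url("{{convert|1|km}}")
```

Large pages would normally be sent as a POST body instead of a query string,
and any bulk use would need to respect the site's API usage policies.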

2012/9/9 John 

> Take a look at http://en.wikipedia.org/w/api.php?action=parse it is
> exactly what you are looking for. Also a 7GB app is something you want
> to CLEARLY state as eating up that much device space/ download
> bandwidth is probably a problem for most users


Re: [Wikitech-l] HTML wikipedia dumps: Could you please provide them, or make public the code for interpreting templates?

2012-09-13 Thread Roberto Flores
In all frankness, I don't see how it can be complicated or mind-blowing to
generate an HTML dump when the software to produce a wiki markup one is
already there.
Wiki markup dumps are of very limited use, and mainly to yourselves.

No need to tell me to go solve my problems myself, it's what I've been
doing all the time.

2012/9/12 Emmanuel Engelhart 

> Dear Roberto
>
> Le 09/09/2012 20:34, Roberto Flores a écrit :
> > I have developed an offline Wikipedia, Wikibooks, Wiktionary, etc. app
> > for the iPhone, which does a somewhat decent job at interpreting the
> > wiki markup into HTML.
>
> Great idea, but why reinventing the wheel concerning the format and not
> using the open and inter-operable ZIM format pushed by the movement:
> * http://www.openzim.org
>
> We already have a few open-source readers and contents:
> * http://www.kiwix.org
> * http://cip.github.com/WikiOnBoard
>
> > However, there are too many templates for me to program (not to mention,
> > it's a moving target).
> > Without converting these templates, many articles are simply unreadable
> > and useless.
> >
> > Could you please provide HTML dumps (I mean, with the templates
> > pre-processed into HTML, everything else the same as now) every 3 or 4
> > months?
> > Or alternatively, could you make the template API available so I could
> > import it in my program?
>
> The way you want to do it cannot really work, for the reasons other
> people have already explained. There are Wikimedians who have been
> involved with such topics for years; why not talk with them before
> starting your project? We have a dedicated mailing list, and you are
> welcome on it:
> https://lists.wikimedia.org/mailman/listinfo/offline-l
>
> Regards
> Emmanuel


Re: [Wikitech-l] [Xmldatadumps-l] HTML wikipedia dumps: Could you please provide them, or make public the code for interpreting templates?

2012-10-03 Thread Roberto Flores
Could we have an HTML dump for X amount of money?
Something like a paid feature.

Include the CSS of course.
Also, leave the  tags as they are, as those have to be processed by
3rd party libraries.

2012/9/17 Pablo N. Mendes 

>
> I also think the HTML dumps would be super useful!
>
> Cheers
> Pablo
> On Sep 17, 2012 8:05 PM, "James L"  wrote:
>
>> I'm all for continuing the HTML wiki dumps that were once done (*2007
>> was the last*?). Why were these discontinued? They would be more useful
>> than the so-called "XML".
>>
>> There is no complete solution to processing dumps, the XML is most
>> certainly not XML in its lowest form, and it IS DEFINITELY a moving target!
>>
>> Regards,