Re: [Wikisource-l] About texts without supporting files and Index: pages
On Wed, Jun 12, 2013 at 4:47 PM, Aarti K. Dwivedi ellydwivedi2...@gmail.com wrote:

> If I am not wrong, as of today, most books that were born digital are still under copyright. Of course, they are available freely on the internet, but we can't use the pirated copies. How would we go about the procurement of these books? If we procure these copyrighted books, then the only thing we would have to do is to check for proper formatting. Isn't it?

You are thinking of *books*, which are not the only documents Wikisource can host. For example, I am thinking about Open Access literature, which counts hundreds of thousands of CC-BY licensed articles. Just look in DOAJ: http://www.doaj.org/

One of the Wikimedians most involved in Open Access - Wiki collaboration is Daniel Mietchen (cc'ed). He's working on a bot that could grab the XML/HTML of an online article, format it in wikicode, and post it wherever he wants (maybe the Wikisources). The bot aims to automatically download all images within the articles and post them on Commons. I personally think that this project is beyond awesome, IF we manage to solve some specific issues (such as converting hyperlinks to other articles into wikilinks to those articles posted on Wikisource...).

As I said before, I see Wikisource as a broad, international, connected, hypertextual digital library, which has a thing no other digital library in the world has: a dedicated community[*]. This is my personal opinion; I know some people don't see it that way (like Alex :-D)

Aubrey

[*] there is Project Gutenberg, but I would argue they are not a digital library...

___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
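
The hyperlink issue raised in the message above (turning links to other articles into wikilinks when those articles are already hosted on Wikisource) might look roughly like the following. This is only a minimal sketch and not Daniel's bot: the `KNOWN_PAGES` mapping, the example URLs and the function name are all hypothetical, and a regular expression is a deliberately naive stand-in for a real HTML parser.

```python
import re

# Hypothetical mapping from article URLs to Wikisource page titles.
KNOWN_PAGES = {
    "http://example.org/articles/123": "Example article on open access",
}

# Naive pattern for simple anchors; real article HTML would need a proper parser.
ANCHOR_RE = re.compile(r'<a\s+href="([^"]+)"\s*>(.*?)</a>', re.IGNORECASE | re.DOTALL)


def links_to_wikicode(html, known_pages):
    """Rewrite <a href="..."> anchors as wikilinks or external links."""

    def replace(match):
        url, label = match.group(1), match.group(2).strip()
        title = known_pages.get(url)
        if title is not None:
            # Target is hosted locally: use an internal link.
            return f"[[{title}]]" if label == title else f"[[{title}|{label}]]"
        # Target is not hosted locally: keep it as an external link.
        return f"[{url} {label}]"

    return ANCHOR_RE.sub(replace, html)


sample = (
    'See <a href="http://example.org/articles/123">this study</a> '
    'and <a href="http://example.org/other">that one</a>.'
)
print(links_to_wikicode(sample, KNOWN_PAGES))
```

The hard part in practice would probably be building and maintaining the URL-to-title mapping, which this sketch simply assumes exists.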
Re: [Wikisource-l] About texts without supporting files and Index: pages
On Wed, Jun 12, 2013 at 1:32 PM, billinghurst billinghu...@gmail.com wrote:

> If you are talking about how we represent digitally prepared text within the validation process, I would have no issue with the text being ripped and having a bot run through and take it straight to level 4 (green), and then redefining green to say "validated, or digitally prepared text not requiring validation". At the same time, if someone proposes and generates a fifth colour to represent "digitally prepared text not requiring proofreading", then I will be happy with that. It may make someone happier in being a truer representation, but in the end to me it is a moot point.
>
> In the end, each of those is a local community decision, though one that should be made in consideration of how the other wikis interpret their processes.

Thanks for clarifying this. I agree with you, and would welcome both solutions. But a lot of wikisourcerors don't think this way, so better to discuss :-)

Aubrey
Re: [Wikisource-l] About texts without supporting files and Index: pages
On Wed, Jun 12, 2013 at 2:32 PM, Thibaut Horel thibaut.ho...@gmail.com wrote:

> 3. The current system with 4 quality levels to represent the proofreading state of a page is not sufficient to represent the diversity of proofreading scenarios. Indeed, there is a distinction to make between the *correctness* of the text and its *formatting*. In the case of a scanned edition which has been OCRed, we do need several passes before reaching a satisfying level of confidence about the correctness of the text as well as a suitable formatting (proper use of the wikicode, etc.). For digital-born documents however, as billinghurst said, we can automatically assume that the extracted text is correct, but that still doesn't mean that the text is correctly formatted and ready to be transcluded in the main namespace. Maybe we should add another level meaning "text is correct, still needs formatting"? Ideally, we should have two scales of quality levels: one dealing with the correctness of the text, and one dealing with its formatting. This would probably be too heavy and confusing though...

I couldn't agree more. I think this could also be an opportunity to make tasks *smaller* and *clearer* (in the direction of microtasks: small, well-defined and simple contributions in crowdsourcing projects, e.g. GalaxyZoo, reCAPTCHA).

We could define some tasks as
* corrected the page
* proofread the text
* formatted the page
* validated the formatting
* OPTIONAL added optional templates/links/annotations
* ...

We could even have qualifiers (all/part of the page, ...).

Is this idea crazy, or somewhat doable?

Aubrey
Re: [Wikisource-l] About texts without supporting files and Index: pages
I think everything is doable; the problem is how to do it without cluttering the interface, while keeping things simple. Some levels might be redundant, and we could take the chance to consider whether they are really necessary.

Some proposed changes:
- Proofread page levels: Unused, Proofread, Proofread with format, Validated (the Unused level would cover pages with no text, pages with raw OCR text, and pages with irrelevant content).
- All pages would be created at the start with the extracted OCR text at the Unused level, so that search engines could finally find our texts even if work on them has not started yet.
- A checkbox list to tag pages: damaged scan, missing scan, contains media (image, score, etc.).
- Color codes: as now, plus orange for Proofread with format. Tags would affect the color too: "damaged" would make the color half purple and half the color of the corresponding proofread level; "contains media" could add a (black?) square around the page number.
- Proofread book levels should be set automatically to the lowest page level, plus two options: one to mark the book as ready to export, and another to mark it as a digital source, which would bring all pages to the Proofread level.

For the metadata interface, I keep thinking about it, and my impression is that we should start working from Template:Book [1] until we have a version that can be used across Commons, Index pages, and books without supporting scans (in this last case it could be the same header template with an option to expand it to show the whole Template:Book). That template might also need some coloring/reorganizing to reflect the Work/Edition distinction that Wikidata is bringing [2]. And if it is possible to read/write Wikidata with Lua, then a possible migration towards a Wikidata-powered Wikisource shouldn't be that far away.

Cheers,
Micru

[1] http://commons.wikimedia.org/wiki/Template:Book
[2] http://www.wikidata.org/wiki/Wikidata:Books_task_force

--
Etiamsi omnes, ego non
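
The rule proposed in the message above, that the book level follows the lowest page level and that a "digital source" flag lifts every page to at least Proofread, can be sketched as follows. This is only an illustration of the proposal: the numeric ordering of the levels, the function name and the `digital_source` parameter are assumptions, not part of the Proofread Page extension.

```python
from enum import IntEnum


class PageLevel(IntEnum):
    # Level names follow the proposal; the numeric order is an assumption.
    UNUSED = 0
    PROOFREAD = 1
    PROOFREAD_WITH_FORMAT = 2
    VALIDATED = 3


def book_level(page_levels, digital_source=False):
    """Return the book level as the lowest level among its pages."""
    if not page_levels:
        # Assumption: a book with no pages yet counts as Unused.
        return PageLevel.UNUSED
    if digital_source:
        # Born-digital text is assumed correct, so no page is below Proofread.
        page_levels = [max(level, PageLevel.PROOFREAD) for level in page_levels]
    return min(page_levels)


pages = [PageLevel.VALIDATED, PageLevel.UNUSED, PageLevel.PROOFREAD_WITH_FORMAT]
print(book_level(pages).name)
print(book_level(pages, digital_source=True).name)
```

Keeping the levels ordered makes "lowest page level" a plain `min`, but it also bakes in the assumption that formatting always comes after text correctness, which is exactly the point debated earlier in the thread.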
Re: [Wikisource-l] About texts without supporting files and Index: pages
If I am not wrong, as of today, most books that were born digital are still under copyright. Of course, they are available freely on the internet, but we can't use the pirated copies. How would we go about the procurement of these books? If we procure these copyrighted books, then the only thing we would have to do is to check for proper formatting. Isn't it?

On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson l...@aronsson.se wrote:

> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>
>> We could define some tasks as
>> * corrected the page
>> * OPTIONAL added optional templates/links/annotations
>> * ...
>
> Geotagged all the photos, ... The list doesn't end. You need a generic mechanism for any new feature you can invent. But aren't our existing templates and categories the best way to do this? You could just add to each page:
>
> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>
> --
> Lars Aronsson (l...@aronsson.se)
> Project Runeberg - free Nordic literature - http://runeberg.org/

--
Aarti K. Dwivedi
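
To make Lars's suggestion concrete, here is a rough sketch of how a bot or gadget could read and write such a per-page template. The `{{done|...}}` template is only his proposal and does not necessarily exist on any wiki; the function names are hypothetical, and the parsing assumes simple `task=user` parameters with no nested templates.

```python
import re

# Matches a {{done|...}} call without nested templates (an assumption).
DONE_RE = re.compile(r"\{\{\s*done\s*\|([^{}]*)\}\}", re.IGNORECASE)


def parse_done(wikitext):
    """Return {task: user} from the first {{done|...}} call, or {} if absent."""
    match = DONE_RE.search(wikitext)
    if match is None:
        return {}
    tasks = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            task, user = part.split("=", 1)
            tasks[task.strip()] = user.strip()
    return tasks


def format_done(tasks):
    """Serialize {task: user} back into a {{done|...}} call."""
    params = "|".join(f"{task}={user}" for task, user in tasks.items())
    return "{{done|" + params + "}}"


page = "Some page text {{done|proofread=user1|validated=user2|geotagged=user4}}"
tasks = parse_done(page)
tasks["formatted"] = "user3"  # a new kind of task needs no schema change
print(format_done(tasks))
```

The last two lines show the appeal of the generic mechanism: adding a new task type is just another named parameter.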
Re: [Wikisource-l] About texts without supporting files and Index: pages
When we tried to convert a PDF file, which is an opaque, closed format, into wiki code (a necessary step for adding links and turning files into wiki hypertext), the work turned into a nightmare. If we simply load free PDF books as they are, I don't see any advantage other than feeding Wikisource numbers/statistics, and this is presently far from my personal interest.

As you can guess, I'm one of the users who don't share Aubrey's enthusiasm about born-digital texts, even if free. :-)

Alex

2013/6/12 David Cuenca dacu...@gmail.com

> Nobody is saying anything about using copyrighted works; there are many books with an open license that would allow us to include them in Wikisource. For instance, in ca-ws we have this translation from 2009:
> http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%282009%29.djvu
> The original is in the PD, and the translator gave away his rights. It would have been much easier to work directly with the PDF instead of converting to DjVu.
>
> Micru
Re: [Wikisource-l] About texts without supporting files and Index: pages
Sorry if my answer is off-topic, but if metadata are stored in Wikidata, is it really necessary to create Index pages to store the same data? As I see things, we'll have bibliographical metadata on Wikidata (title, author, date of publication...) and data related to proofreading (proofreading level, table of contents...) on the Index: pages. Moreover, as the Proofread Page extension considers that an Index page is about a scan (i.e. one or more files), I'm not sure that Index pages about books without scans will be managed well by the extension.

{{header|index name}} is already done, for books with scans, by the Proofread Page extension with the header=1 feature. On fr Wikisource, we already use a Lua module to manage the MediaWiki:Proofreadpage_header_template template used by the header=1 feature:
https://fr.wikisource.org/wiki/Module:Header_template
This template automatically outputs metadata and navigation from the Index page TOC (but it also allows the data to be overridden).

Tpt

Date: Tue, 11 Jun 2013 01:33:39 +0200
From: alex.bro...@gmail.com
To: wikisource-l@lists.wikimedia.org
Subject: Re: [Wikisource-l] About texts without supporting files and Index: pages

> I'm going to test what you describe in a real Lua script; as you know, Lua can read the code of any page with only one expensive server function, so a simple {{header|index name}} template call in ns0 could read all the wiki code from the Index page, parse it, extract all its data, and use it to build any HTML you like. No other field is needed.
>
> On it.wikisource we are testing something more complex, since we are exporting Index data into a local Lua data module, to be loaded with the mw.loadData function, which is not listed as server-expensive; but I presume that the wiki servers would not be overloaded by one expensive call.
>
> If I'm not wrong, such a script could be written tomorrow by a good Lua programmer; as a beginner, I'll need some more time. I'll test a MediaWiki:Proofreadpage_index_template Lua loader/parser working in ns0, just to see if everything runs as I guess, then I'll report in this thread.
>
> Which Wikisource project do you usually work on?
>
> Alex
>
> 2013/6/11 David Cuenca dacu...@gmail.com
>
>> No, it won't be stored in Wikisource, but there is still the need to present the information in a consistent manner. If you want to display the information on ns0, you will end up needing the same fields that the Index: page is using now. So why not have the same solution for both? It could also be a template with a reduced set of fields that expands to show Template:Book with linked data from Wikidata, whether or not the work has a supporting scan.
>>
>> Micru
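
The approach described in the message above (read the raw wiki code of an Index page, parse out its fields, and build a header from them) can be illustrated with a short sketch. It is written in Python rather than Scribunto Lua, purely to show the parsing step; the field names, the sample page and both function names are assumptions, since each Wikisource defines its own Index template fields.

```python
def parse_index_fields(wikitext):
    """Extract |Name=value fields from an Index page's template call.

    Assumes one field per line starting with "|"; other lines are treated
    as continuations of the previous field's value.
    """
    fields = {}
    current = None
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith("|") and "=" in stripped:
            name, value = stripped[1:].split("=", 1)
            current = name.strip()
            fields[current] = value.strip()
        elif stripped == "}}":
            current = None
        elif current is not None and not stripped.startswith("{{"):
            fields[current] += "\n" + line
    return fields


def header_line(fields):
    """Build a one-line header from hypothetical Title/Author/Year fields."""
    return f"{fields.get('Title', '?')}, {fields.get('Author', '?')} ({fields.get('Year', '?')})"


# Hypothetical Index page source; real field names differ between wikis.
SAMPLE_INDEX = """{{:MediaWiki:Proofreadpage_index_template
|Title=An example book
|Author=Jane Doe
|Year=1900
|Remarks=First line
second line
}}"""

print(header_line(parse_index_fields(SAMPLE_INDEX)))
```

In a real module the page source would come from the wiki itself rather than a string literal, and the result would more likely be cached in a data module, as the it.wikisource experiment mentioned above suggests.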
Re: [Wikisource-l] About texts without supporting files and Index: pages
@aarti: sometimes books/texts/documents are born digital. Think about all the scientific literature, or PhD theses. These files (if CC-BY/SA licensed) could be stored in Wikisource and be useful for the wiki community. We already have some means to link those texts to their source (with a URL).

Whether we must or must not allow documents without scans on Wikisource is a long-standing controversy. Every community should decide for itself. My personal POV (also as a librarian) is that if we leave out born-digital documents we are forgetting the bulk of the material. I think that one of the most important added values of Wikisource is integrating texts with other Wikimedia projects, and (wiki)linking and connecting them to each other. No other digital library on the Internet does that, and we can do it because we have a community.

So, these texts will have a source. I do think that proofreading a born-digital PDF is a waste of time.

Aubrey

On Tue, Jun 11, 2013 at 8:46 AM, Aarti K. Dwivedi ellydwivedi2...@gmail.com wrote:

> A slightly off-topic question: Even if we modify the extension to proofread books which do not have scans (I am assuming books that were born digital), against what will these books be proofread?
Re: [Wikisource-l] About texts without supporting files and Index: pages
On Tue, Jun 11, 2013 at 8:41 AM, Thomas PT thoma...@hotmail.fr wrote:

> Sorry if my answer is off-topic, but if metadata are stored in Wikidata, is it really necessary to create Index pages to store the same data? As I see things, we'll have bibliographical metadata on Wikidata (title, author, date of publication...) and data related to proofreading (proofreading level, table of contents...) on the Index: pages. Moreover, as the Proofread Page extension considers that an Index page is about a scan (i.e. one or more files), I'm not sure that Index pages about books without scans will be managed well by the extension.

I think that this is a matter of usability and user experience. If we use Index pages, we'll let users *stay on Wikisource* the whole time, while the complexity and data workflow would be hidden from them. It's a *bad* thing to ask newbies to navigate through Wikisource (entry), then Commons (file upload), then Wikisource (create Index page), then Wikidata (fetch data), then Wikisource again (start working on the book), just to work on a single book. For me this is one of the main obstacles for beginners, and we should try to make things easier for people, IMHO.

Aubrey
Re: [Wikisource-l] About texts without supporting files and Index: pages
You're right, Aubrey; nevertheless, when a user-friendly interface is promoted, the result is that data and wiki code are extremely difficult to use as a clean database. Just think of wiki markup and the simple trick of marking bold and italic text with apostrophes: very user-friendly, but something of a nightmare for the poor programmer who needs to find an algorithm to determine which apostrophes are text and which are code. Even the server can't resolve apostrophe concatenation. Would it have been less user-friendly to use something like <b>...</b>? Yes; but how much cleaner the raw wiki text would be!

Distributed Proofreaders uses a completely different approach: there is a rigid set of increasing permissions for users, and inexperienced users can do only simple tasks. This is far from the wiki mentality, but we can't expect to keep things too easy.

Alex
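
The apostrophe problem mentioned in the message above can be made concrete with a small sketch. It classifies each run of apostrophes by its length only, following what I understand to be the first step of MediaWiki's quote handling (two for italic, three for bold, five for both, with surplus apostrophes treated as text). As far as I know, the real parser then applies further per-line balancing heuristics, which this sketch does not attempt; the function names are hypothetical.

```python
import re


def interpret_run(length):
    """Return (literal_apostrophes, markup) for a run of `length` apostrophes."""
    if length == 1:
        return 1, None
    if length == 2:
        return 0, "italic"
    if length == 3:
        return 0, "bold"
    if length == 4:
        # The first apostrophe is taken as text, the other three as bold.
        return 1, "bold"
    # Five or more: the last five toggle bold+italic, the rest is text.
    return length - 5, "bold+italic"


def apostrophe_runs(wikitext):
    """List (run, literal_count, markup) for every apostrophe run in the text."""
    return [
        (m.group(0), *interpret_run(len(m.group(0))))
        for m in re.finditer(r"'+", wikitext)
    ]


# An elided article followed by an italic word: l' + ''amore''
for run in apostrophe_runs("l'''amore''"):
    print(run)
```

Looked at run by run, the opening `'''` reads as bold even though the writer meant an apostrophe followed by italic, so the length alone cannot tell text from markup; that is the ambiguity a cleaner markup such as `<b>...</b>` would avoid.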
Re: [Wikisource-l] About texts without supporting files and Index: pages
On Tue, 11 Jun 2013 12:16:54 +0530, Aarti K. Dwivedi ellydwivedi2...@gmail.com wrote:

> A slightly off-topic question: Even if we modify the extension to proofread books which do not have scans (I am assuming books that were born digital), against what will these books be proofread?

I am not sure why we are looking to proofread a digital-only file, unless of course it never had a text layer and had to be OCR'd. Proofreading surely only relates to scanned images, where there has been the need to proofread.

Regards,
Billinghurst
[Wikisource-l] About texts without supporting files and Index: pages
With the deployment of Wikidata it is a good moment to re-examine what Index pages are and what their function should be. The most direct transition to a Wikidata-supported Wikisource could be something like this:
https://sites.google.com/site/dacuetu/BookData.pdf

That would allow:
- sharing book data between Commons, Wikisource and Wikipedia
- updating it everywhere when any of the sites has been updated
- better search functions (like searches by author or topic, limited by date range or language)

That would only apply to those texts which use an Index: page, so now the question is: what do we do with books that do not have supporting scans (and therefore no Index page)?

Some possible options:
a) ignore pages without sources and focus only on works with supporting scans
b) use ns0 pages also as data containers (instead of, or in addition to, Index pages)
c) create Index: pages for all works, with or without scans, and use that instead of Template:Textinfo

Personally I prefer option c, even if it would require renaming Index: to Source: to make clearer what those pages are; however, I would like to hear the opinion of other wikisourcerors about this.

Cheers,
Micru
Re: [Wikisource-l] About texts without supporting files and Index: pages
@Alex: but what do you think of storing the source information in Index: pages for all works stored in Wikisource, even if they don't have a supporting scan? That was the original question :)

About your proposed library, it would be more useful if it could modify data in Wikidata, not only import it. Besides, if the Wikidata client is installed in Wikisource, the inclusion syntax already takes care of displaying data...

Micru

On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo alex.bro...@gmail.com wrote:

> I don't see the need to deeply change the Index/ns0 relationship, though I appreciate the idea of promoting coherence by reducing redundancy (many years ago I painfully used dBase III and dBase IV, and I learned that principle by trial and error).
>
> Here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming is a brief message about the relationship among Wikidata, Commons, Wikisource and any other project. No need to follow the link; it's so short that I copy it here (but if you like it, comment on it there):
>
>> Scribunto-Lua and Wikidata
>> I'd like a library to get Wikidata content; it would be a good idea IMHO to access Wikidata data in plain form, just as if such data were Lua tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)
>
> If such a Lua library could be built, importing data from Wikidata would be as simple as writing a template, and the data would be self-aligned.
>
> Alex
>
> 2013/6/10 Aarti K. Dwivedi ellydwivedi2...@gmail.com
>
>> Hi,
>> There was a thread some time ago where there were talks of having books which were born digital. These pages wouldn't have scans. What the 'Index' page would have in these cases is something I am not very sure about.
>>
>> Cheers,
>> Rtdwivedi

--
Etiamsi omnes, ego non
Re: [Wikisource-l] About texts without supporting files and Index: pages
There is simply no need to store data twice or more if they are dynamically imported from Wikidata. Such data would simply be generated by a normal template. It is similar to Commons media sharing: most Wikipedians, apart from beginners, know that when you want to edit a shared media file, you must make your edit in Commons; there's no need to host a media file locally. So, IMHO, a good Lua Wikidata-reading library could avoid storing data in Wikisource, Wikipedia, or Commons at all.

Alex

___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
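The "edit once, display everywhere" idea in Alex's message can be sketched roughly as follows. This is only an illustration in Python, not Scribunto Lua and not a real Wikidata API: the `CENTRAL_STORE` dictionary, the item ID `Q-example-1`, the field names, and `render_header` are all invented for the example.

```python
# Hypothetical stand-in for Wikidata: one central record per work.
# The item ID and the field names are invented for illustration.
CENTRAL_STORE = {
    "Q-example-1": {
        "title": "An Example Book",
        "author": "A. N. Author",
        "year": 1901,
    },
}


def render_header(item_id, store=CENTRAL_STORE):
    """Build header text from the central record; nothing is copied locally."""
    record = store[item_id]
    return f"{record['title']} by {record['author']} ({record['year']})"


# Two projects render the same item; neither keeps its own copy of the data.
wikisource_view = render_header("Q-example-1")
commons_view = render_header("Q-example-1")

# A correction made once in the central store is picked up by every
# later rendering, much like editing a shared media file on Commons.
CENTRAL_STORE["Q-example-1"]["year"] = 1902
corrected_view = render_header("Q-example-1")
```

The design point is that the local page holds only the identifier; every displayed value is looked up at render time.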
Re: [Wikisource-l] About texts without supporting files and Index: pages
No, it won't be stored in Wikisource, but there is still the need to present the information in a consistent manner. If you want to display the information on ns0, you will end up needing the same fields that the Index: page is using now. So why not have the same solution for both? It could also be a template with a reduced set of fields that expands to show Template:Book with linked data from Wikidata, whether or not the work has a supporting scan.

Micru

-- Etiamsi omnes, ego non
___ Wikisource-l mailing list Wikisource-l@lists.wikimedia.org
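Micru's suggestion of one reduced-field template that behaves the same with or without a scan might look something like the sketch below. It is written in Python purely for illustration; `BOOK_DATA`, `book_header`, the item ID and the field names are assumptions made up here, and the real Template:Book fields may differ.

```python
# Hypothetical central data keyed by item ID (standing in for Wikidata).
BOOK_DATA = {
    "Q-example-1": {
        "title": "An Example Book",
        "author": "A. N. Author",
        "year": "1901",
    },
}


def book_header(item_id, scan=None, **overrides):
    """Return the fields a Template:Book-like display would need.

    item_id   -- key into the central data (hypothetical)
    scan      -- name of the supporting scan file, or None if there is none
    overrides -- optional local values that replace the central ones
    """
    fields = dict(BOOK_DATA.get(item_id, {}))  # copy; central data stays untouched
    fields.update(overrides)
    fields["source"] = scan if scan else "no supporting scan"
    return fields


# The same call shape serves works with and without a supporting scan.
with_scan = book_header("Q-example-1", scan="Example Book.djvu")
without_scan = book_header("Q-example-1")
```

Both results carry the same set of fields, so ns0 pages and Index: pages could share one presentation; only the `source` value differs.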
Re: [Wikisource-l] About texts without supporting files and Index: pages
I'm going to test what you are describing in a real Lua script. As you know, Lua can read the code of any page with just one expensive server function, so a simple {{header|index name}} template call in ns0 could read all the wiki code from the Index page, parse it, extract all its data content, and use it to build any HTML you like. No other field is needed. On it.wikisource we are testing something more complex: we are exporting Index data into a local Lua data module, to be loaded with a mw.loadData function that is not listed as server-expensive; but I presume that the wiki servers would not be overloaded by *one* server-expensive call.

If I'm not mistaken, such a script could be written by tomorrow by a good Lua programmer; as a beginner, I'll need some more time. I'll test a Lua loader and parser for MediaWiki:Proofreadpage_index_template working in ns0, just to see whether everything runs as I expect, and then I'll report back in this thread. Which Wikisource project do you usually work on?

Alex
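The read, parse and build steps described for the {{header|index name}} call can be sketched as below. This is a Python illustration rather than the Scribunto Lua script under discussion: the sample wikitext, its field names, and the helpers `parse_index` and `header_html` are assumptions, and the parser is deliberately naive.

```python
import re

# Invented sample of what an Index: page's wikitext might look like;
# real Index pages use project-specific field names.
INDEX_WIKITEXT = """{{:MediaWiki:Proofreadpage_index_template
|Title=An Example Book
|Author=A. N. Author
|Year=1901
|Source=djvu
}}"""

# One "|name=value" pair per line.
FIELD = re.compile(r"^\|\s*([^=|]+?)\s*=\s*(.*?)\s*$", re.MULTILINE)


def parse_index(wikitext):
    """Extract |name=value pairs into a dict.

    Deliberately naive: nested templates and multi-line values are not handled.
    """
    return {name: value for name, value in FIELD.findall(wikitext)}


def header_html(wikitext):
    """Build a small HTML header from the parsed fields."""
    data = parse_index(wikitext)
    return '<div class="header">{Title}, {Author} ({Year})</div>'.format(**data)


index_data = parse_index(INDEX_WIKITEXT)
html = header_html(INDEX_WIKITEXT)
```

In this sketch the ns0 page needs nothing but the name of the Index page; every displayed field comes from parsing that page's own wikitext.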