Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-13 Thread Andrea Zanni
On Wed, Jun 12, 2013 at 4:47 PM, Aarti K. Dwivedi  wrote:

> If I am not wrong, as of today, most books that were born digital, are
> still under copyright. Of course, they are available freely on the
> internet. But we can't use the pirated copies. How would we go about the
> procurement of these books?
> If we procure these copyrighted books, then the only thing we would have to
> do is to check for proper formatting. Isn't it?
>

You are thinking of *books*, which are not the only documents Wikisource
can host.
For example, I am thinking about Open Access literature, which counts
hundreds of thousands of CC-BY licensed articles.
Just look in DOAJ: http://www.doaj.org/

One of the wikimedians most involved in Open Access - Wiki collaboration is
Daniel Mietchen (cc'ed).
He's working on a bot that could grab the XML/HTML of an online article,
format it in wikicode, and post it wherever he wants (maybe on the Wikisources).
The bot aims to download all images within the articles automatically
and post them on Commons.

I personally think that this project is beyond awesome,
IF we manage to solve some specific issues (such as converting
hyperlinks to other articles into wikilinks to those articles posted on
Wikisource...)
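
To make the hyperlink issue concrete, here is a minimal sketch in Python. It is
not Daniel's bot: the URL-to-title mapping, the example URLs and the function
name are all hypothetical, and the regular expression only handles the simplest
form of an HTML anchor.

```python
import re

# Hypothetical mapping from article URLs to the titles of the pages
# where those articles have already been posted on Wikisource.
URL_TO_TITLE = {
    "http://example.org/articles/1": "Article One",
}

# Matches only simple anchors of the form <a href="URL">label</a>.
ANCHOR_RE = re.compile(r'<a\s+href="([^"]+)"\s*>(.*?)</a>',
                       re.IGNORECASE | re.DOTALL)


def links_to_wikicode(html, url_to_title):
    """Rewrite simple HTML anchors as wikilinks or external links."""
    def replace(match):
        url, label = match.group(1), match.group(2)
        title = url_to_title.get(url)
        if title is None:
            # Target is not on Wikisource: keep it as an external link.
            return f"[{url} {label}]"
        if title == label:
            return f"[[{title}]]"
        return f"[[{title}|{label}]]"

    return ANCHOR_RE.sub(replace, html)
```

A real converter would need a proper HTML parser and a reliable way to build
the mapping, which is exactly the hard part.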

As I said before, I see Wikisource as a broad, international, connected,
hypertextual digital library,
which has a thing no other digital library in the world has: a dedicated
community[*].

It is my personal opinion, I know some people don't see it that way (like
Alex :-D)


Aubrey

[*] there is Project Gutenberg, but I would argue they are not a digital
library...
___
Wikisource-l mailing list
Wikisource-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikisource-l


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Alex Brollo
When we tried to convert a pdf file, which is an opaque, closed format, into
wiki code (a needed step to add links and to turn files into a "wiki
hypertext"), the work turned into a nightmare. If we simply load free pdf
books "as they are", I don't see any advantage other than to "feed wikisource
numbers/statistics", and this is presently far from my personal interest.

As you guess, I'm one of the users who don't support Aubrey's enthusiasm about
texts born digital, even if free. :-)

Alex


2013/6/12 David Cuenca 

> Nobody is saying anything about using copyrighted works; there are many
> books that have an open license that would allow us to include them in
> Wikisource.
>
> For instance in ca-ws we have this translation from 2009:
>
> http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%282009%29.djvu
>
> The original is in the PD, and the translator gave away his rights. It
> would have been much easier to work directly with the pdf, instead of
> converting to djvu.
>
> Micru
>
>
> On Wed, Jun 12, 2013 at 10:47 AM, Aarti K. Dwivedi <
> ellydwivedi2...@gmail.com> wrote:
>
>> If I am not wrong, as of today, most books that were born digital, are
>> still under copyright. Of course, they are available freely on the
>> internet. But we can't use the pirated copies. How would we go about the
>> procurement of these books?
>> If we procure these copyrighted books, then the only thing we would have to
>> do is to check for proper formatting. Isn't it?
>>
>>
>> On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson  wrote:
>>
>>> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>>>
>>>> We could define some tasks as
>>>> * corrected the page
>>>> * OPTIONAL added optional templates/links/annotations
>>>> *...
>>>>
>>>
>>> Geotagged all the photos, ...
>>>
>>> The list doesn't end. You need a generic mechanism
>>> for any new feature you can invent. But aren't our
>>> existing templates and categories the best way to
>>> do this? You could just add to each page:
>>> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>>>
>>>
>>> --
>>>   Lars Aronsson (l...@aronsson.se)
>>>   Project Runeberg - free Nordic literature - http://runeberg.org/
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Aarti K. Dwivedi
>>
>>
>>
>>
>
>
> --
> Etiamsi omnes, ego non
>
>


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread David Cuenca
Nobody is saying anything about using copyrighted works; there are many
books that have an open license that would allow us to include them in
Wikisource.

For instance in ca-ws we have this translation from 2009:
http://ca.wikisource.org/wiki/Llibre:El_secret_de_l%E2%80%99or_que_creix_%282009%29.djvu

The original is in the PD, and the translator gave away his rights. It
would have been much easier to work directly with the pdf, instead of
converting to djvu.

Micru

On Wed, Jun 12, 2013 at 10:47 AM, Aarti K. Dwivedi <
ellydwivedi2...@gmail.com> wrote:

> If I am not wrong, as of today, most books that were born digital, are
> still under copyright. Of course, they are available freely on the
> internet. But we can't use the pirated copies. How would we go about the
> procurement of these books?
> If we procure these copyrighted books, then the only thing we would have to
> do is to check for proper formatting. Isn't it?
>
>
> On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson  wrote:
>
>> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>>
>>> We could define some tasks as
>>> * corrected the page
>>> * OPTIONAL added optional templates/links/annotations
>>> *...
>>>
>>
>> Geotagged all the photos, ...
>>
>> The list doesn't end. You need a generic mechanism
>> for any new feature you can invent. But aren't our
>> existing templates and categories the best way to
>> do this? You could just add to each page:
>> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>>
>>
>> --
>>   Lars Aronsson (l...@aronsson.se)
>>   Project Runeberg - free Nordic literature - http://runeberg.org/
>>
>>
>>
>>
>>
>
>
>
> --
> Aarti K. Dwivedi
>
>
>
>


-- 
Etiamsi omnes, ego non


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Aarti K. Dwivedi
If I am not wrong, as of today, most books that were born digital, are
still under copyright. Of course, they are available freely on the
internet. But we can't use the pirated copies. How would we go about the
procurement of these books?
If we procure these copyrighted books, then the only thing we would have to do
is to check for proper formatting. Isn't it?


On Wed, Jun 12, 2013 at 7:58 PM, Lars Aronsson  wrote:

> On 06/12/2013 02:48 PM, Andrea Zanni wrote:
>
>> We could define some tasks as
>> * corrected the page
>> * OPTIONAL added optional templates/links/annotations
>> *...
>>
>
> Geotagged all the photos, ...
>
> The list doesn't end. You need a generic mechanism
> for any new feature you can invent. But aren't our
> existing templates and categories the best way to
> do this? You could just add to each page:
> {{done|proofread=user1|validated=user2|geotagged=user4|...}}
>
>
> --
>   Lars Aronsson (l...@aronsson.se)
>   Project Runeberg - free Nordic literature - http://runeberg.org/
>
>
>
>
>



-- 
Aarti K. Dwivedi


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Lars Aronsson

On 06/12/2013 02:48 PM, Andrea Zanni wrote:

> We could define some tasks as
> * corrected the page
> * OPTIONAL added optional templates/links/annotations
> *...


Geotagged all the photos, ...

The list doesn't end. You need a generic mechanism
for any new feature you can invent. But aren't our
existing templates and categories the best way to
do this? You could just add to each page:
{{done|proofread=user1|validated=user2|geotagged=user4|...}}
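
As an illustration of why such a template is easy to process generically, the
sketch below (Python) reads the named parameters of a flat {{done|...}} call
into a dictionary. The function name is hypothetical, and the parser is
deliberately naive: it assumes no nested templates or links inside the
parameter values.

```python
def parse_done_template(wikitext):
    """Parse a flat template call such as {{done|proofread=user1|...}}.

    Returns (template_name, {task: user}). Positional parameters and a
    literal "..." placeholder are ignored. Assumes no nested templates.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    name, *params = text[2:-2].split("|")
    tasks = {}
    for param in params:
        key, sep, value = param.partition("=")
        if sep:  # keep only named parameters (key=value)
            tasks[key.strip()] = value.strip()
    return name.strip(), tasks
```

Any new task (geotagging, annotations, ...) then shows up as one more key,
with no change to the code.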


--
  Lars Aronsson (l...@aronsson.se)
  Project Runeberg - free Nordic literature - http://runeberg.org/





Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread David Cuenca
I think everything is doable; the problem is how to do it without
cluttering the interface, while keeping things simple.

Some levels might be redundant and we could take the chance to think if
they are really necessary.

Some proposed changes:
- Proofread page levels: "Unused", "Proofread", "Proofread with format",
"Validated" (the "unused" level would mean: pages with no text, ocr text,
pages with irrelevant content).
- All pages would be created at the start with the extracted OCR text at the
"unused" level, so that search engines could finally find our texts even if
work on them has not started yet
- A checkbox list to tag pages: "damaged scan", "missing scan", "contains
media" (image, score, etc)
- Color codes: like now, plus orange for "Proofread with format". Tags on a
page would affect the color too: "damaged" would make the color half purple
and half the corresponding proofread level color, and "contains media" could
add a (black?) square around the page number
- Proofread book levels should be set automatically to the lowest page level,
plus two options: one to mark the book as "ready to export" and another one to
mark it as "digital source", which would bring all pages to the "proofread"
level.
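
The last point can be sketched in a few lines of Python. The level names come
from the list above, but their numeric order and the reading of "digital
source" as "raise every page to at least Proofread" are my assumptions, not
anything the extension does today.

```python
from enum import IntEnum


class PageLevel(IntEnum):
    # Ordering is an assumption: higher value means more finished.
    UNUSED = 0
    PROOFREAD = 1
    PROOFREAD_WITH_FORMAT = 2
    VALIDATED = 3


def book_level(page_levels, digital_source=False):
    """Return the book level as the lowest level among its pages.

    If digital_source is set, every page is first raised to at least
    PROOFREAD (one possible reading of the "digital source" option).
    """
    levels = list(page_levels)
    if digital_source:
        levels = [max(level, PageLevel.PROOFREAD) for level in levels]
    # A book with no pages is treated as UNUSED.
    return min(levels, default=PageLevel.UNUSED)
```

With this rule nobody has to set the book level by hand: it follows from the
pages.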

For the metadata interface, I keep thinking about it, and my impression is
that we should start working from Template:Book [1] until we have a version
that can be used across Commons, Index pages, and books without supporting
scans (in this last case it could be the same header template with an
option to expand it to show the whole Template:Book).
That template might also need some coloring/reorganizing to reflect the
Work/Edition distinction that Wikidata is bringing [2].
And if it is possible to read/write Wikidata with Lua, then the possible
migration towards a Wikidata-powered Wikisource shouldn't be that far away.

Cheers,
Micru

[1] http://commons.wikimedia.org/wiki/Template:Book
[2] http://www.wikidata.org/wiki/Wikidata:Books_task_force


On Wed, Jun 12, 2013 at 8:48 AM, Andrea Zanni wrote:

>
> On Wed, Jun 12, 2013 at 2:32 PM, Thibaut Horel wrote:
>
>> 3. The current system with 4 quality levels to represent the proofreading
>> state of a page is not sufficient to represent the diversity of
>> proofreading scenarios. Indeed, there is a distinction to make between the
>> *correctness* of the text and its *formatting*. In the case of a scanned
>> edition which has been OCRed, we do need several passes before reaching a
>> satisfying level of confidence about the correctness of the text as well as
>> a suitable formatting (proper use of the wikicode, etc.). For digital-born
>> documents however, as billinghurst said, we can automatically assume that
>> the extracted text is correct, but that still doesn't mean that the text is
>> correctly formatted and ready to be transcluded in the main namespace.
>> Maybe we should add another level meaning "text is correct, still needs
>> formatting"? Ideally, we should have two scales of quality levels: one
>> dealing with the correctness of the text, and one dealing with its
>> formatting. This would probably be too heavy and confusing though...
>
>
> I couldn't agree more.
> I think this could also be an opportunity to make tasks *smaller* and
> *clearer*
> (in the direction of "microtasks": contributions in crowdsourcing
> projects which are small, definite and simple, e.g. GalaxyZoo, reCAPTCHA).
>
> We could define some tasks as
> * corrected the page
> * proofread the text
> * formatted the page
> * validated the formatting
> * OPTIONAL added optional templates/links/annotations
> *...
>
> We could even have qualifiers (all/part of the page, ...)
>
> Is this idea crazy, or somewhat doable?
>
> Aubrey
>
>
>


-- 
Etiamsi omnes, ego non


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Andrea Zanni
On Wed, Jun 12, 2013 at 2:32 PM, Thibaut Horel wrote:

> 3. The current system with 4 quality levels to represent the proofreading
> state of a page is not sufficient to represent the diversity of
> proofreading scenarios. Indeed, there is a distinction to make between the
> *correctness* of the text and its *formatting*. In the case of a scanned
> edition which has been OCRed, we do need several passes before reaching a
> satisfying level of confidence about the correctness of the text as well as
> a suitable formatting (proper use of the wikicode, etc.). For digital-born
> documents however, as billinghurst said, we can automatically assume that
> the extracted text is correct, but that still doesn't mean that the text is
> correctly formatted and ready to be transcluded in the main namespace.
> Maybe we should add another level meaning "text is correct, still needs
> formatting"? Ideally, we should have two scales of quality levels: one
> dealing with the correctness of the text, and one dealing with its
> formatting. This would probably be too heavy and confusing though...


I couldn't agree more.
I think this could also be an opportunity to make tasks *smaller* and
*clearer*
(in the direction of "microtasks": contributions in crowdsourcing
projects which are small, definite and simple, e.g. GalaxyZoo, reCAPTCHA).

We could define some tasks as
* corrected the page
* proofread the text
* formatted the page
* validated the formatting
* OPTIONAL added optional templates/links/annotations
*...

We could even have qualifiers (all/part of the page, ...)

Is this idea crazy, or somewhat doable?

Aubrey


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Thibaut Horel
Hi everybody,

Here is my attempt at giving my point of view while trying to summarize
the discussion:

1. I think the role of Index: pages should be to present the *source* of
a work. This is true whether the source is a scanned edition (as is most
often the case at the moment), or a digital PDF (that is, containing
text and not images) as is the case for most "digital-born" documents. I
think it is good to have a neat separation between the original source
and how Wikisource presents the work in the main namespace. Indeed, even
if Wikisource tries to be as true as possible to the original content,
there are very often some changes in the way it is presented in the main
namespace.

2. Ideally, the metadata about the source of a work (author, date of
printing, etc.) should be located in Wikidata. But metadata related to
proofreading (e.g. the proofreading level of each individual page),
being specific to the mission of Wikisource, should be located in
Wikisource. How to do this while keeping the interface simple (i.e. hide
it from the user so that she doesn't have to go from Wikisource to
Wikidata to Wikisource) is a valid and very important concern, but is
also beyond my current understanding of Wikidata and its integration
into Wikimedia projects.

3. The current system with 4 quality levels to represent the
proofreading state of a page is not sufficient to represent the
diversity of proofreading scenarios. Indeed, there is a distinction to
make between the *correctness* of the text and its *formatting*. In the
case of a scanned edition which has been OCRed, we do need several
passes before reaching a satisfying level of confidence about the
correctness of the text as well as a suitable formatting (proper use of
the wikicode, etc.). For digital-born documents however, as billinghurst
said, we can automatically assume that the extracted text is correct,
but that still doesn't mean that the text is correctly formatted and
ready to be transcluded in the main namespace. Maybe we should add
another level meaning "text is correct, still needs formatting"?
Ideally, we should have two scales of quality levels: one dealing with
the correctness of the text, and one dealing with its formatting. This
would probably be too heavy and confusing though...
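
To show what two separate scales could look like in practice, here is a small
sketch in Python. All names and level values are hypothetical, and the rule
for "ready to be transcluded" is only one possible choice.

```python
from dataclasses import dataclass
from enum import IntEnum


class Correctness(IntEnum):
    # Hypothetical scale for the correctness of the text.
    UNCHECKED = 0
    PROOFREAD = 1
    VALIDATED = 2


class Formatting(IntEnum):
    # Hypothetical scale for the wikicode formatting.
    RAW = 0
    FORMATTED = 1
    VALIDATED = 2


@dataclass(frozen=True)
class PageQuality:
    correctness: Correctness
    formatting: Formatting

    def ready_for_transclusion(self):
        # Assumed rule: text at least proofread AND page at least formatted.
        return (self.correctness >= Correctness.PROOFREAD
                and self.formatting >= Formatting.FORMATTED)


def initial_quality(born_digital):
    """Starting state of a page: born-digital text is assumed correct,
    OCRed text is not; in both cases the formatting is still raw."""
    correctness = Correctness.VALIDATED if born_digital else Correctness.UNCHECKED
    return PageQuality(correctness, Formatting.RAW)
```

A born-digital page would start with correct text but raw formatting, so it
would still not count as ready, which is the case the single 4-level scale
cannot express.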

Thibaut (user:Zaran on Wikisource)

On 06/12/2013 01:35 PM, Andrea Zanni wrote:
>
> On Wed, Jun 12, 2013 at 1:32 PM, billinghurst wrote:
>
>> If you are talking about how we represent digitally prepared text with the
>> validation process, I would have no issue with the text being ripped and
>> having a bot run through and taking it straight to level 4 (green), and
>> then redefining green to say validated, or digitally prepared text not
>> requiring validation.
>>
>> At the same time, if someone proposes and generates a fifth colour to
>> represent digitally prepared text not requiring proofreading, then I will
>> be happy with that. It may make someone happier in being a truer
>> representation, but in the end to me it is a moot point. In the end, each
>> of those is a local community decision, though one that should be made in
>> consideration of how the other wikis interpret their processes.
>
>
> Thanks for clarifying this.
> I agree with you, and would welcome both solutions.
>
> But a lot of wikisourcerors don't think this way, 
> so better discuss :-)
>
> Aubrey
>
>
>
>
>



Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread Andrea Zanni
On Wed, Jun 12, 2013 at 1:32 PM, billinghurst wrote:

> If you are talking about how we represent digitally prepared text with the
> validation process, I would have no issue with the text being ripped and
> having a bot run through and taking it straight to level 4 (green), and
> then redefining green to say validated, or digitally prepared text not
> requiring validation.
>
> At the same time, if someone proposes and generates a fifth colour to
> represent digitally prepared text not requiring proofreading, then I will
> be happy with that. It may make someone happier in being a truer
> representation, but in the end to me it is a moot point. In the end, each
> of those is a local community decision, though one that should be made in
> consideration of how the other wikis interpret their processes.
>

Thanks for clarifying this.
I agree with you, and would welcome both solutions.

But a lot of wikisourcerors don't think this way,
so better discuss :-)

Aubrey


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-12 Thread billinghurst
You need to be cautious talking about "PDF" documents, as what matters is not
the document presentation format but the source of the text. So I like to
talk about the source as being digitally prepared (not requiring validation,
though it may require formatting) or OCR'd (requiring validation, and
probably formatting).

If you are talking about how we represent digitally prepared text with the
validation process, I would have no issue with the text being ripped and
having a bot run through and taking it straight to level 4 (green), and
then redefining green to say validated, or digitally prepared text not
requiring validation.

At the same time, if someone proposes and generates a fifth colour to
represent digitally prepared text not requiring proofreading, then I will
be happy with that. It may make someone happier in being a truer
representation, but in the end to me it is a moot point. In the end, each
of those is a local community decision, though one that should be made in
consideration of how the other wikis interpret their processes.

Regards, Billinghurst


On Tue, 11 Jun 2013 15:12:41 -0400, David Cuenca wrote:
> @Billinghurst, I think Aubrey was referring mainly to pdf files, which
> sometimes have text and format but they are not that easy to represent in
> Wikisource. The main problem is that our current workflow always assumes
> that we are going to proofread a text and have it stored as a web page.
> 
> @others: for me it doesn't matter much if the representation of the
> metadata is done by a template, an index page, or something different
> (maybe related to the new Extension:BookManager?)
> However I think that from the user point of view it is better to have a
> consistent system that can handle:
> 1) representation of book/source metadata
> 2) give access to export/visualization options
> 
> I'm preparing a document with some ideas that we can discuss here.
> 
> Micru
> 
> On Tue, Jun 11, 2013 at 7:48 AM, billinghurst
> wrote:
> 
>> On Tue, 11 Jun 2013 12:16:54 +0530, "Aarti K. Dwivedi" wrote:
>> > A slightly off-topic question: Even if we modify the extension to
>> > proofread books which do not have scans (I am assuming books that were
>> > born digital), against what will these books be proofread?
>> >
>>
>> I am not sure why we are looking to proofread a digital only file, unless
>> of course it never had a text layer and it had to be OCR'd. Proofreading
>> surely only relates to scanned images where there has been the need to
>> proofread.
>>
>> Regards, Billinghurst
>>
>>



Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread David Cuenca
@Billinghurst, I think Aubrey was referring mainly to pdf files, which
sometimes have text and format but they are not that easy to represent in
Wikisource. The main problem is that our current workflow always assumes
that we are going to proofread a text and have it stored as a web page.

@others: for me it doesn't matter much if the representation of the
metadata is done by a template, an index page, or something different
(maybe related to the new Extension:BookManager?)
However I think that from the user point of view it is better to have a
consistent system that can handle:
1) representation of book/source metadata
2) give access to export/visualization options

I'm preparing a document with some ideas that we can discuss here.

Micru

On Tue, Jun 11, 2013 at 7:48 AM, billinghurst wrote:

> On Tue, 11 Jun 2013 12:16:54 +0530, "Aarti K. Dwivedi" wrote:
> > A slightly off-topic question: Even if we modify the extension to
> > proofread books which do not have scans (I am assuming books that were
> > born digital), against what will these books be proofread?
> >
>
> I am not sure why we are looking to proofread a digital only file, unless
> of course it never had a text layer and it had to be OCR'd.  Proofreading
> surely only relates to scanned images where there has been the need to
> proofread.
>
> Regards, Billinghurst
>
>



-- 
Etiamsi omnes, ego non


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread billinghurst
On Tue, 11 Jun 2013 12:16:54 +0530, "Aarti K. Dwivedi" wrote:
> A slightly off-topic question: Even if we modify the extension to
> proofread books which do not have scans (I am assuming books that were
> born digital), against what will these books be proofread?
>

I am not sure why we are looking to proofread a digital only file, unless
of course it never had a text layer and it had to be OCR'd.  Proofreading
surely only relates to scanned images where there has been the need to
proofread.

Regards, Billinghurst



Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread Alex Brollo
I apologize

"The server too can't resolve *some* apostrophe concatenations"

Alex


2013/6/11 Alex Brollo 

> You're right, Aubrey; nevertheless, when promoting a user-friendly interface,
> the result is that data and wiki code are extremely difficult to use as a
> clean "data base". Think only of wiki markup and the "simple" trick of
> marking bold and italic text with apostrophes: very user friendly, but
> something like a nightmare for a poor programmer who needs to find the
> algorithm to understand which apostrophes are text and which are code. The
> server too can't resolve apostrophe concatenation. Would it have been less
> user friendly to use something like ...? Yes; but how much cleaner raw wiki
> text would be!
>
> Distributed Proofreaders uses a completely different approach: there's a
> rigid set of increasing permissions for users, and inexperienced users can
> do simple tasks only. This is far from the "wiki mentality", but we can't
> expect to keep things too easy.
>
> Alex
>
>
> 2013/6/11 Andrea Zanni 
>
>> On Tue, Jun 11, 2013 at 8:41 AM, Thomas PT  wrote:
>>
>>> Sorry if my answer is off-topic but if metadata are stored in Wikidata,
>>> is it really needed to create index pages to store the same data as
>>> Wikidata?
>>> As I see things, we'll have bibliographical metadata on Wikidata
>>> (title, author, date of publication...) and data related to proofreading
>>> (proofreading level, table of contents...) on the Index: pages. Moreover,
>>> as the Proofread Page extension considers that an Index page is about a
>>> scan (i.e. one or more files), I'm not sure that Index pages about books
>>> without scans will be managed well by the extension.
>>>
>> I think that this is a matter of usability and user experience.
>> If we are going to use Index pages, we'll let users *stay on Wikisource*
>> the whole time, while the complexity and data workflow would be hidden from
>> them.
>> It's a *bad* thing to ask newbies to navigate through Wikisource (entry),
>> then Commons (file upload), then Wikisource (create Index page), then
>> Wikidata (fetch data), then Wikisource (start working on the book) again,
>> just to work on a book.
>>
>> For me this is one of the main obstacles to beginners, and we should try
>> to ease things for people, IMHO.
>>
>>  Aubrey
>>
>>
>>
>>
>


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread Alex Brollo
You're right, Aubrey; nevertheless, when promoting a user-friendly interface,
the result is that data and wiki code are extremely difficult to use as a
clean "data base". Think only of wiki markup and the "simple" trick of marking
bold and italic text with apostrophes: very user friendly, but something
like a nightmare for a poor programmer who needs to find the algorithm to
understand which apostrophes are text and which are code. The server too
can't resolve apostrophe concatenation. Would it have been less user friendly
to use something like ...? Yes; but how much cleaner raw wiki text
would be!
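
A small Python sketch may show where the trouble starts. It applies, as far as
I understand it, the basic rule the MediaWiki parser uses for a run of
apostrophes (2 = italic, 3 = bold, 5 = bold+italic, any surplus is plain
text); the real parser adds further heuristics for unbalanced markup that are
not reproduced here, and the function name is made up.

```python
import re


def classify_apostrophe_runs(line):
    """Classify each run of 2+ apostrophes in one line of wiki text.

    Returns a list of (literal_apostrophes, markup) tuples, where
    literal_apostrophes is how many apostrophes of the run are plain text.
    Simplified rule only; unbalanced-markup heuristics are ignored.
    """
    result = []
    for match in re.finditer(r"'{2,}", line):
        length = len(match.group())
        if length == 2:
            result.append((0, "italic"))
        elif length == 3:
            result.append((0, "bold"))
        elif length == 4:
            # One literal apostrophe followed by bold markup.
            result.append((1, "bold"))
        else:
            # Five or more: surplus apostrophes are text, the rest is both.
            result.append((length - 5, "bold+italic"))
    return result
```

Even this simplified rule already has to guess which apostrophes are text, and
a single apostrophe next to the markup (as in l'''amore'') makes things worse.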

Distributed Proofreaders uses a completely different approach: there's a
rigid set of increasing permissions for users, and inexperienced users can
do simple tasks only. This is far from the "wiki mentality", but we can't
expect to keep things too easy.

Alex


2013/6/11 Andrea Zanni 

> On Tue, Jun 11, 2013 at 8:41 AM, Thomas PT  wrote:
>
>> Sorry if my answer is off-topic but if metadata are stored in Wikidata,
>> is it really needed to create index pages to store the same data as
>> Wikidata?
>> As I see things, we'll have bibliographical metadata on Wikidata
>> (title, author, date of publication...) and data related to proofreading
>> (proofreading level, table of contents...) on the Index: pages. Moreover,
>> as the Proofread Page extension considers that an Index page is about a
>> scan (i.e. one or more files), I'm not sure that Index pages about books
>> without scans will be managed well by the extension.
>>
> I think that this is a matter of usability and user experience.
> If we are going to use Index pages, we'll let users *stay on Wikisource*
> the whole time, while the complexity and data workflow would be hidden from
> them.
> It's a *bad* thing to ask newbies to navigate through Wikisource (entry),
> then Commons (file upload), then Wikisource (create Index page), then
> Wikidata (fetch data), then Wikisource (start working on the book) again,
> just to work on a book.
>
> For me this is one of the main obstacles to beginners, and we should try
> to ease things for people, IMHO.
>
>  Aubrey
>
>
>
>


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread Andrea Zanni
On Tue, Jun 11, 2013 at 8:41 AM, Thomas PT  wrote:

> Sorry if my answer is off-topic but if metadata are stored in Wikidata, is
> it really needed to create index pages to store the same data as Wikidata?
> As I see things, we'll have bibliographical metadata on Wikidata
> (title, author, date of publication...) and data related to proofreading
> (proofreading level, table of contents...) on the Index: pages. Moreover,
> as the Proofread Page extension considers that an Index page is about a
> scan (i.e. one or more files), I'm not sure that Index pages about books
> without scans will be managed well by the extension.
>

I think that this is a matter of usability and user experience.
If we are going to use Index pages, we'll let users *stay on Wikisource*
the whole time, while the complexity and data workflow would be hidden from
them.
It's a *bad* thing to ask newbies to navigate through Wikisource (entry),
then Commons (file upload), then Wikisource (create Index page), then
Wikidata (fetch data), then Wikisource (start working on the book) again,
just to work on a book.

For me this is one of the main obstacles to beginners, and we should try to
ease things for people, IMHO.

 Aubrey


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-11 Thread Andrea Zanni
@aarti: some books/texts/documents are born digital.
Think about all the scientific literature, or PhD theses. These files (if
CC-BY/SA licensed) could be stored in Wikisource, and be useful for the
wiki community.
We already have some means to link those texts to their source (with a URL).

It's a long-standing "controversy" whether we should or should not allow
documents without scans on Wikisource.
Every community should decide for itself. My personal POV (also as a
"librarian") is that if we leave out born-digital documents we are
forgetting the bulk of the stuff.
I think that one of the most important added values of Wikisource is
integrating texts with other Wikimedia projects, and (wiki)linking and
connecting them to each other.
No other digital library does that on the Internet, and we can do it because
we have a community.

So, these texts will have a source. I do think that proofreading a
born-digital PDF is a waste of time.

Aubrey



On Tue, Jun 11, 2013 at 8:46 AM, Aarti K. Dwivedi  wrote:

> A slightly off-topic question: even if we modify the extension to proofread
> books which do not have scans (I am assuming books that were born digital),
> against what will these books be proofread?
>
>
> On Tue, Jun 11, 2013 at 12:11 PM, Thomas PT  wrote:
>
>> Sorry if my answer is off-topic but if metadata are stored in WIkidata,
>> is it really needed to create index pages to store the same data as
>> Wikidata?
>> As I see the things, we'll have bibliographical metadata on Wikidata
>> (title, author, date of publication...) and data related to proofreading
>> (proofreading level, table of content...) on the Index: pages. More, as the
>> Proofread Page extension considers that an Index page is about a scan (ie
>> one or more files) I'm not sure that Index pages about books without scan
>> will be managed well by the extension.
>>
>> {{header|index name}} is already done, for books with scan, by the
>> Proofread Page extension with the header=1 feature. In fr Wikisource, we
>> already use a Lua module to manage the
>> Mediawiki:Proofreadpage_header_template template used by the header=1
>> feature. https://fr.wikisource.org/wiki/Module:Header_template This
>> template outputs automatically metadata and navigation from the index page
>> TOC (but it allows also to override data).
>>
>> Tpt
>>
>> --------------
>> Date: Tue, 11 Jun 2013 01:33:39 +0200
>> From: alex.bro...@gmail.com
>> To: wikisource-l@lists.wikimedia.org
>> Subject: Re: [Wikisource-l] About texts without supporting files and
>> "Index:" pages
>>
>>
>> I'm going to test what you are telling in a real Lua script; as you know,
>> Lua can read the code of any page with one "expensive" server function
>> only, so that a simple {{header|index name}} ns0 template call could read
>> all the wiki code from index page, parse it, extract all its data content,
>> and use it to build any html you like. No other field is needed. In
>> it.wikisource we are testing something more complex, since we are exporting
>> Index data into a local Lua data module, to be loaded with a mw.loadData
>> function that is not listed  as "server-expensive"; but I presume that wiki
>> servers would not be overloaded by *one* server expensive call
>>
>> If Im not going wrong, such a script could be written tomorrow by a good
>> Lua programmer I'll need some more time as a beginner.  I'll test
>> a "MediaWiki:Proofreadpage_index_template" Lua loader & parser working into
>> ns0, just to see if all runs as I guess, then I'll tell you in this thread.
>> In which wikisource project do you work usually?
>>
>> Alex
>>
>>
>>
>> 2013/6/11 David Cuenca 
>>
>> No, it won't be stored in Wikisource, but still there is the need to
>> present the information in a consistent manner.
>> If you want to display the information on ns0, you will end up needing
>> the same fields that the "Index:" page is using now.
>> So why not to have the same solution for both?
>>
>> It could also be a template with a reduced set of fields that expands to
>> show "Template:Book" with linked data from Wikidata, no matter if they have
>> supporting scans or not.
>>
>> Micru
>>
>>
>> On Mon, Jun 10, 2013 at 6:00 PM, Alex Brollo wrote:
>>
>> Simply there is no need to store data twice or more, if they are
>> dinamically imported from wikidata. Such data would be simply generated by
>> a normal template. Something similar to Commons media shari

Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Aarti K. Dwivedi
A slightly off-topic question: even if we modify the extension to proofread
books which do not have scans (I am assuming books that were born digital),
against what will these books be proofread?


On Tue, Jun 11, 2013 at 12:11 PM, Thomas PT  wrote:

> Sorry if my answer is off-topic but if metadata are stored in WIkidata, is
> it really needed to create index pages to store the same data as Wikidata?
> As I see the things, we'll have bibliographical metadata on Wikidata
> (title, author, date of publication...) and data related to proofreading
> (proofreading level, table of content...) on the Index: pages. More, as the
> Proofread Page extension considers that an Index page is about a scan (ie
> one or more files) I'm not sure that Index pages about books without scan
> will be managed well by the extension.
>
> {{header|index name}} is already done, for books with scan, by the
> Proofread Page extension with the header=1 feature. In fr Wikisource, we
> already use a Lua module to manage the
> Mediawiki:Proofreadpage_header_template template used by the header=1
> feature. https://fr.wikisource.org/wiki/Module:Header_template This
> template outputs automatically metadata and navigation from the index page
> TOC (but it allows also to override data).
>
> Tpt
>
> --
> Date: Tue, 11 Jun 2013 01:33:39 +0200
> From: alex.bro...@gmail.com
> To: wikisource-l@lists.wikimedia.org
> Subject: Re: [Wikisource-l] About texts without supporting files and
> "Index:" pages
>
>
> I'm going to test what you are telling in a real Lua script; as you know,
> Lua can read the code of any page with one "expensive" server function
> only, so that a simple {{header|index name}} ns0 template call could read
> all the wiki code from index page, parse it, extract all its data content,
> and use it to build any html you like. No other field is needed. In
> it.wikisource we are testing something more complex, since we are exporting
> Index data into a local Lua data module, to be loaded with a mw.loadData
> function that is not listed  as "server-expensive"; but I presume that wiki
> servers would not be overloaded by *one* server expensive call
>
> If Im not going wrong, such a script could be written tomorrow by a good
> Lua programmer I'll need some more time as a beginner.  I'll test
> a "MediaWiki:Proofreadpage_index_template" Lua loader & parser working into
> ns0, just to see if all runs as I guess, then I'll tell you in this thread.
> In which wikisource project do you work usually?
>
> Alex
>
>
>
> 2013/6/11 David Cuenca 
>
> No, it won't be stored in Wikisource, but still there is the need to
> present the information in a consistent manner.
> If you want to display the information on ns0, you will end up needing the
> same fields that the "Index:" page is using now.
> So why not to have the same solution for both?
>
> It could also be a template with a reduced set of fields that expands to
> show "Template:Book" with linked data from Wikidata, no matter if they have
> supporting scans or not.
>
> Micru
>
>
> On Mon, Jun 10, 2013 at 6:00 PM, Alex Brollo wrote:
>
> Simply there is no need to store data twice or more, if they are
> dinamically imported from wikidata. Such data would be simply generated by
> a normal template. Something similar to Commons media sharing: most
> wikipedians but beginners know that when you want to edit a shared media
> file, you must do you edit in Commons; there's no need to host a media file
> locally.
>
> So, IMHO a good Lua wikidata-reading library could avoid at all to store
> data in wikisource, or wikipedia, or Commons.
>
> Alex
>
>
> 2013/6/10 David Cuenca 
>
> @Alex: but what do you think of storing the source information in "Index:"
> pages for all works stored in Wikisource, even if they don't have a
> supporting scan?
>
> That was the original question :)
>
> About your proposed library, it would be more useful if it could modify
> data in Wikidata, not only import it. Besides, if the Wikidata client is
> installed in Wikisource, the inclusion syntax already takes care of
> displaying data...
>
> Micru
>
>
> On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo wrote:
>
> I don't see the need to change deeply Index/ns0 relationship, while I
> appreciate the idea "promote coherence reducing redundance" (many years ago
> I painfully used dBase III - dBase IV and I learned that principle by "try
> and learn").
>
> Here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
> brief message about rela

Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Thomas PT
Sorry if my answer is off-topic, but if metadata are stored in Wikidata, is it
really necessary to create Index pages to store the same data as Wikidata?
As I see things, we'll have bibliographical metadata on Wikidata (title,
author, date of publication...) and data related to proofreading (proofreading
level, table of contents...) on the Index: pages. Moreover, as the Proofread
Page extension considers that an Index page is about a scan (i.e. one or more
files), I'm not sure that Index pages about books without scans will be
managed well by the extension.

{{header|index name}} is already done, for books with scans, by the Proofread
Page extension with the header=1 feature. In fr Wikisource, we already use a
Lua module to manage the Mediawiki:Proofreadpage_header_template template used
by the header=1 feature: https://fr.wikisource.org/wiki/Module:Header_template
This template automatically outputs metadata and navigation from the Index
page TOC (but it also allows overriding the data).
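
As a rough illustration of the split described above, here is a minimal Python sketch (not the fr.wikisource Lua module, whose internals are not shown in this thread). It assumes bibliographic metadata comes from Wikidata, proofreading data comes from the Index page, and a header call may override individual values. The function name and all field names are hypothetical.

```python
def build_header_data(wikidata_meta, index_data, overrides=None):
    """Combine the three sources; later sources win on key collisions."""
    merged = dict(wikidata_meta)   # bibliographic data (title, author, date)
    merged.update(index_data)      # proofreading data (level, table of contents)
    # Only non-empty override values replace what was computed automatically.
    for key, value in (overrides or {}).items():
        if value:
            merged[key] = value
    return merged


# Hypothetical sample values, for illustration only.
wikidata_meta = {"title": "An Example Book", "author": "Jane Doe", "date": "1901"}
index_data = {"proofreading_level": "validated", "toc": ["Chapter 1", "Chapter 2"]}

header = build_header_data(wikidata_meta, index_data, {"title": "Example Book, vol. 1"})
```

In this sketch an empty override is ignored, so a header call that leaves a parameter blank falls back to the automatically generated value.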

Tpt

Date: Tue, 11 Jun 2013 01:33:39 +0200
From: alex.bro...@gmail.com
To: wikisource-l@lists.wikimedia.org
Subject: Re: [Wikisource-l] About texts without supporting files and    
"Index:" pages

I'm going to test what you are telling in a real Lua script; as you know, Lua 
can read the code of any page with one "expensive" server function only, so 
that a simple {{header|index name}} ns0 template call could read all the wiki 
code from index page, parse it, extract all its data content, and use it to 
build any html you like. No other field is needed. In it.wikisource we are 
testing something more complex, since we are exporting Index data into a local 
Lua data module, to be loaded with a mw.loadData function that is not listed  
as "server-expensive"; but I presume that wiki servers would not be overloaded 
by one server expensive call

If Im not going wrong, such a script could be written tomorrow by a good Lua 
programmer I'll need some more time as a beginner.  I'll test a 
"MediaWiki:Proofreadpage_index_template" Lua loader & parser working into ns0, 
just to see if all runs as I guess, then I'll tell you in this thread. In which 
wikisource project do you work usually?

Alex


2013/6/11 David Cuenca

No, it won't be stored in Wikisource, but still there is the need to present
the information in a consistent manner.

If you want to display the information on ns0, you will end up needing the same
fields that the "Index:" page is using now.

So why not to have the same solution for both?

It could also be a template with a reduced set of fields that expands to show
"Template:Book" with linked data from Wikidata, no matter if they have
supporting scans or not.

Micru

On Mon, Jun 10, 2013 at 6:00 PM, Alex Brollo wrote:

Simply there is no need to store data twice or more, if they are dynamically
imported from wikidata. Such data would be simply generated by a normal
template. Something similar to Commons media sharing: most wikipedians but
beginners know that when you want to edit a shared media file, you must do your
edit in Commons; there's no need to host a media file locally.

So, IMHO a good Lua wikidata-reading library could avoid at all to store data
in wikisource, or wikipedia, or Commons.

Alex

2013/6/10 David Cuenca

@Alex: but what do you think of storing the source information in "Index:"
pages for all works stored in Wikisource, even if they don't have a supporting
scan?

That was the original question :)

About your proposed library, it would be more useful if it could modify data in
Wikidata, not only import it. Besides, if the Wikidata client is installed in
Wikisource, the inclusion syntax already takes care of displaying data...

Micru

On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo wrote:

I don't see the need to change deeply Index/ns0 relationship, while I
appreciate the idea "promote coherence reducing redundance" (many years ago I
painfully used dBase III - dBase IV and I learned that principle by "try and
learn").

Here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
brief message about relationship among wikidata, commons, wikisource and any
other project. Don't follow the link, it's so short that I copy it here (but if
you like it, comment it there):

Scribunto-Lua and Wikidata
I'd like a library to get Wikidata content; it would be a good idea IMHO to
access to Wikidata data in plain form, just as such data would be Lua
tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)

If such a Lua library could be built, to import data from wikidata would be as
simple, as writing a template, and data will be self-aligned.

Alex

2013/6/10 Aarti K. Dwivedi

Hi,
There was a thread some time ago where there were talks o

Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Alex Brollo
I'm going to test what you are describing in a real Lua script; as you know,
Lua can read the code of any page with only one "expensive" server function
call, so that a simple {{header|index name}} ns0 template call could read
all the wiki code from the Index page, parse it, extract all its data content,
and use it to build any HTML you like. No other field is needed. In
it.wikisource we are testing something more complex, since we are exporting
Index data into a local Lua data module, to be loaded with the mw.loadData
function, which is not listed as "server-expensive"; but I presume that the
wiki servers would not be overloaded by *one* server-expensive call.
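
The "read, parse, extract" step could look roughly like the sketch below. It is written in Python rather than Lua purely for illustration, and it assumes the Index page wikitext is a single template call with one |Name=value parameter per line. The sample page and the field names (Title, Author, Year) are hypothetical, and the parser is deliberately naive: it does not handle nested templates or pipes inside links.

```python
import re


def parse_index_fields(wikitext):
    """Return the |Name=value parameters of the outermost template call."""
    match = re.search(r"\{\{(.*)\}\}", wikitext, flags=re.DOTALL)
    if match is None:
        return {}
    fields = {}
    # Split on pipes that start a new line; the first chunk is the template name.
    for part in re.split(r"\n\s*\|", match.group(1))[1:]:
        name, sep, value = part.partition("=")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


# Hypothetical Index page content, for illustration only.
SAMPLE = """{{:MediaWiki:Proofreadpage_index_template
|Title=An Example Book
|Author=Jane Doe
|Year=1901
}}"""

fields = parse_index_fields(SAMPLE)
header = f"{fields['Title']} ({fields['Year']}), by {fields['Author']}"
```

Once the fields are in a plain table like this, building the ns0 header is only string formatting, which is why no extra fields would be needed in the template call.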

If I'm not mistaken, such a script could be written by tomorrow by a good
Lua programmer; as a beginner, I'll need some more time. I'll test
a "MediaWiki:Proofreadpage_index_template" Lua loader & parser working in
ns0, just to see if it all runs as I guess, then I'll report back in this
thread. Which Wikisource project do you usually work in?

Alex



2013/6/11 David Cuenca 

> No, it won't be stored in Wikisource, but still there is the need to
> present the information in a consistent manner.
> If you want to display the information on ns0, you will end up needing the
> same fields that the "Index:" page is using now.
> So why not to have the same solution for both?
>
> It could also be a template with a reduced set of fields that expands to
> show "Template:Book" with linked data from Wikidata, no matter if they have
> supporting scans or not.
>
> Micru
>
>
> On Mon, Jun 10, 2013 at 6:00 PM, Alex Brollo wrote:
>
>> Simply there is no need to store data twice or more, if they are
>> dinamically imported from wikidata. Such data would be simply generated by
>> a normal template. Something similar to Commons media sharing: most
>> wikipedians but beginners know that when you want to edit a shared media
>> file, you must do you edit in Commons; there's no need to host a media file
>> locally.
>>
>> So, IMHO a good Lua wikidata-reading library could avoid at all to store
>> data in wikisource, or wikipedia, or Commons.
>>
>> Alex
>>
>>
>> 2013/6/10 David Cuenca 
>>
>>> @Alex: but what do you think of storing the source information in
>>> "Index:" pages for all works stored in Wikisource, even if they don't have
>>> a supporting scan?
>>>
>>> That was the original question :)
>>>
>>> About your proposed library, it would be more useful if it could modify
>>> data in Wikidata, not only import it. Besides, if the Wikidata client is
>>> installed in Wikisource, the inclusion syntax already takes care of
>>> displaying data...
>>>
>>> Micru
>>>
>>>
>>> On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo wrote:
>>>
>>>> I don't see the need to change deeply Index/ns0 relationship, while I
>>>> appreciate the idea "promote coherence reducing redundance" (many years ago
>>>> I painfully used dBase III - dBase IV and I learned that principle by "try
>>>> and learn").
>>>>
>>>> Here:
>>>> http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
>>>> brief message about relationship among wikidata, commons, wikisource and
>>>> any other project. Don't follow the link, it's so short that I copy it here
>>>> (but if you like it, comment it there):
>>>>
>>>> Scribunto-Lua and Wikidata
>>>> I'd like a library to get Wikidata content; it would be a good idea
>>>> IMHO to access to Wikidata data in plain form, just as such data would be
>>>> Lua tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)
>>>>
>>>>
>>>> If such a Lua library could be built, to import data from wikidata
>>>> would be as simple, as writing a template, and data will be self-aligned.
>>>>
>>>> Alex
>>>>
>>>>
>>>> 2013/6/10 Aarti K. Dwivedi
>>>>
>>>> Hi,
>>>>>
>>>>> There was a thread some time ago where there were talks of having
>>>>> books which were born digital. These pages wouldn't have scans.
>>>>> What the 'Index' page would have in these cases is something I am not
>>>>> very sure about.
>>>>>
>>>>> Cheers,
>>>>> Rtdwivedi
>>>>>
>>>>>
>>>>> On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca wrote:
>>>>>
>>>>>> With the deployment of Wikidata it is a good moment to re-examine
>>>>>> what "Index" pages are and what should be their function.
>>>>>> The most direct transition to a Wikidata-supported Wikisource could
>>>>>> be something like this:
>>>>>> https://sites.google.com/site/dacuetu/BookData.pdf
>>>>>>
>>>>>> That would allow:
>>>>>> - to share data book data between Commons, Wikisource and Wikipedia
>>>>>> - to update it, when any of the sites has been updated
>>>>>> - to facilitate better search functions (like searches by author, or
>>>>>> topic, limiting the date range or the language)
>>>>>>
>>>>>> That would only apply to those texts which use a "Index:" page, so
>>>>>> now the question is, what do we do with books that do not have supporting
>>>>>> scans (and therefore no index page)?
>>>>>>
>>>>>> Some possible options:
>>>>>> a) ignore pages without sources and focus only
Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread David Cuenca
No, it won't be stored in Wikisource, but there is still the need to
present the information in a consistent manner.
If you want to display the information on ns0, you will end up needing the
same fields that the "Index:" page is using now.
So why not have the same solution for both?

It could also be a template with a reduced set of fields that expands to
show "Template:Book" with linked data from Wikidata, whether or not the works
have supporting scans.

Micru

On Mon, Jun 10, 2013 at 6:00 PM, Alex Brollo  wrote:

> Simply there is no need to store data twice or more, if they are
> dinamically imported from wikidata. Such data would be simply generated by
> a normal template. Something similar to Commons media sharing: most
> wikipedians but beginners know that when you want to edit a shared media
> file, you must do you edit in Commons; there's no need to host a media file
> locally.
>
> So, IMHO a good Lua wikidata-reading library could avoid at all to store
> data in wikisource, or wikipedia, or Commons.
>
> Alex
>
>
> 2013/6/10 David Cuenca 
>
>> @Alex: but what do you think of storing the source information in
>> "Index:" pages for all works stored in Wikisource, even if they don't have
>> a supporting scan?
>>
>> That was the original question :)
>>
>> About your proposed library, it would be more useful if it could modify
>> data in Wikidata, not only import it. Besides, if the Wikidata client is
>> installed in Wikisource, the inclusion syntax already takes care of
>> displaying data...
>>
>> Micru
>>
>>
>> On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo wrote:
>>
>>> I don't see the need to change deeply Index/ns0 relationship, while I
>>> appreciate the idea "promote coherence reducing redundance" (many years ago
>>> I painfully used dBase III - dBase IV and I learned that principle by "try
>>> and learn").
>>>
>>> Here:
>>> http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
>>> brief message about relationship among wikidata, commons, wikisource and
>>> any other project. Don't follow the link, it's so short that I copy it here
>>> (but if you like it, comment it there):
>>>
>>> Scribunto-Lua and Wikidata
>>> I'd like a library to get Wikidata content; it would be a good idea IMHO
>>> to access to Wikidata data in plain form, just as such data would be Lua
>>> tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)
>>>
>>>
>>> If such a Lua library could be built, to import data from wikidata would
>>> be as simple, as writing a template, and data will be self-aligned.
>>>
>>> Alex
>>>
>>>
>>> 2013/6/10 Aarti K. Dwivedi 
>>>
>>> Hi,

>>>> There was a thread some time ago where there were talks of having
>>>> books which were born digital. These pages wouldn't have scans.
>>>> What the 'Index' page would have in these cases is something I am not
>>>> very sure about.
>>>>
>>>> Cheers,
>>>> Rtdwivedi
>>>>
>>>>
>>>> On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca wrote:
>>>>
>>>>> With the deployment of Wikidata it is a good moment to re-examine what
>>>>> "Index" pages are and what should be their function.
>>>>> The most direct transition to a Wikidata-supported Wikisource could be
>>>>> something like this:
>>>>> https://sites.google.com/site/dacuetu/BookData.pdf
>>>>>
>>>>> That would allow:
>>>>> - to share data book data between Commons, Wikisource and Wikipedia
>>>>> - to update it, when any of the sites has been updated
>>>>> - to facilitate better search functions (like searches by author, or
>>>>> topic, limiting the date range or the language)
>>>>>
>>>>> That would only apply to those texts which use a "Index:" page, so now
>>>>> the question is, what do we do with books that do not have supporting
>>>>> scans (and therefore no index page)?
>>>>>
>>>>> Some possible options:
>>>>> a) ignore pages without sources and focus only on works with
>>>>> supporting scans
>>>>> b) use ns0 pages also as data containers (instead of, or in addition
>>>>> to "Index" pages)
>>>>> c) create "Index:" pages for all works, with or without scans. Use
>>>>> that instead of "Template:Textinfo"
>>>>>
>>>>> Personally I prefer "option c", even if it would require to rename
>>>>> "Index:" to "Source:" to make more clear what are those pages, however I
>>>>> would like to hear the opinion of other wikisourcerors about this.
>>>>>
>>>>> Cheers,
>>>>> Micru

Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Alex Brollo
There is simply no need to store data twice or more if they are
dynamically imported from Wikidata. Such data would simply be generated by
a normal template. It is similar to Commons media sharing: most
Wikipedians, except beginners, know that when you want to edit a shared media
file, you must make your edit on Commons; there's no need to host a media file
locally.

So, IMHO a good Lua Wikidata-reading library could remove the need to store
data in Wikisource, or Wikipedia, or Commons at all.

Alex


2013/6/10 David Cuenca 

> @Alex: but what do you think of storing the source information in "Index:"
> pages for all works stored in Wikisource, even if they don't have a
> supporting scan?
>
> That was the original question :)
>
> About your proposed library, it would be more useful if it could modify
> data in Wikidata, not only import it. Besides, if the Wikidata client is
> installed in Wikisource, the inclusion syntax already takes care of
> displaying data...
>
> Micru
>
>
> On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo wrote:
>
>> I don't see the need to change deeply Index/ns0 relationship, while I
>> appreciate the idea "promote coherence reducing redundance" (many years ago
>> I painfully used dBase III - dBase IV and I learned that principle by "try
>> and learn").
>>
>> Here:
>> http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
>> brief message about relationship among wikidata, commons, wikisource and
>> any other project. Don't follow the link, it's so short that I copy it here
>> (but if you like it, comment it there):
>>
>> Scribunto-Lua and Wikidata
>> I'd like a library to get Wikidata content; it would be a good idea IMHO
>> to access to Wikidata data in plain form, just as such data would be Lua
>> tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)
>>
>>
>> If such a Lua library could be built, to import data from wikidata would
>> be as simple, as writing a template, and data will be self-aligned.
>>
>> Alex
>>
>>
>> 2013/6/10 Aarti K. Dwivedi 
>>
>> Hi,
>>>
>>> There was a thread some time ago where there were talks of having
>>> books which were born digital. These pages wouldn't have scans.
>>> What the 'Index' page would have in these cases is something I am not
>>> very sure about.
>>>
>>> Cheers,
>>> Rtdwivedi
>>>
>>>
>>> On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca wrote:
>>>
>>>> With the deployment of Wikidata it is a good moment to re-examine what
>>>> "Index" pages are and what should be their function.
>>>> The most direct transition to a Wikidata-supported Wikisource could be
>>>> something like this:
>>>> https://sites.google.com/site/dacuetu/BookData.pdf
>>>>
>>>> That would allow:
>>>> - to share data book data between Commons, Wikisource and Wikipedia
>>>> - to update it, when any of the sites has been updated
>>>> - to facilitate better search functions (like searches by author, or
>>>> topic, limiting the date range or the language)
>>>>
>>>> That would only apply to those texts which use a "Index:" page, so now
>>>> the question is, what do we do with books that do not have supporting scans
>>>> (and therefore no index page)?
>>>>
>>>> Some possible options:
>>>> a) ignore pages without sources and focus only on works with supporting
>>>> scans
>>>> b) use ns0 pages also as data containers (instead of, or in addition to
>>>> "Index" pages)
>>>> c) create "Index:" pages for all works, with or without scans. Use that
>>>> instead of "Template:Textinfo"
>>>>
>>>> Personally I prefer "option c", even if it would require to rename
>>>> "Index:" to "Source:" to make more clear what are those pages, however I
>>>> would like to hear the opinion of other wikisourcerors about this.
>>>>
>>>> Cheers,
>>>> Micru



Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread David Cuenca
@Alex: but what do you think of storing the source information in "Index:"
pages for all works stored in Wikisource, even if they don't have a
supporting scan?

That was the original question :)

About your proposed library, it would be more useful if it could modify
data in Wikidata, not only import it. Besides, if the Wikidata client is
installed in Wikisource, the inclusion syntax already takes care of
displaying data...

Micru

On Mon, Jun 10, 2013 at 5:38 PM, Alex Brollo  wrote:

> I don't see the need to change deeply Index/ns0 relationship, while I
> appreciate the idea "promote coherence reducing redundance" (many years ago
> I painfully used dBase III - dBase IV and I learned that principle by "try
> and learn").
>
> Here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming a
> brief message about relationship among wikidata, commons, wikisource and
> any other project. Don't follow the link, it's so short that I copy it here
> (but if you like it, comment it there):
>
> Scribunto-Lua and Wikidata
> I'd like a library to get Wikidata content; it would be a good idea IMHO
> to access to Wikidata data in plain form, just as such data would be Lua
> tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)
>
>
> If such a Lua library could be built, to import data from wikidata would
> be as simple, as writing a template, and data will be self-aligned.
>
> Alex
>
>
> 2013/6/10 Aarti K. Dwivedi 
>
> Hi,
>>
>> There was a thread some time ago where there were talks of having
>> books which were born digital. These pages wouldn't have scans.
>> What the 'Index' page would have in these cases is something I am not
>> very sure about.
>>
>> Cheers,
>> Rtdwivedi
>>
>>
>> On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca  wrote:
>>
>>> With the deployment of Wikidata it is a good moment to re-examine what
>>> "Index" pages are and what should be their function.
>>> The most direct transition to a Wikidata-supported Wikisource could be
>>> something like this:
>>> https://sites.google.com/site/dacuetu/BookData.pdf
>>>
>>> That would allow:
>>> - to share data book data between Commons, Wikisource and Wikipedia
>>> - to update it, when any of the sites has been updated
>>> - to facilitate better search functions (like searches by author, or
>>> topic, limiting the date range or the language)
>>>
>>> That would only apply to those texts which use a "Index:" page, so now
>>> the question is, what do we do with books that do not have supporting scans
>>> (and therefore no index page)?
>>>
>>> Some possible options:
>>> a) ignore pages without sources and focus only on works with supporting
>>> scans
>>> b) use ns0 pages also as data containers (instead of, or in addition to
>>> "Index" pages)
>>> c) create "Index:" pages for all works, with or without scans. Use that
>>> instead of "Template:Textinfo"
>>>
>>> Personally I prefer "option c", even if it would require to rename
>>> "Index:" to "Source:" to make more clear what are those pages, however I
>>> would like to hear the opinion of other wikisourcerors about this.
>>>
>>> Cheers,
>>> Micru
>>>


-- 
Etiamsi omnes, ego non


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Alex Brollo
I don't see the need to deeply change the Index/ns0 relationship, although I
appreciate the idea of "promoting coherence by reducing redundancy" (many
years ago I painfully used dBase III and dBase IV, and I learned that
principle by trial and error).

Here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto/Brainstorming is
a brief message about the relationship among Wikidata, Commons, Wikisource
and any other project. No need to follow the link; it's so short that I copy
it here (but if you like it, comment on it there):

Scribunto-Lua and Wikidata
I'd like a library to get Wikidata content; it would be a good idea IMHO to
access Wikidata data in plain form, just as if such data were Lua
tables/variables. --Alex brollo (talk) 13:06, 10 June 2013 (UTC)


If such a Lua library could be built, importing data from Wikidata would be
as simple as writing a template, and the data would be self-aligned.
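
To make the "plain form" idea concrete, here is a minimal sketch in Python rather than Lua. It is only an illustration, not an existing library: the function name `flatten_entity`, the `PROPERTY_NAMES` mapping and the sample entity are all assumptions, and the sample uses a simplified form of Wikidata's nested entity JSON with invented values.

```python
from typing import Any

# Hypothetical mapping from property IDs to readable field names.
PROPERTY_NAMES = {"P50": "author", "P577": "publication_date"}

# Invented sample entity in a simplified form of Wikidata's entity JSON.
# Real entities carry more structure (typed values, qualifiers, references).
SAMPLE_ENTITY: dict[str, Any] = {
    "id": "Q0",
    "labels": {"en": {"language": "en", "value": "Example Book"}},
    "claims": {
        "P50": [{"mainsnak": {"datavalue": {"type": "string", "value": "Jane Doe"}}}],
        "P577": [{"mainsnak": {"datavalue": {"type": "string", "value": "1901"}}}],
    },
}


def flatten_entity(
    entity: dict[str, Any], names: dict[str, str], lang: str = "en"
) -> dict[str, Any]:
    """Reduce a nested entity to a plain dict of field name -> list of values."""
    flat: dict[str, Any] = {
        "id": entity["id"],
        "label": entity.get("labels", {}).get(lang, {}).get("value"),
    }
    for prop_id, statements in entity.get("claims", {}).items():
        values = []
        for statement in statements:
            datavalue = statement.get("mainsnak", {}).get("datavalue")
            if datavalue is not None:  # skip statements that carry no value
                values.append(datavalue["value"])
        # Fall back to the raw property ID when no readable name is known.
        flat[names.get(prop_id, prop_id)] = values
    return flat


book = flatten_entity(SAMPLE_ENTITY, PROPERTY_NAMES)
print(book["label"], book["author"])
```

With data in this flat shape, a template-like consumer only needs ordinary table lookups such as `book["author"]`, which is the convenience the message above asks for.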

Alex


2013/6/10 Aarti K. Dwivedi 

> Hi,
>
> There was a thread some time ago about hosting books that were born
> digital. Such works wouldn't have scans, so I am not sure what the
> 'Index' page would contain in these cases.
>
> Cheers,
> Rtdwivedi
>
>
> On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca  wrote:
>
>> With the deployment of Wikidata it is a good moment to re-examine what
>> "Index" pages are and what their function should be.
>> The most direct transition to a Wikidata-supported Wikisource could be
>> something like this:
>> https://sites.google.com/site/dacuetu/BookData.pdf
>>
>> That would allow:
>> - to share book data between Commons, Wikisource and Wikipedia
>> - to update it whenever any of the sites is updated
>> - to facilitate better search functions (like searches by author, or
>> topic, limiting the date range or the language)
>>
>> That would only apply to those texts which use an "Index:" page, so now
>> the question is, what do we do with books that do not have supporting scans
>> (and therefore no index page)?
>>
>> Some possible options:
>> a) ignore pages without sources and focus only on works with supporting
>> scans
>> b) use ns0 pages also as data containers (instead of, or in addition to
>> "Index" pages)
>> c) create "Index:" pages for all works, with or without scans. Use that
>> instead of "Template:Textinfo"
>>
>> Personally I prefer "option c", even if it would require renaming
>> "Index:" to "Source:" to make clearer what those pages are. However, I
>> would like to hear the opinion of other wikisourcerors about this.
>>
>> Cheers,
>> Micru


Re: [Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread Aarti K. Dwivedi
Hi,

There was a thread some time ago about hosting books that were born
digital. Such works wouldn't have scans, so I am not sure what the
'Index' page would contain in these cases.

Cheers,
Rtdwivedi


On Mon, Jun 10, 2013 at 10:47 PM, David Cuenca  wrote:

> With the deployment of Wikidata it is a good moment to re-examine what
> "Index" pages are and what their function should be.
> The most direct transition to a Wikidata-supported Wikisource could be
> something like this:
> https://sites.google.com/site/dacuetu/BookData.pdf
>
> That would allow:
> - to share book data between Commons, Wikisource and Wikipedia
> - to update it whenever any of the sites is updated
> - to facilitate better search functions (like searches by author, or
> topic, limiting the date range or the language)
>
> That would only apply to those texts which use an "Index:" page, so now the
> question is, what do we do with books that do not have supporting scans
> (and therefore no index page)?
>
> Some possible options:
> a) ignore pages without sources and focus only on works with supporting
> scans
> b) use ns0 pages also as data containers (instead of, or in addition to
> "Index" pages)
> c) create "Index:" pages for all works, with or without scans. Use that
> instead of "Template:Textinfo"
>
> Personally I prefer "option c", even if it would require renaming
> "Index:" to "Source:" to make clearer what those pages are. However, I
> would like to hear the opinion of other wikisourcerors about this.
>
> Cheers,
> Micru


-- 
Aarti K. Dwivedi


[Wikisource-l] About texts without supporting files and "Index:" pages

2013-06-10 Thread David Cuenca
With the deployment of Wikidata it is a good moment to re-examine what
"Index" pages are and what their function should be.
The most direct transition to a Wikidata-supported Wikisource could be
something like this:
https://sites.google.com/site/dacuetu/BookData.pdf

That would allow:
- to share book data between Commons, Wikisource and Wikipedia
- to update it whenever any of the sites is updated
- to facilitate better search functions (like searches by author, or topic,
limiting the date range or the language)
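
As a rough illustration of the search use case in the last bullet, the sketch below filters shared book records by author, topic, date range and language. It is written in Python and is not part of the linked proposal: the `BookRecord` fields, the `search` function and the catalogue entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class BookRecord:
    """Hypothetical shared metadata record for one work."""
    title: str
    author: str
    year: int
    language: str
    topics: FrozenSet[str] = frozenset()


def search(
    records: Iterable[BookRecord],
    *,
    author: Optional[str] = None,
    topic: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    language: Optional[str] = None,
) -> List[BookRecord]:
    """Return the records matching every criterion that was supplied."""
    matches = []
    for record in records:
        if author is not None and record.author != author:
            continue
        if topic is not None and topic not in record.topics:
            continue
        if year_from is not None and record.year < year_from:
            continue
        if year_to is not None and record.year > year_to:
            continue
        if language is not None and record.language != language:
            continue
        matches.append(record)
    return matches


# Invented catalogue entries, for illustration only.
CATALOGUE = [
    BookRecord("Example A", "Jane Doe", 1890, "en", frozenset({"botany"})),
    BookRecord("Example B", "Jane Doe", 1925, "it", frozenset({"history"})),
    BookRecord("Example C", "John Roe", 1901, "en", frozenset({"history"})),
]

hits = search(CATALOGUE, topic="history", year_from=1900, year_to=1910, language="en")
print([record.title for record in hits])
```

Every filter is optional, so the same function covers a plain author search as well as a combined topic, date and language query.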

That would only apply to those texts which use an "Index:" page, so now the
question is, what do we do with books that do not have supporting scans
(and therefore no index page)?

Some possible options:
a) ignore pages without sources and focus only on works with supporting
scans
b) use ns0 pages also as data containers (instead of, or in addition to
"Index" pages)
c) create "Index:" pages for all works, with or without scans. Use that
instead of "Template:Textinfo"
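
To show what option c could mean in data terms, here is a small Python sketch in which every work gets the same kind of record and the scan is simply optional. The `SourceRecord` class, its fields and the `describe` helper are hypothetical names chosen for illustration, not an existing Wikisource or Wikidata structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRecord:
    """Hypothetical "Index:"/"Source:" record shared by all works."""
    title: str
    author: str
    scan_file: Optional[str] = None   # assumed: name of the scan file, None if there is no scan
    provenance: Optional[str] = None  # assumed: note on where a scanless text came from

    @property
    def has_scan(self) -> bool:
        return self.scan_file is not None


def describe(record: SourceRecord) -> str:
    """Summarise how a work is sourced, with or without a scan."""
    if record.has_scan:
        return f"{record.title}: backed by scan {record.scan_file}"
    origin = record.provenance or "unknown origin"
    return f"{record.title}: no scan ({origin})"


# Invented examples: one scanned work and one born-digital work.
scanned = SourceRecord("Example A", "Jane Doe", scan_file="Example A.djvu")
born_digital = SourceRecord("Example B", "John Roe", provenance="born digital")

print(describe(scanned))
print(describe(born_digital))
```

The point of the design is that tools reading the record never need a separate code path for scanless works; they only check whether the scan field is filled in.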

Personally I prefer "option c", even if it would require renaming "Index:"
to "Source:" to make clearer what those pages are. However, I would like
to hear the opinion of other wikisourcerors about this.

Cheers,
Micru