On 11/24/2014 11:13 PM, Federico Leva (Nemo) wrote:
When I think of this, I agree that OCR is the main issue. But it's not
necessarily the one which worries me most, because tesseract is
something living outside the wiki which can be improved even if the
wiki has design issues. If we try
On 24 November 2014 at 13:51, Andrea Zanni zanni.andre...@gmail.com wrote:
Another great accomplishment could be *giving back proofread OCR* to GLAMs:
think about libraries (or the Internet Archive!) giving us ancient texts, and us
giving them back a perfect DjVu or PDF with mapped text inside...
How would I do that now? Wikisource pages are not structured data (though
Wikimedia Commons image metadata will soon be!), so there is not a clear
way to use the Wikisource API to extract just the relevant transcribed text
on the page as a field. And on top of that, any text you do extract
On 25 November 2014 at 11:33, Andrea Zanni zanni.andre...@gmail.com wrote:
How would I do that now? Wikisource pages are not structured data (though
Wikimedia Commons image metadata will soon be!), so there is not a clear
way to use the Wikisource API to extract just the relevant transcribed
On Tue, Nov 25, 2014 at 6:34 PM, Dominic McDevitt-Parks mcdev...@gmail.com
wrote:
You have a good point, though. One of the differences between Wikisource
and most other platforms is that it is actually richly formatted. It's kind
of a shame to strip all that formatting information out when
Please keep up this good discussion :-)
We have the Wikisource contest on it.source right now,
so this mail is not going to be as long and detailed as I hoped.
I agree with Vigneron that the Survey report is a good start:
having written it myself, I'm well aware that it's not perfect, and that
However, in 5 years I've yet to find ONE person that says, yes
Nemo, you're right, Wikisource should be 10 or 50 times as big as
Wikipedia, let's plan for that. Probably I'm wrong. :)
+1. Count me in !
It will be hard and I'm afraid we might lose some good users in the process
if we
I strongly agree with everything, Nemo.
I also remember hearing Sj, in an official Board Q&A, say explicitly that
he foresaw Wikisource as bigger than Wikipedia!
But Wikisource is out of the strategy, we know that.
We ask, keep asking and will continue to ask, but we are still out of the
2014-11-23 2:55 GMT+01:00 Wiki Billinghurst billinghurstw...@gmail.com:
What do we see as the next components for Wikisource?
What are our major hurdles for system development?
If we were offered development help where do people think that we
should be making use of that help? Is it
In thinking further about this, I think one of our major hurdles in
getting casual transcription is the formatting and templates aspects.
So is the migration to Visual Editor one of our major progression
points?
Regards, Billinghurst
On Sun, Nov 23, 2014 at 9:08 PM, Nicolas VIGNERON
2014-11-23 13:02 GMT+01:00 Wiki Billinghurst billinghurstw...@gmail.com:
In thinking further about this, I think one of our major hurdles in
getting casual transcription is the formatting and templates aspects.
So is the migration to Visual Editor one of our major progression
points?
On 11/23/2014 02:55 AM, Wiki Billinghurst wrote:
What do we see as the next components for Wikisource?
What are our major hurdles for system development?
If we were offered development help where do people think that we
should be making use of that help? Is it incremental fixes,
transactional changes, or are we wanting transformational