Re: [Wikidata-l] Introduction and some questions on Wikidata

2012-11-15 Thread Lydia Pintscher
Hey Bernardo

On Wed, Nov 14, 2012 at 4:02 PM, Bernardo Najlis b52@gmail.com wrote:
 Hi Everyone,

 My name is Bernardo and I work as a Business Intelligence Consultant in
 Toronto, Canada. I've been reading pretty much everything about Wikidata
 over the past few days, and I'm really excited about this project! I would
 love to participate and collaborate in developing tools to bring data from
 external sources, particularly the vast open data sets that are so common
 right now.

Have you already seen
http://meta.wikimedia.org/wiki/Wikidata/Data_collaborators ?

 I have some questions about the project and how I can participate that you
 might be able to help me with:

 1) I know the data import is planned for a future phase 2. Do you think it
 would be useful to start working on some of that now?

It really depends on what you want to do right now. But the technical
basis for actual data beyond language links isn't there yet, so there
isn't too much to do.
People have already written bots to import language links that you
might want to have a look at.

 2) I also learned that you guys had a round table on June 21 about this
 topic (wish I could have known about Wikidata and been there!) but I
 couldn't find what the outcome of that discussion was. Is there anywhere
 where I can find what was discussed there?

Hmm, I seem to remember there was an etherpad. Does anyone who was there have a link?

 3) Is there anyone else on the list working on the subject of how to bring
 data in? I have some ideas on the subject, but again, don't know if this is
 too early in the project.

A first step would be to define what data exactly you've got in mind
and then get consensus in the community that this is something that is
wanted. I think it's still a bit too early for the latter, though. A
real decision can probably only be made after the second phase has
been in use for a bit by editors with real data.


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata

Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Wikidata license (was Introduction and some questions on Wikidata)

2012-11-15 Thread Daniel Kinzler
On 15.11.2012 20:06, Helder . wrote:
 On Wed, Nov 14, 2012 at 9:50 PM, Marco Fleckinger
 marco.fleckin...@wikipedia.at wrote:
 (...)
 First of all the priority lies on data already present on Wikipedia. 
 Wikidata should not be a data storage for everything structured in the 
 world, so we should first start to transfer data already present on 
 Wikipedia to Wikidata.
 (...)
 Wouldn't that kind of transfer be a violation of the CC-BY-SA license
 used on Wikipedia, considering it is not compatible[1] with CC0?

If the data is actually copyrightable, then yes. Facts as such are not
copyrightable. But if there was a bot transferring stuff from infoboxes, it
should at least check for any actual text (e.g. long values with spaces), and
not transfer it, because of license reasons.
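
The check described above could look roughly like the following Python sketch. The function names, the 40-character cutoff and the sample infobox are assumptions made for illustration; the thread does not specify a threshold or an implementation.

```python
def is_probably_text(value: str, max_len: int = 40) -> bool:
    """Heuristic: a long value that contains spaces is likely free text."""
    stripped = value.strip()
    return len(stripped) > max_len and " " in stripped


def transferable_fields(infobox: dict[str, str]) -> dict[str, str]:
    """Keep only the fields whose values do not look like free text."""
    return {k: v for k, v in infobox.items() if not is_probably_text(v)}


# Hypothetical infobox values, not taken from any real article.
infobox = {
    "population": "2615060",
    "country": "Canada",
    "notes": "The city is widely regarded as one of the most "
             "multicultural places in the world.",
}

# Only the short, fact-like fields survive the filter.
print(sorted(transferable_fields(infobox)))
```

A length-plus-spaces test is deliberately crude: it errs on the side of skipping values, which fits the goal of not transferring anything that might be copyrightable.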

-- daniel


-- 
Daniel Kinzler, Software Architect
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.




Re: [Wikidata-l] Wikidata license (was Introduction and some questions on Wikidata)

2012-11-15 Thread Sven
Automatically copying over infoboxes is something I don't advise. Unlike 
current infoboxes, which are rarely sourced, every point of data on Wikidata 
should be DIRECTLY and INDIVIDUALLY sourced. We can use the same source 37 
times, but each bit of information that would ordinarily have a field on an 
infobox needs to have its own source; we can't just say everything on this 
page is from . If we do automatic importing, it's going to be an uphill 
battle from day one to source things.
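
To make the sourcing requirement concrete, here is a small Python sketch. It is not Wikidata's actual data model; the `Statement` class, the `unsourced` helper and all sample values are hypothetical. It shows one source being reused across several statements while each statement still carries its own reference, and how statements imported without a source would stand out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    prop: str
    value: str
    source: Optional[str] = None  # reference attached to this statement alone


def unsourced(statements: list[Statement]) -> list[Statement]:
    """Return the statements that have no reference of their own."""
    return [s for s in statements if not s.source]


census = "Hypothetical census report"
statements = [
    Statement("population", "2615060", source=census),
    Statement("area", "630.2 km2", source=census),  # same source, cited again
    Statement("mayor", "Example Person"),  # what a bare infobox import yields
]

# Lists the properties that would still need sourcing after an import.
print([s.prop for s in unsourced(statements)])
```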

On Nov 15, 2012, at 4:50 PM, Gregor Hagedorn g.m.haged...@gmail.com wrote:

 If the data is actually copyrightable, then yes. Facts as such are not
 copyrightable. But if there was a bot transferring stuff from infoboxes, it
 should at least check for any actual text (e.g. long values with spaces), and
 not transfer it, because of license reasons.
 
 I agree. Just to clarify what actual text should mean: Although a
 short sentence with several words may occasionally be a copyrightable
 text (e.g. a poem), it is very rarely so. On Wikipedia infoboxes, due
 to scope, purpose and style, this can almost be excluded.
 
 It is not desirable to exclude brief scope notes or source notes,
 which occasionally occur in Wikipedia infoboxes, just because they
 contain several words. I personally would recommend an extraction
 dry run and a manual check of parameters that have more than perhaps
 12-15 words, to decide whether they are creative (= copyrightable) or plain
 expressions of fact or sources (= not copyrightable).
 
 Gregor
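
The dry run suggested in the quoted message could be sketched in Python as below. The function name `dry_run`, the default limit of 12 words (the lower end of the "12-15 words" mentioned) and the sample parameters are assumptions for illustration. Nothing is written anywhere: the function only sorts parameters into those that can be extracted and those a person should look at.

```python
def dry_run(params: dict[str, str], word_limit: int = 12):
    """Split parameters into those to extract and those to review by hand."""
    extract, review = {}, {}
    for name, value in params.items():
        # Values longer than the word limit are set aside, not rejected.
        target = review if len(value.split()) > word_limit else extract
        target[name] = value
    return extract, review


# Hypothetical infobox parameters.
params = {
    "area": "630.2 km2",
    "source_note": "Figures from the 2011 census",
    "motto": " ".join(["word"] * 14),  # stand-in for a 14-word value
}

extract, review = dry_run(params)
print("extract:", sorted(extract))
print("review by hand:", sorted(review))
```

Brief source notes stay in the extract set under this rule, and only the unusually long value is queued for a manual decision.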
 


Re: [Wikidata-l] Wikidata license (was Introduction and some questions on Wikidata)

2012-11-15 Thread Sven Manguard
The argument above is about automatically copying over content from other
projects. My point is that the license isn't the problem with that approach,
but that there is still a problem with it.

Sven

On Thu, Nov 15, 2012 at 7:05 PM, Gregor Hagedorn g.m.haged...@gmail.com wrote:

 On 15 November 2012 23:35, Sven svenmangu...@gmail.com wrote:
  Automatically copying over infoboxes is something I don't advise. Unlike
 current infoboxes, which are rarely sourced, every point of data on
 Wikidata should be DIRECTLY and INDIVIDUALLY sourced. We can use the same
 source 37 times, but each bit of information that would ordinarily have a
 field on an infobox needs to have its own source; we can't just say
 everything on this page is from . If we do automatic importing, it's
 going to be an uphill battle from day one to source things.

 (The argument above is independent of licensing, so this should
 perhaps be discussed in a separate thread?)

 Gregor



[Wikidata-l] Wikidata demo repo is being spammed.

2012-11-15 Thread Snaevar
Hello,

I wanted to let you guys know that there is spam being added to items 
(descriptions and labels) on the demo repo.

Regards,
Snaevar
