Re: [Wikidata] Orphaned items

2015-05-31 Thread Stas Malyshev
Hi!

> As a practical suggestion for helping:
> http://tools.wmflabs.org/wikidata-todo/random_item_without_instance.php

I would also suggest
http://tools.wmflabs.org/wikidata-todo/important_blank_items.php which
lists the most-linked items from wikis that have no connection to other
items whatsoever. Some of them are tough to classify or link to
anything, but some are rather obvious.

-- 
Stas Malyshev
smalys...@wikimedia.org

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Orphaned items

2015-05-31 Thread Stas Malyshev
Hi!

> If an item has no statements, no sitelinks, and isn't used anywhere, how do you
> tell what it even *is*? The label only? Is that sufficient and/or useful? What
> would be lost by deleting it? Maybe, if it has labels in many languages, with

Unless its purpose is obvious (i.e. the label/description/talk page
describes it clearly), I'd say it might be more dangerous to keep it
around. If people start to use it with different meanings, and then
others add independent articles on the wikis that produce different
items with the same meaning, in multiple languages, pretty soon we'd
have quite a mess on our hands. An empty item by itself, with no links,
no good labels and no data or almost no data (like "John Smith, human"
and that's it), is not worth much, IMHO.

Yes, I don't have good formal criteria for "obvious", so I imagine we'd
have to take it case by case, or maybe think some up.

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Orphaned items

2015-05-31 Thread Andre Engels
I hit 'send' by accident. What I wanted to say is:

To get back to a hopefully more fruitful discussion: My opinion is
that an item can be deleted if a reasonably knowledgeable person
cannot determine whether or not the item is about a given
person/place/subject/whatever. Do you or do you not agree?

André Engels


On Sun, May 31, 2015 at 7:58 PM, Andre Engels wrote:
> And when you look at the discussion, you see that the message that
> you are referring to said:
>
> "If you have an item that says someone won a Nobel Prize, but not
> when or which, and also does *not* have a label, that item is quite
> useless; it's impossible to tell which person it is even referring
> to."
>
> The only thing I can conclude is that you are against the removal of
> items without a label because they probably do have a label. Which in
> my opinion is UTTER BOLLOCKS.
>
> To get back to

Re: [Wikidata] Orphaned items

2015-05-31 Thread Andre Engels
And when you look at the discussion, you see that the message that
you are referring to said:

"If you have an item that says someone won a Nobel Prize, but not
when or which, and also does *not* have a label, that item is quite
useless; it's impossible to tell which person it is even referring
to."

The only thing I can conclude is that you are against the removal of
items without a label because they probably do have a label. Which in
my opinion is UTTER BOLLOCKS.

To get back to


On Sun, May 31, 2015 at 6:46 PM, Gerard Meijssen wrote:
> Hoi
> When you look at the statistics, you will find that we aggressively pursue
> the inclusion of labels. When there is no label in your language it is
> tough. When you use Reasonator there is no problem; you will always see a
> label in whatever language is available.
>
> My problem is that we know that the prize was won. The item is likely to
> have a label. For the rest ... as they say in double Dutch.. "search it but
> out".
> Thanks,
>  GerardM

Re: [Wikidata] Orphaned items

2015-05-31 Thread Daniel Kinzler
On 31.05.2015 at 17:23, Gerard Meijssen wrote:
> Hoi,
> Typically such items were created because the article about the award mentions
> them. So it is all a matter of perspective. When the award is leading, the
> information about an award winner is in the article on the award. Having all
> these awardees on the article is not so great, it is not what we do.

You mean the Wikipedia article references a Wikidata Item as an in-text link? Is
that done? I have never heard of that, and it seems like a violation of the "no
interwiki links in the article body" rule.


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi
When you look at the statistics, you will find that we aggressively pursue
the inclusion of labels. When there is no label in your language it is
tough. When you use Reasonator there is no problem; you will always see a
label in whatever language is available.

My problem is that we know that the prize was won. The item is likely to
have a label. For the rest ... as they say in double Dutch.. "search it but
out".
Thanks,
 GerardM

On 31 May 2015 at 18:40, Andre Engels wrote:

> And what if Q9 does not have a label? How do I get from the
> information "Q9 won the Nobel Prize in Literature" to "Q9
> is/is not Patrick Modiano"?
>
> André


Re: [Wikidata] Orphaned items

2015-05-31 Thread Andre Engels
And what if Q9 does not have a label? How do I get from the
information "Q9 won the Nobel Prize in Literature" to "Q9
is/is not Patrick Modiano"?

André

On Sun, May 31, 2015 at 6:29 PM, Gerard Meijssen wrote:
> Hoi,
> Not enough data. Q9 may have a label that is "Patrick Modiano". Your
> first challenge is to find out that your Patrick Modiano is indeed that
> particular one. Given that you know what award was won, you have a start.
> Thanks,
>   GerardM



-- 
André Engels, andreeng...@gmail.com



Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi,
Not enough data. Q9 may have a label that is "Patrick Modiano". Your
first challenge is to find out that your Patrick Modiano is indeed that
particular one. Given that you know what award was won, you have a start.
Thanks,
  GerardM

On 31 May 2015 at 17:40, Andre Engels wrote:

> And that helps me how? Most awards have been won by more than one
> person. If I know that Q9 has won the Nobel Prize in literature,
> and I know a fact about Patrick Modiano, should I add that fact to
> Q9 or should I create a new item?
>
> André


Re: [Wikidata] Orphaned items

2015-05-31 Thread Andre Engels
And that helps me how? Most awards have been won by more than one
person. If I know that Q9 has won the Nobel Prize in literature,
and I know a fact about Patrick Modiano, should I add that fact to
Q9 or should I create a new item?

André

On Sun, May 31, 2015 at 5:23 PM, Gerard Meijssen wrote:
> Hoi,
> Typically such items were created because the article about the award
> mentions them. So it is all a matter of perspective. When the award is
> leading, the information about an award winner is in the article on the
> award. Having all these awardees on the article is not so great, it is not
> what we do.
>
> Impossible? Certainly not. Read the damn article (on the award).
> Thanks,
>  GerardM



-- 
André Engels, andreeng...@gmail.com



Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi,
Typically such items were created because the article about the award
mentions them. So it is all a matter of perspective. When the award is
leading, the information about an award winner is in the article on the
award. Having all these awardees on the article is not so great, it is not
what we do.

Impossible? Certainly not. Read the damn article (on the award).
Thanks,
 GerardM

On 31 May 2015 at 17:06, Daniel Kinzler wrote:

> On 31.05.2015 at 15:21, Gerard Meijssen wrote:
> > Hoi,
> > When someone or something received an award, it is needed if only to
> > complete the list of recipients of that award. There is no benchmark for
> > enough information. The notion that a Nobel award winner is not relevant
> > is poppycock. With automated descriptions awards do show.
>
> If you have an item that says someone won a Nobel Prize, but not when or
> which, and also does *not* have a label, that item is quite useless; it's
> impossible to tell which person it is even referring to.
>
> That is what Markus is talking about. For people, if there is a label, we
> already have pretty good info. But if there is no label, we have a
> problem, and if there isn't any other identifying info, the item is useless.


Re: [Wikidata] Orphaned items

2015-05-31 Thread Daniel Kinzler
On 31.05.2015 at 17:01, Magnus Manske wrote:
> I do agree with that notion, to a large degree. It is probably more important
> to give at least some statements to items with associated Wikipedia articles
> than to delete empty items that, by their very definition, are not in the way
> of anything.

If an item has no statements, no sitelinks, and isn't used anywhere, how do you
tell what it even *is*? The label only? Is that sufficient and/or useful? What
would be lost by deleting it? Maybe, if it has labels in many languages, with
good descriptions, that gives enough info for identifying the item, and it is
useful to keep it. But "James Herrod" / "Person", with no extra info... what use
is it?
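
The test described above (no statements, no sitelinks, not used anywhere) can
be written down as a small predicate. This is only a sketch in Python, under
assumptions: the item is available as a dict shaped like Wikidata's entity
JSON (with "claims" and "sitelinks" keys), and the number of incoming links
has been looked up separately. The function name and the incoming_links
parameter are made up for illustration and are not part of any tool mentioned
in this thread.

```python
def is_orphan_candidate(entity: dict, incoming_links: int) -> bool:
    """Return True if the item has no statements, no sitelinks,
    and nothing else links to it.

    `entity` is assumed to look like Wikidata entity JSON;
    `incoming_links` is a hypothetical count supplied by the caller.
    """
    has_statements = bool(entity.get("claims"))
    has_sitelinks = bool(entity.get("sitelinks"))
    return not has_statements and not has_sitelinks and incoming_links == 0


# An item with only a label and a description, as in the example above.
item = {
    "labels": {"en": {"language": "en", "value": "James Herrod"}},
    "descriptions": {"en": {"language": "en", "value": "Person"}},
    "claims": {},
    "sitelinks": {},
}
print(is_orphan_candidate(item, incoming_links=0))  # True
```

Note that the predicate only flags candidates. Whether the labels and
descriptions are enough to identify the item is still a judgement call.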


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Orphaned items

2015-05-31 Thread Daniel Kinzler
On 31.05.2015 at 15:21, Gerard Meijssen wrote:
> Hoi,
> When someone or something received an award, it is needed if only to complete
> the list of recipients of that award. There is no benchmark for enough
> information. The notion that a Nobel award winner is not relevant is
> poppycock. With automated descriptions awards do show.

If you have an item that says someone won a Nobel Prize, but not when or which,
and also does *not* have a label, that item is quite useless; it's impossible
to tell which person it is even referring to.

That is what Markus is talking about. For people, if there is a label, we
already have pretty good info. But if there is no label, we have a problem, and
if there isn't any other identifying info, the item is useless.


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Orphaned items

2015-05-31 Thread Jane Darnell
Gerard, that is a good point. I believe that percentage has been dropping,
no? The question is whether what is left are new items, or items still
dating from last year's mass upload whose previously connected Wikipedia
articles have since been deleted. If they are new, maybe we should wait. If
they are in the second group, I say they are unsalvageable and should
probably be deleted.

On Sun, May 31, 2015 at 4:38 PM, Gerard Meijssen wrote:

> Hoi,
> Given that 19.21% of all items have no statements whatsoever, it is a bit
> premature to come up with such notions. Let us first fix this and then
> consider what we do not need.
> Thanks,
>  GerardM
>
> https://tools.wmflabs.org/wikidata-todo/stats.php?reverse
>
> On 31 May 2015 at 16:12, Romaine Wiki wrote:
>
>> Hi Markus,
>>
>> I think there must always be some way to make an item unique: a way to
>> identify the item outside Wikidata. This can be a sitelink; for subjects
>> at a fixed location on Earth it is the coordinates, etc. But coordinates
>> alone, without knowing what the subject is, do not make sense either.
>> In some way the item must be identifiable somewhere, somehow.
>>
>> This can be compared with what we (on nl-wiki) see as the basic
>> statements that need to be added to be able to identify a subject
>> on Wikidata and to distinguish it from another subject. (To be able
>> to answer the question: the article X is not connected to Wikidata, to
>> which item should it be connected?)
>> For everything: instance of. For geographically situated subjects we
>> request the country, located in the administrative territorial entity,
>> location (for towns, etc.), coordinates. For people: gender, birth/death
>> date/place, occupation, country. For living creatures: the taxonomic rank,
>> scientific name, parent taxon. For creative works: the author, date.
>>
>> Romaine
>>
>> 2015-05-29 17:42 GMT+02:00 Markus Krötzsch:
>>
>>> Hi Jane, hi Romaine,
>>>
>>> I think we agree that valuable information should be kept if at all
>>> possible. My chief concern is that orphaned items do not have a clear
>>> identity. It's not useful to know that "something" is at a certain
>>> location. The first thing we must determine is what this "thing" is that we
>>> are talking about. Links to Wikipedia are a good way of doing this. Without
>>> them, we need to come up with other identity providing sources. We
>>> certainly have the right infrastructure for this (with all the identifier
>>> properties that point to other databases and authority files).
>>>
>>> The first goal of anyone who wants to save an orphan should be to
>>> connect it with the outside world so as to give it some grounding to build
>>> on.
>>>
>>> A weaker way to provide basic grounding is to make internal connections.
>>> There are cases where this is strong (one can identify items as "the author
>>> of War & Peace" or "the mother of Marie Skłodowska-Curie"), but there are
>>> other cases where it is too weak ("the town in Germany" or "the part of
>>> Europe" do not identify anything). One would need to give this more thought
>>> if one wanted to determine automatically if an item receives its identity
>>> from the incoming/outgoing links to other items.
>>>
>>> Cheers,
>>>
>>> Markus
>>>
>>>
>>> On 29.05.2015 17:05, Romaine Wiki wrote:
>>>
>>>> Hi Markus,
>>>>
>>>> Indeed yes, that is also an issue. It can happen with new articles and
>>>> with older articles.
>>>>
>>>> Some articles get deleted because they duplicate another article, are
>>>> badly written (too bad to keep), or are not an encyclopaedic subject to
>>>> have in an encyclopaedia.
>>>>
>>>> Every day, on nl-wiki we check whether new articles are connected on
>>>> Wikidata. Almost all articles that carry a template marking them as
>>>> nominated for deletion we ignore, and we do not add them to Wikidata. On
>>>> nl-wiki we do this by hand, to make sure all basic statements are added,
>>>> but if this is done by bots, you get a situation where they may not check
>>>> for templates that mark articles for deletion.
>>>>
>>>> If a deleted item has statements, the question is whether this
>>>> information is in itself valuable enough to keep, for use now and/or in
>>>> the future.
>>>>
>>>> Romaine





Re: [Wikidata] Orphaned items

2015-05-31 Thread Magnus Manske
I do agree with that notion, to a large degree. It is probably more
important to give at least some statements to items with associated
Wikipedia articles than to delete empty items that, by their very
definition, are not in the way of anything.

As a practical suggestion for helping:
http://tools.wmflabs.org/wikidata-todo/random_item_without_instance.php



On Sun, May 31, 2015 at 3:39 PM Gerard Meijssen 
wrote:

> Hoi,
> Given that 19,21% of all items have no statements whatsoever, it is a bit
> premature to come with such notions. Let us first fix this and then
> consider what we do not need.
> Thanks,
>  GerardM
>
> https://tools.wmflabs.org/wikidata-todo/stats.php?reverse
>
> On 31 May 2015 at 16:12, Romaine Wiki  wrote:
>
>> Hi Markus,
>>
>> I think there must always be some way to make an item unique. A way to
>> identify the item outside Wikidata. This can be a sitelink, for subjects
>> located on a fixed location on Earth it are the coordinates, etc. But only
>> coordinates without knowing what the subject is does not make sense either.
>> In some way the item must be able to be identified somewhere somehow.
>>
>> This subject can be compared with the subject of what we (on nl-wiki) see
>> as basic statements that need to be added to be able to identify a subject
>> on Wikidata and to be able to differ it from another subject. (To be able
>> to answer the question: the article X is not connected to Wikidata, to
>> which item should it be connected?)
>> For everything instance of. For geographical situated subjects we request
>> the country, located in the administrative territorial entity, location
>> (for towns, etc), coordinates. For people gender, birth/death date/place,
>> occupation, country. For living creatures the taxonomic rank, scientific
>> name, parent taxon. For creative works the author, date.
>>
>> Romaine
>>
>> 2015-05-29 17:42 GMT+02:00 Markus Krötzsch:
>>
>>> Hi Jane, hi Romaine,
>>>
>>> I think we agree that valuable information should be kept if at all
>>> possible. My chief concern is that orphaned items do not have a clear
>>> identity. It's not useful to know that "something" is at a certain
>>> location. The first thing we must determine is what this "thing" is that we
>>> are talking about. Links to Wikipedia are a good way of doing this. Without
>>> them, we need to come up with other identity providing sources. We
>>> certainly have the right infrastructure for this (with all the identifier
>>> properties that point to other databases and authority files).
>>>
>>> The first goal of anyone who wants to safe an orphan should be to
>>> connect it with the outside world so as to give it some grounding to build
>>> on.
>>>
>>> A weaker way to provide basic grounding is to make internal connections.
>>> There are cases where this is strong (one can identify items as "the author
>>> of War & Peace" or "the mother of Marie Skłodowska-Curie"), but there are
>>> other cases where it is too weak ("the town in Germany" or "the part of
>>> Europe" do not identify anything). One would need to give this more thought
>>> if one wanted to determine automatically if an item receives its identity
>>> from the incoming/outgoing links to other items.
>>>
>>> Cheers,
>>>
>>> Markus
>>>
>>>
>>> On 29.05.2015 17:05, Romaine Wiki wrote:
>>>
>>>> Hi Markus,
>>>>
>>>> Indeed yes, that is also an issue. It can happen with new articles and
>>>> with older articles.
>>>>
>>>> Some articles get deleted as they are a duplicate of another article, or
>>>> worse written (too bad to keep), or not an encyclopaedic subject to have
>>>> in an encyclopaedia.
>>>>
>>>> Every day, on nl-wiki we check new articles if they are connected on
>>>> Wikidata. Almost all articles that have a template that marks it as
>>>> nominated for deletion we ignore and we do not add them to Wikidata. On
>>>> nl-wiki we do this by hand, to make sure all basic statements are added,
>>>> but if this is done by bots, you get a situation that they may not check
>>>> for templates that mark articles for deletion.
>>>>
>>>> If a deleted item has statements, the question is if this information
>>>> is in itself valuable to keep to be used and/or for the future.
>>>>
>>>> Romaine
>>>>


Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi,
Given that 19.21% of all items have no statements whatsoever, it is a bit
premature to come up with such notions. Let us first fix this and then
consider what we do not need.
Thanks,
 GerardM

https://tools.wmflabs.org/wikidata-todo/stats.php?reverse

On 31 May 2015 at 16:12, Romaine Wiki  wrote:

> Hi Markus,
>
> I think there must always be some way to make an item unique. A way to
> identify the item outside Wikidata. This can be a sitelink, for subjects
> located on a fixed location on Earth it are the coordinates, etc. But only
> coordinates without knowing what the subject is does not make sense either.
> In some way the item must be able to be identified somewhere somehow.
>
> This subject can be compared with the subject of what we (on nl-wiki) see
> as basic statements that need to be added to be able to identify a subject
> on Wikidata and to be able to differ it from another subject. (To be able
> to answer the question: the article X is not connected to Wikidata, to
> which item should it be connected?)
> For everything instance of. For geographical situated subjects we request
> the country, located in the administrative territorial entity, location
> (for towns, etc), coordinates. For people gender, birth/death date/place,
> occupation, country. For living creatures the taxonomic rank, scientific
> name, parent taxon. For creative works the author, date.
>
> Romaine
>
> 2015-05-29 17:42 GMT+02:00 Markus Krötzsch 
> :
>
>> Hi Jane, hi Romaine,
>>
>> I think we agree that valuable information should be kept if at all
>> possible. My chief concern is that orphaned items do not have a clear
>> identity. It's not useful to know that "something" is at a certain
>> location. The first thing we must determine is what this "thing" is that we
>> are talking about. Links to Wikipedia are a good way of doing this. Without
>> them, we need to come up with other identity providing sources. We
>> certainly have the right infrastructure for this (with all the identifier
>> properties that point to other databases and authority files).
>>
>> The first goal of anyone who wants to safe an orphan should be to connect
>> it with the outside world so as to give it some grounding to build on.
>>
>> A weaker way to provide basic grounding is to make internal connections.
>> There are cases where this is strong (one can identify items as "the author
>> of War & Peace" or "the mother of Marie Skłodowska-Curie"), but there are
>> other cases where it is too weak ("the town in Germany" or "the part of
>> Europe" do not identify anything). One would need to give this more thought
>> if one wanted to determine automatically if an item receives its identity
>> from the incoming/outgoing links to other items.
>>
>> Cheers,
>>
>> Markus
>>
>>
>> On 29.05.2015 17:05, Romaine Wiki wrote:
>>
>>> Hi Markus,
>>>
>>> Indeed yes, that is also an issue. It can happen with new articles and
>>> with older articles.
>>>
>>> Some articles get deleted as they are a duplicate of another article, or
>>> worse written (to bad to keep), or not an encyclopaedic subject to have
>>> in an encyclopaedia.
>>>
>>>
>>> Every day, on nl-wiki we check new articles if they are connected on
>>> Wikidata. Almost all articles that have a template that marks it as
>>> nominated for deletion we ignore and we do not add them to Wikidata. On
>>> nl-wiki we do this by hand, to make sure all basic statements are added,
>>> but if this is done by bots, you get a situation that they may not check
>>> for templates that mark articles for deletion.
>>>
>>> If an deleted item has statements, the question is if this information
>>> is at itself valuable to keep to be used and/or for the future.
>>>
>>> Romaine
>>>
>>>
>>>


Re: [Wikidata] Orphaned items

2015-05-31 Thread Romaine Wiki
Hi Markus,

I think there must always be some way to make an item unique: a way to
identify the item outside Wikidata. This can be a sitelink; for subjects
at a fixed location on Earth it can be the coordinates, etc. But coordinates
alone, without knowing what the subject is, do not make sense either.
In some way the item must be identifiable somewhere, somehow.

This can be compared with what we (on nl-wiki) see as the basic statements
that need to be added to identify a subject on Wikidata and to distinguish
it from another subject. (That is, to be able to answer the question:
article X is not connected to Wikidata, so to which item should it be
connected?)
For everything: instance of. For geographically situated subjects we request
the country, located in the administrative territorial entity, location
(for towns, etc.), and coordinates. For people: gender, birth/death date/place,
occupation, country. For living creatures: the taxonomic rank, scientific
name, parent taxon. For creative works: the author, date.
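
The basic statements listed above can be read as a per-type checklist. Below
is a minimal sketch in Python, assuming an item's claims are available as a
dict keyed by property ID. The `kind` names, the function name and the mapping
from the list above to property IDs (P31 instance of, P17 country, P131
located in the administrative territorial entity, P625 coordinate location,
and so on) are illustrative assumptions, not an existing nl-wiki tool. Death
date/place are left out because they do not apply to living people.

```python
# Hypothetical checklist of "basic statements" per kind of subject.
# The property IDs are my reading of the list above; verify before use.
BASIC_STATEMENTS = {
    "place": ["P17", "P131", "P625"],                 # country, admin entity, coordinates
    "person": ["P21", "P569", "P19", "P106", "P27"],  # gender, birth date/place, occupation, country
    "taxon": ["P105", "P225", "P171"],                # taxon rank, taxon name, parent taxon
    "creative_work": ["P50", "P577"],                 # author, publication date
}


def missing_basic_statements(claims, kind):
    """Return the required property IDs that are absent from `claims`.

    `claims` maps property IDs to lists of values. "Instance of" (P31)
    is required for every item, whatever its kind.
    """
    required = ["P31"] + BASIC_STATEMENTS.get(kind, [])
    return [prop for prop in required if prop not in claims]
```

An item for which this list is empty should have enough statements to answer
the "to which item should article X be connected?" question by hand.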

Romaine

2015-05-29 17:42 GMT+02:00 Markus Krötzsch :

> Hi Jane, hi Romaine,
>
> I think we agree that valuable information should be kept if at all
> possible. My chief concern is that orphaned items do not have a clear
> identity. It's not useful to know that "something" is at a certain
> location. The first thing we must determine is what this "thing" is that we
> are talking about. Links to Wikipedia are a good way of doing this. Without
> them, we need to come up with other identity providing sources. We
> certainly have the right infrastructure for this (with all the identifier
> properties that point to other databases and authority files).
>
> The first goal of anyone who wants to safe an orphan should be to connect
> it with the outside world so as to give it some grounding to build on.
>
> A weaker way to provide basic grounding is to make internal connections.
> There are cases where this is strong (one can identify items as "the author
> of War & Peace" or "the mother of Marie Skłodowska-Curie"), but there are
> other cases where it is too weak ("the town in Germany" or "the part of
> Europe" do not identify anything). One would need to give this more thought
> if one wanted to determine automatically if an item receives its identity
> from the incoming/outgoing links to other items.
>
> Cheers,
>
> Markus
>
>
> On 29.05.2015 17:05, Romaine Wiki wrote:
>
>> Hi Markus,
>>
>> Indeed yes, that is also an issue. It can happen with new articles and
>> with older articles.
>>
>> Some articles get deleted as they are a duplicate of another article, or
>> worse written (to bad to keep), or not an encyclopaedic subject to have
>> in an encyclopaedia.
>>
>>
>> Every day, on nl-wiki we check new articles if they are connected on
>> Wikidata. Almost all articles that have a template that marks it as
>> nominated for deletion we ignore and we do not add them to Wikidata. On
>> nl-wiki we do this by hand, to make sure all basic statements are added,
>> but if this is done by bots, you get a situation that they may not check
>> for templates that mark articles for deletion.
>>
>> If an deleted item has statements, the question is if this information
>> is at itself valuable to keep to be used and/or for the future.
>>
>> Romaine
>>
>>
>>


Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi,
When someone or something has received an award, the item is needed, if only
to complete the list of recipients of that award. There is no benchmark for
enough information. The notion that a Nobel award winner is not
relevant is poppycock. With automated descriptions, awards do show.

When you only consider the current subpar fixed descriptions, you lose
out big time.
Thanks,
 GerardM

On 31 May 2015 at 15:01, Markus Krötzsch 
wrote:

> On 31.05.2015 11:28, Daniel Kinzler wrote:
>
>> Am 30.05.2015 um 21:12 schrieb Gerard Meijssen:
>>
>>> Hoi,
>>> How about people who have received an award and complete the list of
>>> people who
>>> were awarded ? How about people who had a position and complete the list
>>> of
>>> people who held that position ? How about people who are parents between
>>> a
>>> famous grandfather and a famous grandchild ...
>>>
>>
>> In such a case, they would either have a statement stating the
>> award/position,
>> or have incoming, because they are used on statements on other items,
>> e.g. as
>> the father or child.
>>
>> If they have no statements, and are not used in statements, I do not see
>> how
>> they could be structurally significant.
>>
>>
>>
> Interesting example. Just having an award (but no incoming statements or
> sitelinks) might not be enough. It would just tell us "somebody received
> the award". We need some statements/sitelinks/descriptions that tell us who
> exactly that person was.
>
> Jane proposed a good benchmark question: do we have enough information
> about the item to detect and merge duplicates more or less automatically?
> Items where this is not the case should receive special attention -- and be
> either stabilised or deleted eventually. For persons (P31:Q5), the name
> (label) can go a long way to identify items. Awards are probably too weak
> to integrate information over (even specific things like "the 1981 Nobel
> prize in Chemistry" might not have a unique award winner; and the absence
> of a Nobel prize will not be noticed as an incompleteness for a person, so
> an item about the same person that misses the award statement will not be
> detected as duplicate).
>
> Regards,
>
> Markus
>


Re: [Wikidata] Orphaned items

2015-05-31 Thread Gerard Meijssen
Hoi,
Every item can easily be found. Many Wikipedias have "Wikidata search"
enabled. Some have it enabled on en.wp like For many humans I have added
statements like date of death and date of birth. When I cannot disambiguate
properly I often add statements to make it easy to understand who is who.

Given the vagaries of Wikipedia notability I absolutely do not believe in
giving primacy to whatever is said elsewhere. What I do know is that I
prefer searching in Reasonator.
Thanks,
  GerardM

On 31 May 2015 at 13:52, Jane Darnell  wrote:

> I think the key issue here is findability, as Yaroslav pointed out. If the
> incoming links should be there but are not (yet) there, then deletion is
> probably best, since anyone needing those items will probably create a
> double anyway, as the item's findability is zero.
>
> On Sun, May 31, 2015 at 11:28 AM, Daniel Kinzler <
> daniel.kinz...@wikimedia.de> wrote:
>
>> Am 30.05.2015 um 21:12 schrieb Gerard Meijssen:
>> > Hoi,
>> > How about people who have received an award and complete the list of
>> people who
>> > were awarded ? How about people who had a position and complete the
>> list of
>> > people who held that position ? How about people who are parents
>> between a
>> > famous grandfather and a famous grandchild ...
>>
>> In such a case, they would either have a statement stating the
>> award/position,
>> or have incoming, because they are used on statements on other items,
>> e.g. as
>> the father or child.
>>
>> If they have no statements, and are not used in statements, I do not see
>> how
>> they could be structurally significant.
>>
>>
>> --
>> Daniel Kinzler
>> Senior Software Developer
>>
>> Wikimedia Deutschland
>> Gesellschaft zur Förderung Freien Wissens e.V.


Re: [Wikidata] Orphaned items

2015-05-31 Thread Markus Krötzsch

On 31.05.2015 11:28, Daniel Kinzler wrote:
> On 30.05.2015 at 21:12, Gerard Meijssen wrote:
>> Hoi,
>> How about people who have received an award and complete the list of people who
>> were awarded? How about people who had a position and complete the list of
>> people who held that position? How about people who are parents between a
>> famous grandfather and a famous grandchild ...
>
> In such a case, they would either have a statement stating the award/position,
> or have incoming links, because they are used in statements on other items, e.g. as
> the father or child.
>
> If they have no statements, and are not used in statements, I do not see how
> they could be structurally significant.

Interesting example. Just having an award (but no incoming statements or 
sitelinks) might not be enough. It would just tell us "somebody received 
the award". We need some statements/sitelinks/descriptions that tell us 
who exactly that person was.


Jane proposed a good benchmark question: do we have enough information 
about the item to detect and merge duplicates more or less 
automatically? Items where this is not the case should receive special 
attention -- and be either stabilised or deleted eventually. For persons 
(P31:Q5), the name (label) can go a long way to identify items. Awards 
are probably too weak to integrate information over (even specific 
things like "the 1981 Nobel prize in Chemistry" might not have a unique 
award winner; and the absence of a Nobel prize will not be noticed as an 
incompleteness for a person, so an item about the same person that 
misses the award statement will not be detected as duplicate).
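
The benchmark question above can be sketched as a grouping step. This is a
minimal illustration in Python, assuming items are plain dicts with
hypothetical `id`, `label` and `date_of_birth` fields; the choice of
"label plus birth date" as the identity key for persons is my own assumption
for the example, not an agreed criterion.

```python
def identity_key(item):
    """Return a key for matching likely duplicates, or None if the item
    carries too little information to be matched automatically."""
    label = item.get("label", "").strip().lower()
    birth = item.get("date_of_birth")
    if not label or not birth:
        return None
    return (label, birth)


def split_items(items):
    """Split items into groups of likely duplicates and a list of items
    that cannot be matched and therefore need special attention."""
    groups = {}
    unstable = []
    for item in items:
        key = identity_key(item)
        if key is None:
            unstable.append(item["id"])
        else:
            groups.setdefault(key, []).append(item["id"])
    duplicates = [ids for ids in groups.values() if len(ids) > 1]
    return duplicates, unstable
```

An award statement is deliberately not part of the key: as noted above, a
second item about the same person may simply lack it.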


Regards,

Markus



Re: [Wikidata] Orphaned items

2015-05-31 Thread Jane Darnell
I think the key issue here is findability, as Yaroslav pointed out. If the
incoming links should be there but are not (yet) there, then deletion is
probably best, since anyone needing those items will probably create a
duplicate anyway, as the item's findability is zero.

On Sun, May 31, 2015 at 11:28 AM, Daniel Kinzler <
daniel.kinz...@wikimedia.de> wrote:

> Am 30.05.2015 um 21:12 schrieb Gerard Meijssen:
> > Hoi,
> > How about people who have received an award and complete the list of
> people who
> > were awarded ? How about people who had a position and complete the list
> of
> > people who held that position ? How about people who are parents between
> a
> > famous grandfather and a famous grandchild ...
>
> In such a case, they would either have a statement stating the
> award/position,
> or have incoming, because they are used on statements on other items, e.g.
> as
> the father or child.
>
> If they have no statements, and are not used in statements, I do not see
> how
> they could be structurally significant.
>
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.


Re: [Wikidata] Orphaned items

2015-05-31 Thread Daniel Kinzler
On 30.05.2015 at 21:12, Gerard Meijssen wrote:
> Hoi,
> How about people who have received an award and complete the list of people
> who were awarded? How about people who had a position and complete the list of
> people who held that position? How about people who are parents between a
> famous grandfather and a famous grandchild ...

In such a case, they would either have a statement stating the award/position,
or have incoming links, because they are used in statements on other items, e.g. as
the father or child.

If they have no statements, and are not used in statements, I do not see how
they could be structurally significant.


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Orphaned items

2015-05-31 Thread Yaroslav M. Blanter

On 2015-05-30 20:42, Stas Malyshev wrote:
> Hi!
>
>> The problems are often items which never had any links. Many of them are
>> spam, but some of them can be used for structural needs and can be kept.
>
> If they are structural (like wikidata-only classes, etc.) shouldn't they
> have some incoming links? After all, structure is supposed to be used
> for something...

They should, but very often they do not. I understand that most of these
items will never be found when they are actually needed; on the other
hand, I am hesitant to go for blanket deletion, since users in good
standing invested some time in creating these items.


Cheers
Yaroslav



Re: [Wikidata] Orphaned items

2015-05-30 Thread Gerard Meijssen
Hoi,
How about people who have received an award and complete the list of people
who were awarded? How about people who had a position and complete the
list of people who held that position? How about people who are parents
between a famous grandfather and a famous grandchild ...

There are so many possibilities. The thing is, Wikipedia is not really the
yardstick of what is relevant in Wikidata.
Thanks,
 GerardM

On 30 May 2015 at 20:42, Stas Malyshev  wrote:

> Hi!
>
> > The problems are often items which never had any links. Many of them are
> > spam, but some of them can be used for structural needs and can be kept.
>
> If they are structural (like wikidata-only classes, etc.) shouldn't they
> have some incoming links? After all, structure is supposed to be used
> for something...
>
> --
> Stas Malyshev
> smalys...@wikimedia.org


Re: [Wikidata] Orphaned items

2015-05-30 Thread Stas Malyshev
Hi!

> The problems are often items which never had any links. Many of them are
> spam, but some of them can be used for structural needs and can be kept.

If they are structural (like wikidata-only classes, etc.) shouldn't they
have some incoming links? After all, structure is supposed to be used
for something...

-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] Orphaned items

2015-05-30 Thread Jane Darnell
Yaroslav, thanks for posting; I had no idea. Thanks for your work on this
too.

On Sat, May 30, 2015 at 10:05 AM, Yaroslav M. Blanter 
wrote:

> On 2015-05-29 17:42, Markus Krötzsch wrote:
>
>> Hi Jane, hi Romaine,
>>
>> I think we agree that valuable information should be kept if at all
>> possible. My chief concern is that orphaned items do not have a clear
>> identity. It's not useful to know that "something" is at a certain
>> location. The first thing we must determine is what this "thing" is
>> that we are talking about. Links to Wikipedia are a good way of doing
>> this. Without them, we need to come up with other identity providing
>> sources. We certainly have the right infrastructure for this (with all
>> the identifier properties that point to other databases and authority
>> files).
>>
>> The first goal of anyone who wants to safe an orphan should be to
>> connect it with the outside world so as to give it some grounding to
>> build on.
>>
>> A weaker way to provide basic grounding is to make internal
>> connections. There are cases where this is strong (one can identify
>> items as "the author of War & Peace" or "the mother of Marie
>> Skłodowska-Curie"), but there are other cases where it is too weak
>> ("the town in Germany" or "the part of Europe" do not identify
>> anything). One would need to give this more thought if one wanted to
>> determine automatically if an item receives its identity from the
>> incoming/outgoing links to other items.
>>
>> Cheers,
>>
>> Markus
>>
>>
>>
> Actually, we already have tools designed by Pasleim to track such items:
>
> https://www.wikidata.org/wiki/User:Pasleim/notability
>
> https://www.wikidata.org/wiki/User:Pasleim/Items_for_deletion/Almost_empty
>
> I usually check that there are no backlinks, provided there are none check
> the history, and if it turns out the item is empty because of a
> non-automated merge I merge it, and if it is empty because the only
> interwiki link was deleted on the project I delete it as non-notable.
>
> The problems are often items which never had any links. Many of them are
> spam, but some of them can be used for structural needs and can be kept. It
> is not always easy to figure out in practice, especially if they are in
> non-Latin and non-Cyrillic alphabets.
>
> Cheers
> Yaroslav


Re: [Wikidata] Orphaned items

2015-05-30 Thread Yaroslav M. Blanter

On 2015-05-29 17:42, Markus Krötzsch wrote:
> Hi Jane, hi Romaine,
>
> I think we agree that valuable information should be kept if at all
> possible. My chief concern is that orphaned items do not have a clear
> identity. It's not useful to know that "something" is at a certain
> location. The first thing we must determine is what this "thing" is
> that we are talking about. Links to Wikipedia are a good way of doing
> this. Without them, we need to come up with other identity-providing
> sources. We certainly have the right infrastructure for this (with all
> the identifier properties that point to other databases and authority
> files).
>
> The first goal of anyone who wants to save an orphan should be to
> connect it with the outside world so as to give it some grounding to
> build on.
>
> A weaker way to provide basic grounding is to make internal
> connections. There are cases where this is strong (one can identify
> items as "the author of War & Peace" or "the mother of Marie
> Skłodowska-Curie"), but there are other cases where it is too weak
> ("the town in Germany" or "the part of Europe" do not identify
> anything). One would need to give this more thought if one wanted to
> determine automatically if an item receives its identity from the
> incoming/outgoing links to other items.
>
> Cheers,
>
> Markus
>

Actually, we already have tools designed by Pasleim to track such items:

https://www.wikidata.org/wiki/User:Pasleim/notability

https://www.wikidata.org/wiki/User:Pasleim/Items_for_deletion/Almost_empty

I usually check that there are no backlinks. Provided there are none, I
check the history: if it turns out the item is empty because of a
non-automated merge, I merge it, and if it is empty because the only
interwiki link was deleted on the project, I delete it as non-notable.
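
The checks described above can be written down as a small decision function.
This is only a sketch in Python of the manual workflow: the three boolean
inputs and the returned action names are hypothetical, and in practice each
input comes from a human looking at backlinks and the item history.

```python
def triage_empty_item(has_backlinks, emptied_by_manual_merge, only_sitelink_deleted):
    """Suggest an action for an (almost) empty item.

    Returns one of "keep", "merge", "delete" or "review".
    """
    if has_backlinks:
        # Other items still use this one, so it is not an orphan.
        return "keep"
    if emptied_by_manual_merge:
        # The content was moved by hand; finish the merge properly.
        return "merge"
    if only_sitelink_deleted:
        # The only linked article was deleted on its wiki.
        return "delete"
    # Items that never had any links need a human decision.
    return "review"
```

The final "review" branch covers the items that never had any links, which
may be spam or may serve a structural purpose.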


The problems are often items which never had any links. Many of them are 
spam, but some of them can be used for structural needs and can be kept. 
It is not always easy to figure out in practice, especially if they are 
in non-Latin and non-Cyrillic alphabets.


Cheers
Yaroslav



Re: [Wikidata] Orphaned items

2015-05-29 Thread Markus Krötzsch

Hi Jane, hi Romaine,

I think we agree that valuable information should be kept if at all 
possible. My chief concern is that orphaned items do not have a clear 
identity. It's not useful to know that "something" is at a certain 
location. The first thing we must determine is what this "thing" is that 
we are talking about. Links to Wikipedia are a good way of doing this. 
Without them, we need to come up with other identity providing sources. 
We certainly have the right infrastructure for this (with all the 
identifier properties that point to other databases and authority files).


The first goal of anyone who wants to save an orphan should be to 
connect it with the outside world so as to give it some grounding to 
build on.


A weaker way to provide basic grounding is to make internal connections. 
There are cases where this is strong (one can identify items as "the 
author of War & Peace" or "the mother of Marie Skłodowska-Curie"), but 
there are other cases where it is too weak ("the town in Germany" or 
"the part of Europe" do not identify anything). One would need to give 
this more thought if one wanted to determine automatically if an item 
receives its identity from the incoming/outgoing links to other items.


Cheers,

Markus


On 29.05.2015 17:05, Romaine Wiki wrote:
> Hi Markus,
>
> Indeed yes, that is also an issue. It can happen with new articles and
> with older articles.
>
> Some articles get deleted as they are a duplicate of another article, or
> worse written (too bad to keep), or not an encyclopaedic subject to have
> in an encyclopaedia.
>
> Every day, on nl-wiki we check new articles if they are connected on
> Wikidata. Almost all articles that have a template that marks it as
> nominated for deletion we ignore and we do not add them to Wikidata. On
> nl-wiki we do this by hand, to make sure all basic statements are added,
> but if this is done by bots, you get a situation that they may not check
> for templates that mark articles for deletion.
>
> If a deleted item has statements, the question is if this information
> is in itself valuable to keep to be used and/or for the future.
>
> Romaine



Re: [Wikidata] Orphaned items

2015-05-29 Thread Romaine Wiki
Hi Markus,

Indeed yes, that is also an issue. It can happen with new articles and with
older articles.

Some articles get deleted because they duplicate another article, are
poorly written (too bad to keep), or do not cover an encyclopaedic subject.


Every day, on nl-wiki we check whether new articles are connected on
Wikidata. We ignore almost all articles that carry a template marking
them as nominated for deletion, and we do not add them to Wikidata. On
nl-wiki we do this by hand, to make sure all basic statements are added,
but if this is done by bots, they may not check for templates that mark
articles for deletion.
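
A bot could do a similar check before creating an item. The following is
only a minimal sketch: it assumes the bot already has the article's
wikitext as a string, and the template names in DELETION_TEMPLATES are
placeholders, not the actual names used on nl-wiki or any other wiki.

```python
import re

# Placeholder names; each wiki has its own deletion templates.
DELETION_TEMPLATES = {"delete", "nominated for deletion"}

# Captures the template name that follows "{{", up to "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def is_nominated_for_deletion(wikitext, template_names=DELETION_TEMPLATES):
    """Return True if the wikitext uses one of the given templates."""
    for match in TEMPLATE_NAME.finditer(wikitext):
        name = match.group(1).replace("_", " ").lower()
        if name in template_names:
            return True
    return False


print(is_nominated_for_deletion("{{Delete|reason=duplicate}} Some text."))
print(is_nominated_for_deletion("{{Infobox person|name=X}} Some text."))
```

A bot would skip item creation whenever the function returns True and
leave the article for a later run or for a human.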

If a deleted item has statements, the question is whether this information
is in itself valuable enough to keep, for use now and/or in the future.

Romaine



2015-05-29 15:20 GMT+02:00 Markus Krötzsch :

> On 29.05.2015 13:42, Romaine Wiki wrote:
>
>> The problem users face is that they find merging items too difficult
>> or do not know that it is possible. They understand (with much
>> annoyance) that they can only add a sitelink to one item. Therefore
>> they delete a sitelink on one item, and add it to another item.
>>
>> Personally I think that a merge afterwards would be recommended here.
>> Would it be possible to have a bot 1. determine what the original
>> sitelink was that has been removed from the item, 2. see if this
>> sitelink is added on another item, 3. check if the statements of both
>> items match (otherwise: a list for humans/tool to check if it is the
>> same), 4. if the same: automatically merge both items.
>>
>> I think it would be good to have more things being automated as much as
>> possible.
>>
>
> That's an important situation too, but I think in the example I gave
> something else happened: the sitelink was not moved, but the Wikipedia
> article that it was pointing to got deleted. So it's not just the link that
> vanished: all information about the item that might have been found on the
> deleted Wikipedia page is also gone. It's therefore quite hard to find out
> what the item might have been about.
>
> Regards,
>
> Markus
>
>
>  2015-05-29 13:23 GMT+02:00 Markus Krötzsch
>> mailto:mar...@semantic-mediawiki.org>>:
>>
>>
>> Hi all,
>>
>> I just noticed that we have a number of "orphaned items" which were
>> created and imported from some Wikipedia article that then got
>> deleted. The result is an item with almost no data, no sitelinks,
>> and all references claiming "imported from X Wikipedia".
>>
>> Example:
>> https://www.wikidata.org/wiki/Q9386774
>>
>> Here is what happened:
>> https://www.wikidata.org/w/index.php?title=Q9386774&action=history
>>
>> It would be good to have a process for dealing with such cases. I am
>> not saying that we must delete such items immediately, but it seems
>> obvious that they need some special attention to become
>> self-sustaining even without Wikipedia articles associated.
>>
>> Things that would be important to keep such items:
>> * Links to other external datasets that confirm the existence of the
>> thing.
>> * Links to authoritative web sites that confirm the existence of the
>> thing.
>> * Proper references for all data (we always want that, but here it's
>> even more critical: "imported from Wikipedia" is never great, but at
>> least it leaves some hope of finding proper references if the
>> Wikipedia page still exists).
>>
>> In cases like the above, deletion seems to be the most reasonable
>> solution (the little data that is there can easily be added again if
>> needed in the future). It seems that one could automatically collect
>> such candidates for deletion (pages that are not used as property
>> values, have no site links, have no identifier properties, have not
>> been edited for more than a month, and have fewer than, say, ten
>> properties+labels+descriptions).
>>
>> Regards,
>>
>> Markus


Re: [Wikidata] Orphaned items

2015-05-29 Thread Jane Darnell
I think this is a problem with the current workflow for creating articles,
which starts with Wikipedia, then finishes with Wikidata, though it should
probably be the other way around. This problem will eventually solve
itself, given enough time, although I believe we still have lots of poorly
documented images on Commons that are currently being cleaned up that date
back to the early days of Commons, when people uploaded images there
because "they had to" but didn't spend any time on the metadata there and
stuffed it all into the Wikipedia articles they linked the image to. Since
then, lots of that metadata has found its way back to the images on
Commons, where it should have been added in the first place. It may seem
like double work, but it is necessary due to lack of proper tools to
automate it. Right now there is a lot of double work needing to be done in
Wikidata as people create articles, and this can only be done by copying
most of the information in the leading paragraph to various statements on
Wikidata. This can be both annoying and confusing.

I think the idea of deletion with 0-2 statements is OK, but 10 statements?
With 10 statements there must be something salvageable, no?

On Fri, May 29, 2015 at 3:20 PM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> On 29.05.2015 13:42, Romaine Wiki wrote:
>
>> The problem users face is that they find merging items too difficult
>> or do not know that it is possible. They understand (with much
>> annoyance) that they can only add a sitelink to one item. Therefore
>> they delete a sitelink on one item, and add it to another item.
>>
>> Personally I think that a merge afterwards would be recommended here.
>> Would it be possible to have a bot 1. determine what the original
>> sitelink was that has been removed from the item, 2. see if this
>> sitelink is added on another item, 3. check if the statements of both
>> items match (otherwise: a list for humans/tool to check if it is the
>> same), 4. if the same: automatically merge both items.
>>
>> I think it would be good to have more things being automated as much as
>> possible.
>>
>
> That's an important situation too, but I think in the example I gave
> something else happened: the sitelink was not moved, but the Wikipedia
> article that it was pointing to got deleted. So it's not just the link that
> vanished: all information about the item that might have been found on the
> deleted Wikipedia page is also gone. It's therefore quite hard to find out
> what the item might have been about.
>
> Regards,
>
> Markus
>
>
>  2015-05-29 13:23 GMT+02:00 Markus Krötzsch
>> mailto:mar...@semantic-mediawiki.org>>:
>>
>> Hi all,
>>
>> I just noticed that we have a number of "orphaned items" which were
>> created and imported from some Wikipedia article that then got
>> deleted. The result is an item with almost no data, no sitelinks,
>> and all references claiming "imported from X Wikipedia".
>>
>> Example:
>> https://www.wikidata.org/wiki/Q9386774
>>
>> Here is what happened:
>> https://www.wikidata.org/w/index.php?title=Q9386774&action=history
>>
>> It would be good to have a process for dealing with such cases. I am
>> not saying that we must delete such items immediately, but it seems
>> obvious that they need some special attention to become
>> self-sustaining even without Wikipedia articles associated.
>>
>> Things that would be important to keep such items:
>> * Links to other external datasets that confirm the existence of the
>> thing.
>> * Links to authoritative web sites that confirm the existence of the
>> thing.
>> * Proper references for all data (we always want that, but here it's
>> even more critical: "imported from Wikipedia" is never great, but at
>> least it leaves some hope of finding proper references if the
>> Wikipedia page still exists).
>>
>> In cases like the above, deletion seems to be the most reasonable
>> solution (the little data that is there can easily be added again if
>> needed in the future). It seems that one could automatically collect
>> such candidates for deletion (pages that are not used as property
>> values, have no site links, have no identifier properties, have not
>> been edited for more than a month, and have fewer than, say, ten
>> properties+labels+descriptions).
>>
>> Regards,
>>
>> Markus

Re: [Wikidata] Orphaned items

2015-05-29 Thread Markus Krötzsch

On 29.05.2015 13:42, Romaine Wiki wrote:

The problem users face is that they find merging items too difficult
or do not know that it is possible. They understand (with much
annoyance) that they can only add a sitelink to one item. Therefore
they delete a sitelink on one item, and add it to another item.

Personally I think that a merge afterwards would be recommended here.
Would it be possible to have a bot 1. determine what the original
sitelink was that has been removed from the item, 2. see if this
sitelink is added on another item, 3. check if the statements of both
items match (otherwise: a list for humans/tool to check if it is the
same), 4. if the same: automatically merge both items.

I think it would be good to have more things being automated as much as
possible.


That's an important situation too, but I think in the example I gave 
something else happened: the sitelink was not moved, but the Wikipedia 
article that it was pointing to got deleted. So it's not just the link 
that vanished: all information about the item that might have been found 
on the deleted Wikipedia page is also gone. It's therefore quite hard to 
find out what the item might have been about.


Regards,

Markus



2015-05-29 13:23 GMT+02:00 Markus Krötzsch
mailto:mar...@semantic-mediawiki.org>>:

Hi all,

I just noticed that we have a number of "orphaned items" which were
created and imported from some Wikipedia article that then got
deleted. The result is an item with almost no data, no sitelinks,
and all references claiming "imported from X Wikipedia".

Example:
https://www.wikidata.org/wiki/Q9386774

Here is what happened:
https://www.wikidata.org/w/index.php?title=Q9386774&action=history

It would be good to have a process for dealing with such cases. I am
not saying that we must delete such items immediately, but it seems
obvious that they need some special attention to become
self-sustaining even without Wikipedia articles associated.

Things that would be important to keep such items:
* Links to other external datasets that confirm the existence of the
thing.
* Links to authoritative web sites that confirm the existence of the
thing.
* Proper references for all data (we always want that, but here it's
even more critical: "imported from Wikipedia" is never great, but at
least it leaves some hope of finding proper references if the
Wikipedia page still exists).

In cases like the above, deletion seems to be the most reasonable
solution (the little data that is there can easily be added again if
needed in the future). It seems that one could automatically collect
such candidates for deletion (pages that are not used as property
values, have no site links, have no identifier properties, have not
been edited for more than a month, and have fewer than, say, ten
properties+labels+descriptions).
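
The candidate criteria listed above could be sketched roughly as
follows. This is an illustration under assumptions, not an existing
tool: items are plain dicts loosely modelled on the Wikidata JSON layout
(id, sitelinks, claims, labels, descriptions, modified), and the set of
items used as property values and the set of identifier properties are
assumed to be computed elsewhere and passed in. The item and the
property choices in the example are made up.

```python
from datetime import datetime, timedelta


def is_deletion_candidate(item, used_as_value, identifier_properties, now,
                          min_age=timedelta(days=30), max_size=10):
    """Apply the criteria from the thread to one item dict."""
    if item["id"] in used_as_value:          # used as a property value
        return False
    if item["sitelinks"]:                    # still has site links
        return False
    if any(p in identifier_properties for p in item["claims"]):
        return False                         # has an identifier property
    if now - item["modified"] <= min_age:    # edited too recently
        return False
    size = (len(item["claims"]) + len(item["labels"])
            + len(item["descriptions"]))
    return size < max_size                   # properties+labels+descriptions


# Made-up example item; only the field layout matters here.
ORPHAN = {
    "id": "Q_example",
    "sitelinks": {},
    "claims": {"P31": ["Q_castle"]},
    "labels": {"pl": "example label"},
    "descriptions": {},
    "modified": datetime(2015, 1, 1),
}
NOW = datetime(2015, 5, 29)

print(is_deletion_candidate(ORPHAN, set(), {"P214"}, NOW))
```

The age and size thresholds are parameters because they are the part of
the criteria most likely to be debated; the result is meant as a list
for human review, not as input for automatic deletion.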

Regards,

Markus



Re: [Wikidata] Orphaned items

2015-05-29 Thread Magnus Manske
Not quite sure...

On Fri, May 29, 2015 at 1:33 PM Andy Mabbett 
wrote:

> On 29 May 2015 at 12:39, Magnus Manske 
> wrote:
> > (in this case, it appears to be the "castle of Żagań", once located in
> > https://en.wikipedia.org/wiki/%C5%BBaga%C5%84 )
>
> This:
>
> http://www.poland.travel/en/gallery/palace-in-zagan ?
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk


Re: [Wikidata] Orphaned items

2015-05-29 Thread Andy Mabbett
On 29 May 2015 at 12:39, Magnus Manske  wrote:
> (in this case, it appears to be the "castle of Żagań", once located in
> https://en.wikipedia.org/wiki/%C5%BBaga%C5%84 )

This:

http://www.poland.travel/en/gallery/palace-in-zagan ?

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Orphaned items

2015-05-29 Thread Romaine Wiki
The problem users face is that they find merging items too difficult or do
not know that it is possible. They understand (with much annoyance) that
they can only add a sitelink to one item. Therefore they delete a sitelink
on one item, and add it to another item.

Personally I think that a merge afterwards would be recommended here.
Would it be possible to have a bot 1. determine what the original sitelink
was that has been removed from the item, 2. see if this sitelink is added
on another item, 3. check if the statements of both items match (otherwise:
a list for humans/tool to check if it is the same), 4. if the same:
automatically merge both items.
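
The four steps above could be sketched like this. It is a minimal
illustration, not a working bot: step 1 (finding the removed sitelink in
the edit history) is assumed to have happened already and its result is
passed in as a (site, title) pair, items are simplified dicts, and
"statements match" is interpreted here as "no shared property with
entirely different values", which is my assumption. All item ids and
titles are made up.

```python
def find_new_home(removed_sitelink, source_item, items):
    """Step 2: find another item that now holds the removed sitelink."""
    site, title = removed_sitelink
    for other in items:
        if other["id"] == source_item["id"]:
            continue
        if other["sitelinks"].get(site) == title:
            return other
    return None


def statements_compatible(a, b):
    """Step 3: no shared property may have entirely different values."""
    for prop in a["claims"].keys() & b["claims"].keys():
        if not set(a["claims"][prop]) & set(b["claims"][prop]):
            return False
    return True


def triage(removed_sitelink, source_item, items):
    """Step 4: decide between merging, human review, or doing nothing."""
    target = find_new_home(removed_sitelink, source_item, items)
    if target is None:
        return ("no_target", None)
    if statements_compatible(source_item, target):
        return ("merge", target["id"])
    return ("human_review", target["id"])


# Made-up items for illustration.
SOURCE = {"id": "Q_a", "sitelinks": {}, "claims": {"P31": ["Q_castle"]}}
MATCH = {"id": "Q_b", "sitelinks": {"plwiki": "Some title"},
         "claims": {"P31": ["Q_castle"], "P17": ["Q_poland"]}}
CONFLICT = {"id": "Q_c", "sitelinks": {"dewiki": "Other title"},
            "claims": {"P31": ["Q_human"]}}
ITEMS = [SOURCE, MATCH, CONFLICT]

print(triage(("plwiki", "Some title"), SOURCE, ITEMS))
print(triage(("dewiki", "Other title"), SOURCE, ITEMS))
```

The function only returns a decision; the actual merge, and the list for
humans to check, would be handled by whatever bot framework is used.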

I think it would be good to have more things being automated as much as
possible.

Romaine

2015-05-29 13:23 GMT+02:00 Markus Krötzsch :

> Hi all,
>
> I just noticed that we have a number of "orphaned items" which were
> created and imported from some Wikipedia article that then got deleted. The
> result is an item with almost no data, no sitelinks, and all references
> claiming "imported from X Wikipedia".
>
> Example:
> https://www.wikidata.org/wiki/Q9386774
>
> Here is what happened:
> https://www.wikidata.org/w/index.php?title=Q9386774&action=history
>
> It would be good to have a process for dealing with such cases. I am not
> saying that we must delete such items immediately, but it seems obvious
> that they need some special attention to become self-sustaining even
> without Wikipedia articles associated.
>
> Things that would be important to keep such items:
> * Links to other external datasets that confirm the existence of the thing.
> * Links to authoritative web sites that confirm the existence of the thing.
> * Proper references for all data (we always want that, but here it's even
> more critical: "imported from Wikipedia" is never great, but at least it
> leaves some hope of finding proper references if the Wikipedia page still
> exists).
>
> In cases like the above, deletion seems to be the most reasonable solution
> (the little data that is there can easily be added again if needed in the
> future). It seems that one could automatically collect such candidates for
> deletion (pages that are not used as property values, have no site links,
> have no identifier properties, have not been edited for more than a month,
> and have fewer than, say, ten properties+labels+descriptions).
>
> Regards,
>
> Markus


Re: [Wikidata] Orphaned items

2015-05-29 Thread Magnus Manske
(in this case, it appears to be the "castle of Żagań", once located in
https://en.wikipedia.org/wiki/%C5%BBaga%C5%84 )

On Fri, May 29, 2015 at 12:24 PM Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Hi all,
>
> I just noticed that we have a number of "orphaned items" which were
> created and imported from some Wikipedia article that then got deleted.
> The result is an item with almost no data, no sitelinks, and all
> references claiming "imported from X Wikipedia".
>
> Example:
> https://www.wikidata.org/wiki/Q9386774
>
> Here is what happened:
> https://www.wikidata.org/w/index.php?title=Q9386774&action=history
>
> It would be good to have a process for dealing with such cases. I am not
> saying that we must delete such items immediately, but it seems obvious
> that they need some special attention to become self-sustaining even
> without Wikipedia articles associated.
>
> Things that would be important to keep such items:
> * Links to other external datasets that confirm the existence of the thing.
> * Links to authoritative web sites that confirm the existence of the thing.
> * Proper references for all data (we always want that, but here it's
> even more critical: "imported from Wikipedia" is never great, but at
> least it leaves some hope of finding proper references if the Wikipedia
> page still exists).
>
> In cases like the above, deletion seems to be the most reasonable
> solution (the little data that is there can easily be added again if
> needed in the future). It seems that one could automatically collect
> such candidates for deletion (pages that are not used as property
> values, have no site links, have no identifier properties, have not
> been edited for more than a month, and have fewer than, say, ten
> properties+labels+descriptions).
>
> Regards,
>
> Markus