Re: Can you help me to merge your avatars?

2010-11-24 Thread Johannes Schmid
Hi!

 Do you have a correspondence table that would let me say that this
 person and that one are actually the same physical person? I'm
 focused on the committers, mailers and bug reporters.

I am pretty sure gnome.org has this in the LDAP, but I wouldn't like
gnome.org to give this information out because I consider it
confidential and private.

Actually, nearly all commits to git.gnome.org are made with the full
name/email, which means it is no problem at all to link nickname and
name here for people who give their correct name. Anyway, gnome.org
doesn't check in any way that people give their real name, so it is
perfectly possible that people don't use their real name for
registration and nobody will ever find out.
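
The name/email linking described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `identities` sample, the `normalize_name` helper and the merge rule (same lowercased email, or same normalized name) are hypothetical, and such a rule will produce both false merges and missed merges.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase a name and collapse whitespace (hypothetical helper)."""
    return " ".join(name.lower().split())


def merge_identities(identities):
    """Group (name, email) pairs that share an email or a normalized name."""
    parent = list(range(len(identities)))

    def find(i):
        # Follow parent links to the group representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    first_seen = {}
    for idx, (name, email) in enumerate(identities):
        for key in (("email", email.lower()), ("name", normalize_name(name))):
            if not key[1]:
                continue  # never merge on an empty name or email
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx, identity in enumerate(identities):
        groups[find(idx)].append(identity)
    return list(groups.values())


# Made-up sample data, not taken from any real repository.
identities = [
    ("Jane Doe", "jane@example.org"),
    ("jane doe", "jdoe@example.com"),
    ("jdoe", "jdoe@example.com"),
    ("John Roe", "john@example.net"),
]
groups = merge_identities(identities)
```

With this sample, the three "Jane" entries end up in one group because they are chained through a shared name and a shared email, which is exactly where such a heuristic can go wrong when two different people share a name.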

Regards,
Johannes

___
desktop-devel-list mailing list
desktop-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/desktop-devel-list


Re: Can you help me to merge your avatars?

2010-11-24 Thread Mathieu Goeminne
Hello,

Thanks for your very valuable and rapid feedback. I talked about it to my
PhD supervisor, Professor Tom Mens (head of the software engineering
group of the university, in CC of this mail), and he proposed that we sign a
non-disclosure agreement in which we promise not to make available the
information about the physical persons involved (or the identities and
logins they use).
For our research purposes, the results we produce will not contain any
personal information. Essentially, our results will be primarily numerical
and visual, and will be analysed statistically. We will make sure to
respect any privacy policy that is imposed.

Concerning your second remark about git.gnome.org, we already use the
guidelines you suggest, but they are not sufficient for our analysis
purposes, since we still find quite a number of false positives and false
negatives during our data analysis, and moreover this data does not contain
information about identities used by the mailers and bug trackers.

Our goal is to have a really unified view on the different data sources used
during OSS development (committers, bug trackers, mailers), which is why the
information contained in your LDAP will be very useful to us. Of course, we
do not need *all* the information stored in the LDAP, only the information
that will allow us to link identities to real persons. (Things like
passwords and so on are entirely irrelevant for us, of course.)

We therefore hope that we still will be able to obtain the information
stored in LDAP (or a reduced version of it, or a similar data source), as it
will really help us in understanding how open source software ecosystems
evolve.

If you wish, feel free to discuss it directly with my supervisor. In
fact, he is sitting next to me while I am writing this mail ;-)

Mathieu Goeminne

Re: Can you help me to merge your avatars?

2010-11-24 Thread Johannes Schmid
Hi!

 Concerning your second remark about git.gnome.org, we already use the
 guidelines you suggest, but they are not sufficient for our analysis
 purposes, since we still find quite a number of false positives and
 false negatives during our data analysis, and moreover this data does
 not contain information about identities used by the mailers and bug
 trackers.

You will have to ask the sysadmins about it. Anyway, I am pretty sure
that git.gnome.org, blogs.gnome.org and bugzilla.gnome.org use different
databases and as such don't connect people so you won't get the
information you are looking for. irc.gimp.org doesn't use any
registration at all.

The LDAP is used for git, mail and shell accounts only AFAIK.

Regards,
Johannes



Re: Can you help me to merge your avatars?

2010-11-24 Thread Milan Bouchet-Valat
On Tuesday, 23 November 2010 at 15:00 +0100, Mathieu Goeminne wrote:
 Hello there, 
 
 First of all, thanks for your work!
 
 I am writing to you because I asked a question on the Brasero IRC
 channel and was told to ask you.
 
 I'm Mathieu Goeminne, a PhD student (at UMONS, Belgium), and I'm
 studying free software ecosystems. Part of my job consists in
 merging the identities of all persons involved in free software
 (including Brasero and Evince). I designed some new algorithms, and
 implemented others, to try to detect the real physical persons behind
 a mail address, an svn login, a pseudonym, etc. In order to validate
 my algorithms, I need a merge reference, i.e., a list of
 correspondences between all the people involved. I first built it
 manually, but there are still errors in my reference.
 
 Do you have a correspondence table that would let me say that this
 person and that one are actually the same physical person? I'm
 focused on the committers, mailers and bug reporters.
I think that, as Johannes said, there is no such list. You have a git
account, a Bugzilla account and a mail account, but you're not forced
at all to use this e-mail address for the other two accounts.

 If not, do you have an idea of a way to build this kind of table?
I'm afraid you cannot, at least if you mean to get exact information.
People who wanted this information did just what you did, using a few
tricks to approximate reality, but we can't be sure they succeeded.
See for example the GNOME Census, which required some thought about
this problem:
http://blogs.gnome.org/bolsh/2010/07/28/gnome-census/

The only solution I can see is to validate a part of your data by hand,
and then get a measure of what your algorithm got wrong. Then,
statistical tests will give you confidence intervals that you'll be able
to apply to the broader survey. But I'm sure you thought about this
already. ;-)
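
One way to turn a hand-validated sample into an interval, sketched here with the Wilson score interval (my choice for illustration, not something prescribed above). The function name and the sample numbers are hypothetical, and the sketch assumes the hand-checked identities are a random sample of the full data set.

```python
import math


def wilson_interval(errors, sample_size, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = errors / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half = (
        z
        * math.sqrt(p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2))
        / denom
    )
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical numbers: 12 wrong merges found in 200 hand-checked identities.
low, high = wilson_interval(12, 200)
print(f"estimated error rate between {low:.1%} and {high:.1%}")
```

The interval narrows as the hand-checked sample grows, so the size of the manual validation effort directly controls how precise the error estimate for the whole survey can be.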


Regards



Re: Can you help me to merge your avatars?

2010-11-24 Thread Germán Póo-Caamaño
On Wed, 2010-11-24 at 10:18 +0100, Mathieu Goeminne wrote:
 Hello,
 
 Thanks for your very valuable and rapid feedback. I talked about it to my
 PhD supervisor, Professor Tom Mens (head of the software engineering
 group of the university, in CC of this mail), and he proposed that we sign a
 non-disclosure agreement in which we promise not to make available the
 information about the physical persons involved (or the identities and
 logins they use).
 For our research purposes, the results we produce will not contain any
 personal information. Essentially, our results will be primarily numerical
 and visual, and will be analysed statistically. We will make sure to
 respect any privacy policy that is imposed.
 
 Concerning your second remark about git.gnome.org, we already use the
 guidelines you suggest, but they are not sufficient for our analysis
 purposes, since we still find quite a number of false positives and false
 negatives during our data analysis, and moreover this data does not contain
 information about identities used by the mailers and bug trackers.

What do you mean by false positives and false negatives? (Not the
statistical definition, but in your samples.)

You can always apply Pareto here: 80% of the code is written by 20% of
all contributors. And for those contributors, it is not hard to fix
the identities by hand (it is harder when you are not familiar with the
contributors in the project, but not that hard anyway).
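
A quick way to check how closely a project follows that 80/20 split, as a rough sketch. It counts commits rather than lines of code, which is only a proxy, and the function name and sample data are made up for illustration.

```python
from collections import Counter


def share_of_top_contributors(authors, top_fraction=0.2):
    """Return the fraction of commits made by the most active contributors."""
    counts = Counter(authors)
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    # Always include at least one contributor in the "top" group.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Made-up commit authors, one entry per commit.
authors = ["a"] * 40 + ["b"] * 5 + ["c"] * 3 + ["d"] + ["e"]
share = share_of_top_contributors(authors)
```

If the share is high, cleaning up the identities of that small group by hand already covers most of the activity.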

You will face bigger problems when mining GNOME git repositories, and
you might double- or triple-count contributions in particular repositories
(especially from the Subversion era). You will find some huge commits
(thousands of lines of code) with no new code at all, or the same history
repeated across several repositories with different hashes, and so on.
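
A minimal sketch of one way to avoid counting the same history twice, assuming commits have already been exported into dictionaries. The record layout and the fingerprint (author email, author time, message) are assumptions for illustration; repository conversions can alter any of these fields, so this is a heuristic, not an exact test.

```python
def deduplicate_commits(commits):
    """Keep one commit per (author email, author time, message) fingerprint."""
    seen = set()
    unique = []
    for commit in commits:
        fingerprint = (
            commit["email"].lower(),
            commit["author_time"],
            commit["message"].strip(),
        )
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(commit)
    return unique


# Made-up records; the dict layout is an assumption, not a git format.
commits = [
    {"repo": "libfoo", "hash": "a1", "email": "jane@example.org",
     "author_time": 1290000000, "message": "Fix crash\n"},
    {"repo": "foo-split", "hash": "b2", "email": "Jane@example.org",
     "author_time": 1290000000, "message": "Fix crash"},
    {"repo": "libfoo", "hash": "c3", "email": "jane@example.org",
     "author_time": 1290000500, "message": "Update docs"},
]
unique = deduplicate_commits(commits)
```

The hash is deliberately left out of the fingerprint, since the same change carries different hashes in different repositories.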

 Our goal is to have a really unified view on the different data sources used
 during OSS development (committers, bug trackers, mailers), which is why the
 information contained in your LDAP will be very useful to us. Of course, we
 do not need *all* the information stored in the LDAP, only the information
 that will allow us to link identities to real persons. (Things like
 passwords and so on are entirely irrelevant for us, of course.)

Peter Rigby has worked on unifying committers and mailing lists and, as
far as I know, he used some techniques proposed by Chris Bird (I do not
have the reference at hand, but you will find them).

Regards,

-- 
Germán Póo-Caamaño
http://www.gnome.org/~gpoo/



Re: Can you help me to merge your avatars?

2010-11-24 Thread Olav Vitters
On Wed, Nov 24, 2010 at 09:58:23AM +0100, Johannes Schmid wrote:
  Do you have a correspondence table that would let me say that this
  person and that one are actually the same physical person? I'm
  focused on the committers, mailers and bug reporters.
 
 I am pretty sure gnome.org has this in the LDAP, but I wouldn't like
 gnome.org to give this information out because I consider it
 confidential and private.

It is not in LDAP.

LDAP contains userids, names and email addresses for SSH accounts. We do
not store people's configured Git identity settings, their Bugzilla
IDs, or the email addresses they use on mailing lists.

The LDAP email addresses might:
 * reflect what a user configured in Git, but it doesn't have to be the
   case
 * be the same as their Bugzilla email address, but might not (e.g. for me)
 * be the same as what is used on a mailing list, but not required

But in every case: could be the same, could be different.

Also, we update LDAP details when requested. History is not kept.

-- 
Regards,
Olav


Re: Can you help me to merge your avatars?

2010-11-23 Thread Tomeu Vizoso
On Tue, Nov 23, 2010 at 15:00, Mathieu Goeminne mgoemi...@gmail.com wrote:
 Hello there,

 First of all, thanks for your work!

 I am writing to you because I asked a question on the Brasero IRC channel
 and was told to ask you.

 I'm Mathieu Goeminne, a PhD student (at UMONS, Belgium), and I'm studying
 free software ecosystems. Part of my job consists in merging the
 identities of all persons involved in free software (including Brasero and
 Evince). I designed some new algorithms, and implemented others, to try to
 detect the real physical persons behind a mail address, an svn login, a
 pseudonym, etc. In order to validate my algorithms, I need a merge
 reference, i.e., a list of correspondences between all the people
 involved. I first built it manually, but there are still errors in my
 reference.

 Do you have a correspondence table that would let me say that this person
 and that one are actually the same physical person? I'm focused on the
 committers, mailers and bug reporters.

 If not, do you have an idea of a way to build this kind of table?

I think Ohloh has this information?

Regards,

Tomeu

 Thanks a lot.

 Mathieu Goeminne
