Re: [R] Gender balance in R

2014-11-25 Thread Maarten Blaauw

Thanks for the responses so far.

 The gender ratio in R should reflect the gender ratio of the potential
 users, as this is the pool the R users / developers are coming from.

I agree with this, but then again I don't think R really has 0% female 
users/developers as the R member list suggests. I'd rather expect to see 
10-50% women (my quick guess of gender balance in STEM areas, depending 
on where on the ladder and in which country one samples). Perhaps the R 
community should be assessing if there's some additional bias applied 
during the selection of supporting or ordinary members?


Cheers,

Maarten

On 25/11/14 09:15, Rainer M Krug wrote:

Sarah Goslee <sarah.gos...@gmail.com> writes:


I took a look at apparent gender among list participants a few years ago:
https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html

Same general thing: very few regular participants on the list were
women. I don't see any sign that that has changed in the last three
years. The bar to participation in the R-help list is much, much lower
than that to become a developer.

It would be interesting to look at the stats for CRAN packages as well.

The very low percentage of regular female participants is one of the
things that keeps me active on this list: to demonstrate that it's not
only men who use R and participate in the community.


Apart from that, your input is very valuable and your answers very
hands-on helpful - and this is why I am glad that you are on the list -
and not because you are female.

Looking at R developers / CRAN package developers / list posts gender ratios
might be interesting, but I don't think it tells you anything: if there is a
skewed ratio in any of these, the question is whether this is the gender
ratio in the user base and, more importantly, in the pool of potential
users.

I have no idea about the gender ratios in potential users, but I would
guess that some disciplines already have a skewed gender ratio, which is
then reflected in R.

The gender ratio in R should reflect the gender ratio of the potential
users, as this is the pool the R users / developers are coming from.

As long as nobody is excluded because of their gender, background, hair
or eye color, OS usage, or whatever ridiculous excuse one could find, I
think R will thrive.
Don't get me wrong - nothing against promoting R to new user groups.

But anyway - interesting question.

I taught True Basic for several years, and I definitely did not
see a gender bias in the students' programming abilities - the difference was
in many cases that males thought they could do it, and females thought
they could not do it because it involves maths... But I was able to
prove quite a few wrong.

Cheers,

Rainer



(If you decide to do the stats for 2014, be aware that I've been out
on medical leave for the past two months, so the numbers are even
lower than usual.)

Sarah

On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw
<maarten.bla...@qub.ac.uk> wrote:

Hi there,

I can't help but notice that the gender balance among R developers and
ordinary members is extremely skewed (as it is with open source software in
general).

Have a look at http://www.r-project.org/foundation/memberlist.html - at most
a handful of women are listed among the 'supporting members', and none at
all among the 29 'ordinary members'.

On the other hand I personally know many happy R users of both genders.

My questions are thus: Should R developers (and users) be worried that the
'other half' is excluded? If so, how could female R users/developers be
persuaded to become more visible (e.g. added as supporting or ordinary
members)?

Thanks,

Maarten





--
| Dr. Maarten Blaauw
| Lecturer in Chronology
|
| School of Geography, Archaeology & Palaeoecology
| Queen's University Belfast, UK
|
| www  http://www.chrono.qub.ac.uk/blaauw
| tel  +44 (0)28 9097 3895

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Gender balance in R

2014-11-25 Thread Maarten Blaauw

Nice graph, Scott, thanks!

Based on your code I plotted not the absolute numbers but the ratios, 
which show slowly increasing relative participation of female Rhelpers 
over time (red = women, blue=men, black=unknown). After a c. 5% female 
contribution in 1998, this has grown to about 15% now. At this rate 
we'll reach parity around AD 2080.
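
The parity estimate above is a straight-line extrapolation, which can be sketched as follows. This is only an illustration: `parity_year` is a hypothetical helper, the "now" is assumed to be 2014 (the posting year), and the shares are the rounded figures quoted in the text rather than the underlying data.

```r
# Hypothetical helper: year at which a linearly growing share reaches `target`.
# share0/share1 are proportions (0-1) observed in year0/year1.
parity_year <- function(year0, share0, year1, share1, target = 0.5) {
  rate <- (share1 - share0) / (year1 - year0)   # change in share per year
  year1 + (target - share1) / rate
}

# Rounded figures from the text: c. 5% in 1998, about 15% in 2014 (assumed).
est_rounded <- parity_year(1998, 0.05, 2014, 0.15)

# The result is sensitive to rounding: one percentage point less in 2014
# moves the estimate by several years.
est_lower <- parity_year(1998, 0.05, 2014, 0.14)
```

Because a one-point change in the recent share shifts the answer by years, any such date is best read as a rough figure rather than a forecast.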


My code:

if (!require(gender)) {
  library(devtools)
  install_github("scottkosty/gender")
  library(gender)
}
rHelp <- rHelpNames
rHelp[is.na(rHelp$gender), "gender"] <- "unknown"

yr <- unique(rHelp$year)

# per-year counters for each gender label
# (the posted snippet passed an undefined 'dates' here; yr is assumed)
helpers <- list(dates=yr, M=rep(0, length(yr)), F=rep(0, length(yr)),
  unkn=rep(0, length(yr)))

for(i in 1:nrow(rHelp))
 {
  j <- which(yr == rHelp$year[i])
  gender <- rHelp$gender[i]
  if(gender == "M")
   helpers$M[[j]] <- helpers$M[[j]]+1 else
    if(gender == "F")
     helpers$F[[j]] <- helpers$F[[j]]+1 else
      if(gender == "unknown")
       helpers$unkn[[j]] <- helpers$unkn[[j]]+1
 }

# proportions per year: blue = men, red = women, black = unknown
total <- helpers$M+helpers$F+helpers$unkn
plot(yr, helpers$M / total, type="l", col=4, ylim=c(0,1),
  ylab="proportions", yaxs="i")
lines(yr, helpers$F / total, col=2)
lines(yr, helpers$unkn / total)
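
The per-year counting done by the loop can also be expressed with `table()` and `prop.table()`. The following is a minimal sketch on a made-up data frame (`toy` is hypothetical and only mimics the `year` and `gender` columns used above; it is not the real R-help data).

```r
# Hypothetical stand-in for rHelpNames: one row per poster and year.
toy <- data.frame(
  year   = c(1998, 1998, 1998, 1998, 2014, 2014, 2014, 2014),
  gender = c("M", "M", "M", "F", "M", "F", NA, "M"),
  stringsAsFactors = FALSE
)

# Same recoding as in the posted code: missing guesses become "unknown".
toy$gender[is.na(toy$gender)] <- "unknown"

# Cross-tabulate: rows are years, columns are gender labels.
counts <- table(toy$year, toy$gender)

# Convert counts to within-year proportions (each row sums to 1).
props <- prop.table(counts, margin = 1)

yrs <- as.numeric(rownames(props))
```

Each column of `props` then holds the yearly proportion for one label, so `props[, "F"]` plays the role of `helpers$F / (helpers$M+helpers$F+helpers$unkn)`. Because the rows of a table are sorted, `yrs` comes out in increasing order, which a line plot needs.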

Cheers,

Maarten

On 25/11/14 12:11, Scott Kostyshak wrote:

On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:

I took a look at apparent gender among list participants a few years ago:
https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html

Same general thing: very few regular participants on the list were
women. I don't see any sign that that has changed in the last three
years. The bar to participation in the R-help list is much, much lower
than that to become a developer.


I plotted the gender of posters on r-help over time. The plot is here:
https://twitter.com/scottkosty/status/449933971644633088

The code to reproduce that plot is here:
https://github.com/scottkosty/genderAnalysis
The R file there will call devtools::install_github to install a
package from Github used for guessing the gender based on the first
name (https://github.com/scottkosty/gender).

Note also, on that tweet, that it was posted by Gabriela de Queiroz, who is
the founder of R-ladies, and that David Smith showed interest in
discussing the topic. So there is definitely demand for some data
analysis and discussion on the topic.


It would be interesting to look at the stats for CRAN packages as well.

The very low percentage of regular female participants is one of the
things that keeps me active on this list: to demonstrate that it's not
only men who use R and participate in the community.


Thank you for that!

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


(If you decide to do the stats for 2014, be aware that I've been out
on medical leave for the past two months, so the numbers are even
lower than usual.)

Sarah

On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw
<maarten.bla...@qub.ac.uk> wrote:

Hi there,

I can't help but notice that the gender balance among R developers and
ordinary members is extremely skewed (as it is with open source software in
general).

Have a look at http://www.r-project.org/foundation/memberlist.html - at most
a handful of women are listed among the 'supporting members', and none at
all among the 29 'ordinary members'.

On the other hand I personally know many happy R users of both genders.

My questions are thus: Should R developers (and users) be worried that the
'other half' is excluded? If so, how could female R users/developers be
persuaded to become more visible (e.g. added as supporting or ordinary
members)?

Thanks,

Maarten


--
Sarah Goslee
http://www.functionaldiversity.org



--
| Dr. Maarten Blaauw
| Lecturer in Chronology
|
| School of Geography, Archaeology & Palaeoecology
| Queen's University Belfast, UK
|
| www  http://www.chrono.qub.ac.uk/blaauw
| tel  +44 (0)28 9097 3895


gendeR.pdf
Description: Adobe PDF document


[R] Gender balance in R

2014-11-24 Thread Maarten Blaauw

Hi there,

I can't help but notice that the gender balance among R developers and 
ordinary members is extremely skewed (as it is with open source software 
in general).


Have a look at http://www.r-project.org/foundation/memberlist.html - at 
most a handful of women are listed among the 'supporting members', and 
none at all among the 29 'ordinary members'.


On the other hand I personally know many happy R users of both genders.

My questions are thus: Should R developers (and users) be worried that 
the 'other half' is excluded? If so, how could female R users/developers 
be persuaded to become more visible (e.g. added as supporting or 
ordinary members)?


Thanks,

Maarten

--
| Dr. Maarten Blaauw
| Lecturer in Chronology
|
| School of Geography, Archaeology & Palaeoecology
| Queen's University Belfast, UK
|
| www  http://www.chrono.qub.ac.uk/blaauw
| tel  +44 (0)28 9097 3895



[R] Error: C stack usage is too close to the limit

2008-01-26 Thread Maarten Blaauw
Lately R has been behaving strangely on my Linux (Ubuntu 7.10) machine,
with occasional segfaults. Today something different, and reproducible,
happened:

If I type the code below (meant for calibrating data), I get the error  
message that the C stack usage is too close to the limit.

calcurve <- cbind(1:2e4, 1:2e4, 1:2e3); #dummy curve, real one is more complex

caldist <- function(cage=Cage, error=Error, sdev=Sdev, times=Times, By=By)
  {
   theta <- seq(min(calcurve[,1]), max(calcurve[,1]), by=By);

   interpolate <- function(th, col)
    {
     if(th==calcurve[1,1]) {calcurve[1,col]}else
     if(th==calcurve[nrow(calcurve),1]) {calcurve[nrow(calcurve),col]}else
      {
       k <- min(which(calcurve[,1] > th));
       slope <- (calcurve[k-1,col]-calcurve[k,col])/(calcurve[k-1,1]-calcurve[k,1]);
       calcurve[k-1,col] + slope*(th-calcurve[k-1,1]);
      }
    }

   mu <- c();
   cerror <- c();
   for(i in 1:length(theta))
    {
     mu[i] <- interpolate(theta[i],2);
     cerror[i] <- interpolate(theta[i],3);
    }

   caldist <- dnorm(mu, cage, (error^2+cerror^2)^.5);
   cbind(theta, caldist/sum(caldist));
  }

caldist(1e3,1e2);

Unfortunately I am no huge computer wizard. Has anyone got any idea  
why this happens? Is it reproducible on other machines? How can I  
solve this problem?
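
One detail in the function signature above is worth isolating: the default `By=By` refers to itself. A default is only evaluated when the argument is not supplied, and evaluating this one asks for `By` again. The sketch below shows that behaviour in a current R version with two hypothetical toy functions (`step_bad`, `step_good`); whether this is what produced the C stack message in R 2.6.1 is not established here.

```r
# Self-referential default, as in the signature of caldist() above.
step_bad <- function(n, By = By) n * By

# Supplying By explicitly works, because the default is never evaluated.
ok <- step_bad(3, By = 2)

# Leaving By out forces the default, which refers back to itself;
# R signals an error, captured here as a message string.
msg <- tryCatch(step_bad(3), error = function(e) conditionMessage(e))

# A literal default avoids the self-reference.
step_good <- function(n, By = 1) n * By
fixed <- step_good(3)
```

Giving every argument either no default or a concrete value (for example `By=1`) sidesteps the issue.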

My R:
R version 2.6.1 (2007-11-26)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-17

Cstack_info()
      size    current  direction eval_depth
   8388608       2404          1          2

Many thanks,

Maarten Blaauw

-- 
Dr. Maarten Blaauw
School of Geography, Archaeology & Palaeoecology
Queen's University Belfast, U.K.
On leave from Department of Earth Sciences
Uppsala University, Sweden
[EMAIL PROTECTED]



Re: [R] Error: C stack usage is too close to the limit

2008-01-26 Thread Maarten Blaauw
Sorry, indeed I forgot to put some of the factors in the code. Here it  
is again, now updated:

calcurve <- cbind(1:2e4, 1:2e4, rep(100, length=2e4));

caldist <- function(cage, error, sdev=2, times=5, By=1)
  {
   calcurve <- calcurve[which((calcurve[,2]+calcurve[,3]) >= cage-(times*error)),];
   calcurve <- calcurve[which((calcurve[,2]-calcurve[,3]) <= cage+(times*error)),];
   theta <- seq(min(calcurve[,1]), max(calcurve[,1]), by=By);

   interpolate <- function(th, col)
    {
     if(th==calcurve[1,1]) {calcurve[1,col]}else
     if(th==calcurve[nrow(calcurve),1]) {calcurve[nrow(calcurve),col]}else
      {
       k <- min(which(calcurve[,1] > th));
       slope <- (calcurve[k-1,col]-calcurve[k,col])/(calcurve[k-1,1]-calcurve[k,1]);
       calcurve[k-1,col] + slope*(th-calcurve[k-1,1]);
      }
    }

   mu <- c();
   cerror <- c();
   for(i in 1:length(theta))
    {
     mu[i] <- interpolate(theta[i],2);
     cerror[i] <- interpolate(theta[i],3);
    }

   caldist <- dnorm(mu, cage, (error^2+cerror^2)^.5);
   cbind(theta, caldist/sum(caldist));
  }

caldist(2450,50);

Strangely enough, the C stack error message does not seem to occur every
time. It has also happened on the WinXP partition of the same Toshiba
laptop. So it is not as reproducible as I first hoped/feared.
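
The `interpolate` helper and the element-by-element loop in the code above perform linear interpolation along the curve, which base R's `approx()` also provides in vectorised form. The following is a sketch only, not a drop-in fix for the error: `caldist_approx` is a hypothetical name, and it assumes the same three-column layout of `calcurve` as in the post.

```r
# Same dummy curve as above: columns are theta, mu, and the curve error.
calcurve <- cbind(1:2e4, 1:2e4, rep(100, length = 2e4))

# Hypothetical variant of caldist() using approx() instead of a loop.
caldist_approx <- function(cage, error, times = 5, By = 1) {
  # Keep only the part of the curve within times*error of the measurement.
  keep <- (calcurve[, 2] + calcurve[, 3]) >= cage - times * error &
          (calcurve[, 2] - calcurve[, 3]) <= cage + times * error
  cc <- calcurve[keep, , drop = FALSE]

  theta <- seq(min(cc[, 1]), max(cc[, 1]), by = By)

  # Linear interpolation of both curve columns at all theta values at once.
  mu     <- approx(cc[, 1], cc[, 2], xout = theta)$y
  cerror <- approx(cc[, 1], cc[, 3], xout = theta)$y

  dens <- dnorm(mu, cage, sqrt(error^2 + cerror^2))
  cbind(theta, dens / sum(dens))
}

res <- caldist_approx(2450, 50)
```

This avoids growing `mu` and `cerror` one element at a time and keeps the function free of nested closures, which may make it easier to narrow down where the intermittent error comes from.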

-- 
Dr. Maarten Blaauw
School of Geography, Archaeology & Palaeoecology
Queen's University Belfast, U.K.
On leave from Department of Earth Sciences
Uppsala University, Sweden
[EMAIL PROTECTED]
