Re: [R] Gender balance in R
Thanks for the responses so far.

> The gender ratio in R should reflect the gender ratio of the potential users, as this is the pool the R users / developers are coming from.

I agree with this, but then again I don't think R really has 0% female users/developers, as the R member list suggests. I'd rather expect to see 10-50% women (my quick guess of the gender balance in STEM areas, depending on where on the ladder and in which country one samples). Perhaps the R community should assess whether some additional bias is applied during the selection of supporting or ordinary members?

Cheers,
Maarten

On 25/11/14 09:15, Rainer M Krug wrote:
> Sarah Goslee sarah.gos...@gmail.com writes:
>
>> I took a look at apparent gender among list participants a few years ago:
>> https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html
>>
>> Same general thing: very few regular participants on the list were women. I don't see any sign that that has changed in the last three years. The bar to participation in the R-help list is much, much lower than that to become a developer.
>>
>> It would be interesting to look at the stats for CRAN packages as well.
>>
>> The very low percentage of regular female participants is one of the things that keeps me active on this list: to demonstrate that it's not only men who use R and participate in the community.
>
> Apart from that, your input is very valuable and your answers very hands-on helpful - and this is why I am glad that you are on the list - and not because you are female.
>
> Looking at R developers / CRAN package developers / list posts gender ratios might be interesting, but I don't think it tells you anything: if there is a skewed ratio in any of these, the question is whether this is the gender ratio in the user base and, more importantly, in the pool of potential users. I have no idea about the gender ratios in potential users, but I would guess that some disciplines already have a skewed gender ratio, which is then reflected in R.
>
> The gender ratio in R should reflect the gender ratio of the potential users, as this is the pool the R users / developers are coming from.
>
> As long as nobody is excluded because of their gender, background, hair or eye color, OS usage, or whatever ridiculous excuse one could find, I think R will thrive.
>
> Don't get me wrong - nothing against promoting R to new user groups.
>
> But anyway - interesting question. I was teaching True Basic for several years, and I definitely did not see a gender bias in programming abilities - the difference was in many cases that males thought they could do it, and females thought they could not do it because it involves maths... But I was able to prove quite a few wrong.
>
> Cheers,
> Rainer
>
>> (If you decide to do the stats for 2014, be aware that I've been out on medical leave for the past two months, so the numbers are even lower than usual.)
>>
>> Sarah
>>
>> On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw maarten.bla...@qub.ac.uk wrote:
>>> Hi there,
>>>
>>> I can't help but notice that the gender balance among R developers and ordinary members is extremely skewed (as it is with open source software in general).
>>>
>>> Have a look at http://www.r-project.org/foundation/memberlist.html - at most a handful of women are listed among the 'supporting members', and none at all among the 29 'ordinary members'.
>>>
>>> On the other hand I personally know many happy R users of both genders.
>>>
>>> My questions are thus: Should R developers (and users) be worried that the 'other half' is excluded? If so, how could female R users/developers be persuaded to become more visible (e.g. added as supporting or ordinary members)?
>>>
>>> Thanks,
>>> Maarten

--
Dr. Maarten Blaauw
Lecturer in Chronology
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast, UK
www http://www.chrono.qub.ac.uk/blaauw
tel +44 (0)28 9097 3895

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gender balance in R
Nice graph, Scott, thanks!

Based on your code I plotted not the absolute numbers but the ratios, which show a slowly increasing relative participation of female R-helpers over time (red = women, blue = men, black = unknown). After a c. 5% female contribution in 1998, this has grown to about 15% now. At this rate we'll reach parity around AD 2080.

My code:

    if (!require(gender)) {
      library(devtools)
      install_github("scottkosty/gender")
      library(gender)
    }

    rHelp <- rHelpNames
    rHelp[is.na(rHelp$gender), "gender"] <- "unknown"
    yr <- unique(rHelp$year)
    helpers <- list("dates", M=rep(0, length(yr)), F=rep(0, length(yr)), unkn=rep(0, length(yr)))
    # count posters per year and apparent gender
    for(i in 1:nrow(rHelp)) {
      j <- which(yr == rHelp$year[i])
      gender <- rHelp$gender[i]
      if(gender == "M")
        helpers$M[[j]] <- helpers$M[[j]]+1 else
      if(gender == "F")
        helpers$F[[j]] <- helpers$F[[j]]+1 else
      if(gender == "unknown")
        helpers$unkn[[j]] <- helpers$unkn[[j]]+1
    }
    # proportions per year: blue = men, red = women, black = unknown
    plot(yr, helpers$M / (helpers$M+helpers$F+helpers$unkn), type="l", col=4, ylim=c(0,1), ylab="proportions", yaxs="i")
    lines(yr, helpers$F / (helpers$M+helpers$F+helpers$unkn), col=2)
    lines(yr, helpers$unkn / (helpers$M+helpers$F+helpers$unkn))

Cheers,
Maarten

On 25/11/14 12:11, Scott Kostyshak wrote:
> On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
>> I took a look at apparent gender among list participants a few years ago:
>> https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html
>>
>> Same general thing: very few regular participants on the list were women. I don't see any sign that that has changed in the last three years. The bar to participation in the R-help list is much, much lower than that to become a developer.
>
> I plotted the gender of posters on r-help over time. The plot is here:
> https://twitter.com/scottkosty/status/449933971644633088
>
> The code to reproduce that plot is here:
> https://github.com/scottkosty/genderAnalysis
>
> The R file there will call devtools::install_github to install a package from Github used for guessing the gender based on the first name (https://github.com/scottkosty/gender).
>
> Note also on that tweet that Gabriela de Queiroz posted it, who is the founder of R-ladies; and that David Smith showed interest in discussing the topic. So there is definitely demand for some data analysis and discussion on the topic.
>
>> It would be interesting to look at the stats for CRAN packages as well.
>>
>> The very low percentage of regular female participants is one of the things that keeps me active on this list: to demonstrate that it's not only men who use R and participate in the community.
>
> Thank you for that!
>
> Scott
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University
>
>> (If you decide to do the stats for 2014, be aware that I've been out on medical leave for the past two months, so the numbers are even lower than usual.)
>>
>> Sarah
>>
>> On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw maarten.bla...@qub.ac.uk wrote:
>>> Hi there,
>>>
>>> I can't help but notice that the gender balance among R developers and ordinary members is extremely skewed (as it is with open source software in general).
>>>
>>> Have a look at http://www.r-project.org/foundation/memberlist.html - at most a handful of women are listed among the 'supporting members', and none at all among the 29 'ordinary members'.
>>>
>>> On the other hand I personally know many happy R users of both genders.
>>>
>>> My questions are thus: Should R developers (and users) be worried that the 'other half' is excluded? If so, how could female R users/developers be persuaded to become more visible (e.g. added as supporting or ordinary members)?
>>>
>>> Thanks,
>>> Maarten
>>
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org

--
Dr. Maarten Blaauw
Lecturer in Chronology
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast, UK
www http://www.chrono.qub.ac.uk/blaauw
tel +44 (0)28 9097 3895

Attachment: gendeR.pdf (Adobe PDF document)
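The per-year proportions computed with the loop above could probably also be obtained with a cross-tabulation. The following is only a sketch under stated assumptions: `posts` is a small made-up data frame standing in for `rHelpNames` (its real contents are not shown in this thread), and only the `year` and `gender` columns used in the posted code are assumed to exist.

```r
# Hypothetical toy data standing in for rHelpNames; the values are invented.
posts <- data.frame(
  year   = c(1998, 1998, 1998, 1998, 2014, 2014, 2014, 2014),
  gender = c("M", "M", "M", NA, "M", "F", "M", NA),
  stringsAsFactors = FALSE
)

# Treat names whose gender could not be guessed as a separate category.
posts$gender[is.na(posts$gender)] <- "unknown"

# Count posters per year and gender, then convert each row (year) to proportions.
counts <- table(posts$year, posts$gender)
props  <- prop.table(counts, margin = 1)

print(props)
```

Each row of `props` sums to 1, so its columns could be passed to `lines()` or `matplot()` in place of the three explicit ratios in the loop-based version.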
[R] Gender balance in R
Hi there,

I can't help but notice that the gender balance among R developers and ordinary members is extremely skewed (as it is with open source software in general).

Have a look at http://www.r-project.org/foundation/memberlist.html - at most a handful of women are listed among the 'supporting members', and none at all among the 29 'ordinary members'.

On the other hand I personally know many happy R users of both genders.

My questions are thus: Should R developers (and users) be worried that the 'other half' is excluded? If so, how could female R users/developers be persuaded to become more visible (e.g. added as supporting or ordinary members)?

Thanks,
Maarten

--
Dr. Maarten Blaauw
Lecturer in Chronology
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast, UK
www http://www.chrono.qub.ac.uk/blaauw
tel +44 (0)28 9097 3895
[R] Error: C stack usage is too close to the limit
Lately R has been behaving strangely on my Linux (Ubuntu 7.10) machine, with occasional segfaults. Today something else happened, and it is reproducible: if I type the code below (meant for calibrating data), I get the error message that the C stack usage is too close to the limit.

    calcurve <- cbind(1:2e4, 1:2e4, 1:2e3); # dummy curve, real one is more complex
    caldist <- function(cage=Cage, error=Error, sdev=Sdev, times=Times, By=By)
     {
      theta <- seq(min(calcurve[,1]), max(calcurve[,1]), by=By);
      # linear interpolation of column 'col' of calcurve at calendar age 'th'
      interpolate <- function(th, col)
       {
        if(th==calcurve[1,1]) {calcurve[1,col]} else
         if(th==calcurve[nrow(calcurve),1]) {calcurve[nrow(calcurve),col]} else
          {
           k <- min(which(calcurve[,1] > th));
           slope <- (calcurve[k-1,col]-calcurve[k,col])/(calcurve[k-1,1]-calcurve[k,1]);
           calcurve[k-1,col] + slope*(th-calcurve[k-1,1]);
          }
       }
      mu <- c(); cerror <- c();
      for(i in 1:length(theta))
       {
        mu[i] <- interpolate(theta[i],2);
        cerror[i] <- interpolate(theta[i],3);
       }
      caldist <- dnorm(mu, cage, (error^2+cerror^2)^.5);
      cbind(theta, caldist/sum(caldist));
     }
    caldist(1e3,1e2);

Unfortunately I am no huge computer wizard. Has anyone got any idea why this happens? Is it reproducible on other machines? How can I solve this problem?

My R:

    R version 2.6.1 (2007-11-26)
    i486-pc-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] rcompgen_0.1-17

    > Cstack_info()
          size    current  direction eval_depth
       8388608       2404          1          2

Many thanks,
Maarten Blaauw

--
Dr. Maarten Blaauw
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast, U.K.

On leave from Department of Earth Sciences
Uppsala University, Sweden
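One detail of the posted function that may be worth checking is the default `By=By`, which refers to itself. In current versions of R, forcing such a default raises an error; whether this is what produced the C stack message on R 2.6.1 is not established in this thread, so the following is only a sketch. The functions `f` and `g` are hypothetical stand-ins, not part of the original code.

```r
# Hypothetical minimal function with the same self-referential default
# as 'By=By' in the posted caldist().
f <- function(By = By) seq(1, 10, by = By)

# Calling f() without By forces the default, which refers to itself;
# the resulting error is captured rather than left to stop the script.
res <- tryCatch(f(), error = function(e) e)
print(conditionMessage(res))

# A concrete default value avoids the self-reference.
g <- function(By = 1) seq(1, 10, by = By)
print(g())
```

Supplying every argument explicitly in the call, or giving each argument a literal default, sidesteps the issue.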
Re: [R] Error: C stack usage is too close to the limit
Sorry, indeed I forgot to put some of the factors in the code. Here it is again, now updated:

    calcurve <- cbind(1:2e4, 1:2e4, rep(100, length=2e4));
    caldist <- function(cage, error, sdev=2, times=5, By=1)
     {
      # keep only the part of the curve within times*error of the measurement
      calcurve <- calcurve[which((calcurve[,2]+calcurve[,3]) >= cage-(times*error)),];
      calcurve <- calcurve[which((calcurve[,2]-calcurve[,3]) <= cage+(times*error)),];
      theta <- seq(min(calcurve[,1]), max(calcurve[,1]), by=By);
      # linear interpolation of column 'col' of calcurve at calendar age 'th'
      interpolate <- function(th, col)
       {
        if(th==calcurve[1,1]) {calcurve[1,col]} else
         if(th==calcurve[nrow(calcurve),1]) {calcurve[nrow(calcurve),col]} else
          {
           k <- min(which(calcurve[,1] > th));
           slope <- (calcurve[k-1,col]-calcurve[k,col])/(calcurve[k-1,1]-calcurve[k,1]);
           calcurve[k-1,col] + slope*(th-calcurve[k-1,1]);
          }
       }
      mu <- c(); cerror <- c();
      for(i in 1:length(theta))
       {
        mu[i] <- interpolate(theta[i],2);
        cerror[i] <- interpolate(theta[i],3);
       }
      caldist <- dnorm(mu, cage, (error^2+cerror^2)^.5);
      cbind(theta, caldist/sum(caldist));
     }
    caldist(2450,50);

Strangely enough, the stack error message does not seem to happen every time. It has also happened on the WinXP partition of the same Toshiba laptop. So it is not as reproducible as I first hoped/feared.

--
Dr. Maarten Blaauw
School of Geography, Archaeology and Palaeoecology
Queen's University Belfast, U.K.

On leave from Department of Earth Sciences
Uppsala University, Sweden
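The element-by-element interpolation loop in the updated code performs linear interpolation, which `approx()` from the stats package can do for the whole `theta` vector at once. The following is a minimal sketch, not the author's method: the function name `caldist_approx` and the column name `prob` are made up for illustration, the unused `sdev` argument is dropped, and the dummy curve is copied from the post.

```r
# Dummy calibration curve as in the post: calendar age, measured age, curve error.
calcurve <- cbind(1:2e4, 1:2e4, rep(100, length = 2e4))

# Hypothetical vectorised variant of caldist().
caldist_approx <- function(cage, error, times = 5, By = 1, curve = calcurve) {
  # Keep only the part of the curve within times*error of the measurement.
  keep <- (curve[, 2] + curve[, 3]) >= cage - times * error &
          (curve[, 2] - curve[, 3]) <= cage + times * error
  curve <- curve[keep, , drop = FALSE]

  theta <- seq(min(curve[, 1]), max(curve[, 1]), by = By)

  # Linear interpolation of the curve mean and error at every theta.
  mu     <- approx(curve[, 1], curve[, 2], xout = theta)$y
  cerror <- approx(curve[, 1], curve[, 3], xout = theta)$y

  # Normal density of the measurement around the curve, normalised to sum to 1.
  dens <- dnorm(mu, cage, sqrt(error^2 + cerror^2))
  cbind(theta = theta, prob = dens / sum(dens))
}

res <- caldist_approx(2450, 50)
print(head(res))
```

Because no vectors are grown inside a loop and no nested function is called per element, this variant also keeps the evaluation depth shallow, although it has not been tested against the stack error described above.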