Re: [Tutor] Top posters to tutor list for 2008
I think I find it most interesting that the greatest percent is still under 15% and then it tapers rapidly. I'm curious what % of people posted 5 or fewer messages... perhaps it will become a personal project somewhere down the road ;)

-Wayne

On Fri, Jan 2, 2009 at 7:28 AM, Kent Johnson wrote:
> On Fri, Jan 2, 2009 at 8:13 AM, Sander Sweers wrote:
> > On Fri, Jan 2, 2009 at 13:52, Kent Johnson wrote:
> >> Or ask more questions, that works too!
> >
> > So you and Alan ask the most questions ;-)
>
> No, that honor goes to Dick Moores. He is in the top 10 in 4 of the
> last 5 years!
>
> > Thanks to all the Tutors for year of great support :-)
>
> You're welcome, we couldn't do it without you!
>
> Kent

--
To be considered stupid and to be told so is more painful than being called gluttonous, mendacious, violent, lascivious, lazy, cowardly: every weakness, every vice, has found its defenders, its rhetoric, its ennoblement and exaltation, but stupidity hasn't. - Primo Levi
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
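Wayne's question about what percentage of people posted 5 or fewer messages could probably be answered from the same name-to-count dictionary that Kent's script builds. A minimal Python 3 sketch, assuming such a `counts` mapping is available; the helper name `share_at_or_below` and the sample data are made up for illustration and are not taken from the archive:

```python
def share_at_or_below(counts, threshold=5):
    """Return the percentage of authors with at most `threshold` posts."""
    if not counts:
        return 0.0
    # Count the authors (not the posts) at or under the threshold.
    light = sum(1 for n in counts.values() if n <= threshold)
    return round(100.0 * light / len(counts), 1)


# Hypothetical sample data, not real list statistics.
sample = {'Author A': 120, 'Author B': 5, 'Author C': 2, 'Author D': 1}
print(share_at_or_below(sample))
```

Note that this measures the share of posters, not the share of posts; dividing by the total number of posts instead would answer a different question.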
Re: [Tutor] Top posters to tutor list for 2008
On Fri, Jan 2, 2009 at 8:13 AM, Sander Sweers wrote:
> On Fri, Jan 2, 2009 at 13:52, Kent Johnson wrote:
>> Or ask more questions, that works too!
>
> So you and Alan ask the most questions ;-)

No, that honor goes to Dick Moores. He is in the top 10 in 4 of the last 5 years!

> Thanks to all the Tutors for year of great support :-)

You're welcome, we couldn't do it without you!

Kent
Re: [Tutor] Top posters to tutor list for 2008
On Fri, Jan 2, 2009 at 13:52, Kent Johnson wrote:
> Or ask more questions, that works too!

So you and Alan ask the most questions ;-)

Seriously now, this really shows the power of Python and I'll have a good time figuring out exactly how this works.

Thanks to all the Tutors for a year of great support :-)

Greets
Sander
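For anyone else working out how the script folds names together, the consolidation step can be tried on its own without hitting the archives. This is a rough Python 3 sketch, not Kent's code: it builds a new dictionary instead of updating `counts` in place, so the less popular spelling disappears from the result. The function name `fold_case` and the sample numbers are invented for illustration.

```python
import operator


def fold_case(counts):
    """Merge names that differ only by case under the most-posted spelling."""
    name_map = {}  # lower-case name -> most popular spelling
    folded = {}
    # Visit names from most to fewest posts so the first spelling seen wins.
    for name, count in sorted(counts.items(),
                              key=operator.itemgetter(1), reverse=True):
        canonical = name_map.setdefault(name.lower(), name)
        folded[canonical] = folded.get(canonical, 0) + count
    return folded


# Hypothetical counts, not real list statistics.
sample = {'bob gailer': 10, 'Bob Gailer': 4, 'Dick Moores': 7}
print(fold_case(sample))
```

Sorting by count first is what makes "most popular spelling" work: the first time a lower-cased name is seen, it is necessarily the variant with the most posts.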
Re: [Tutor] Top posters to tutor list for 2008
On Fri, Jan 2, 2009 at 5:34 AM, Alan Gauld wrote:
> I think the figures reflect the general level of activity on the list.
> We seem to have peaked in 2005...
> Statistics, don't you love 'em :-)

I couldn't resist adding a total number of posts and percent to the calculations. Statistics + python = time sink :-)

I re-ran the program back to 2003. New program and results below. 2005 was a banner year. 2008 was down considerably from 2007 and that does account for our smaller numbers.

BTW your historical counts are up a bit in this set because this is the first year I had the name folding. Maybe I should add a set of known aliases also...

Kent

''' Counts all posts to Python-tutor by author'''
# -*- coding: latin-1 -*-

from datetime import date, timedelta
import operator, urllib2
from BeautifulSoup import BeautifulSoup

today = date.today()

for year in range(2003, 2009):
    startDate = date(year, 1, 1)
    endDate = date(year, 12, 31)
    thirtyOne = timedelta(days=31)
    counts = {}

    # Collect all the counts for a year by scraping the monthly author archive pages
    while startDate < endDate and startDate < today:
        dateString = startDate.strftime('%Y-%B')

        url = 'http://mail.python.org/pipermail/tutor/%s/author.html' % dateString
        data = urllib2.urlopen(url).read()
        soup = BeautifulSoup(data)

        li = soup.findAll('li')[2:-2]

        for l in li:
            name = l.i.string.strip()
            counts[name] = counts.get(name, 0) + 1

        startDate += thirtyOne

    totalPosts = sum(counts.itervalues())

    # Consolidate names that vary by case under the most popular spelling
    nameMap = dict() # Map lower-case name to most popular name
    for name, count in sorted(counts.iteritems(), key=operator.itemgetter(1), reverse=True):
        lower = name.lower()
        if lower in nameMap:
            # Add counts for a name we have seen already
            counts[nameMap[lower]] += count
        else:
            nameMap[lower] = name

    print
    print '%s (%s posts)' % (year, totalPosts)
    print ''
    for name, count in sorted(counts.iteritems(), key=operator.itemgetter(1), reverse=True)[:20]:
        pct = round(100.0*count/totalPosts, 1)
        print '%s %s (%s%%)' % (name.encode('utf-8', 'xmlcharrefreplace'), count, pct)
    print

# Results as of 12/31/2008:
'''
2003 (7745 posts)

Danny Yoo 617 (8.0%)
Alan Gauld 421 (5.4%)
Jeff Shannon 283 (3.7%)
Magnus Lycka 242 (3.1%)
Bob Gailer 195 (2.5%)
Magnus =?iso-8859-1?Q?Lyck=E5?= 166 (2.1%)
alan.ga...@bt.com 161 (2.1%)
Kirk Bailey 155 (2.0%)
Gregor Lingl 152 (2.0%)
Lloyd Kvam 142 (1.8%)
Andrei 118 (1.5%)
Sean 'Shaleh' Perry 117 (1.5%)
Magnus Lyckå 113 (1.5%)
Michael Janssen 113 (1.5%)
Erik Price 100 (1.3%)
Lee Harr 88 (1.1%)
Terry Carroll 87 (1.1%)
Daniel Ehrenberg 78 (1.0%)
Abel Daniel 76 (1.0%)
Don Arnold 75 (1.0%)

2004 (7178 posts)

Alan Gauld 699 (9.7%)
Danny Yoo 530 (7.4%)
Kent Johnson 451 (6.3%)
Lloyd Kvam 146 (2.0%)
Dick Moores 145 (2.0%)
Liam Clarke 140 (2.0%)
Brian van den Broek 122 (1.7%)
Karl Pflästerer 109 (1.5%)
Jacob S. 101 (1.4%)
Andrei 99 (1.4%)
Chad Crabtree 93 (1.3%)
Bob Gailer 91 (1.3%)
Magnus Lycka 91 (1.3%)
Terry Carroll 88 (1.2%)
Marilyn Davis 84 (1.2%)
Gregor Lingl 73 (1.0%)
Dave S 73 (1.0%)
Bill Mill 71 (1.0%)
Isr Gish 71 (1.0%)
Lee Harr 67 (0.9%)

2005 (9705 posts)

Kent Johnson 1189 (12.3%)
Danny Yoo 767 (7.9%)
Alan Gauld 565 (5.8%)
Alan G 317 (3.3%)
Liam Clarke 298 (3.1%)
Max Noel 203 (2.1%)
Nathan Pinno 197 (2.0%)
Brian van den Broek 190 (2.0%)
Jacob S. 154 (1.6%)
jfouhy at paradise.net.nz 135 (1.4%)
Alberto Troiano 128 (1.3%)
Bernard Lebel 119 (1.2%)
Joseph Quigley 101 (1.0%)
Terry Carroll 93 (1.0%)
Andrei 79 (0.8%)
D. Hartley 77 (0.8%)
John Fouhy 73 (0.8%)
bob 73 (0.8%)
Hugo González Monteverde 72 (0.7%)
Orri Ganel 69 (0.7%)

2006 (7521 posts)

Kent Johnson 913 (12.1%)
Alan Gauld 821 (10.9%)
Danny Yoo 448 (6.0%)
Luke Paireepinart 242 (3.2%)
John Fouhy 187 (2.5%)
Chris Hengge 166 (2.2%)
Bob Gailer 134 (1.8%)
Dick Moores 129 (1.7%)
Asrarahmed Kadri 119 (1.6%)
Terry Carroll 111 (1.5%)
Python 94 (1.2%)
Mike Hansen 74 (1.0%)
Liam Clarke 72 (1.0%)
Carroll, Barry 67 (0.9%)
Kermit Rose 66 (0.9%)
anil maran 66 (0.9%)
Hugo González Monteverde 65 (0.9%)
wesley chun 63 (0.8%)
Dave S 58 (0.8%)
Christopher Spears 53 (0.7%)

2007 (7600 posts)

Kent Johnson 1052 (13.8%)
Alan Gauld 977 (12.9%)
Luke Paireepinart 260 (3.4%)
Dick Moores 203 (2.7%)
Eric Brunson 164 (2.2%)
Bob Gailer 144 (1.9%)
Terry Carroll 128 (1.7%)
Tiger12506 112 (1.5%)
John Fouhy 105 (1.4%)
Ricardo Aráoz 93 (1.2%)
Rikard Bosnjakovic 93 (1.2%)
bhaaluu 88 (1.2%)
elis aeris 83 (1.1%)
Andreas Kostyrka 77 (1.0%)
Michael Langford 68 (0.9%)
shawn bright 63 (0.8%)
Tim Golden 62 (0.8%)
Dave Kuhlman 62 (0.8%)
wormwood_3 53 (0.7%)
wesley chun 53 (0.7%)

2008 (6624 posts)

Kent Johnson 931 (14.1%)
Alan Gauld 820 (12.4%)
bob gailer 247 (3.7%)
Dick Moores 191 (2.9%)
W W 142 (2.1%)
Wayne Watson 106 (1.6%)
John F
Re: [Tutor] Top posters to tutor list for 2008
On Fri, Jan 2, 2009 at 2:06 AM, Luke Paireepinart wrote:
> Yeah, I agree. Interesting script, Kent. Surprisingly short.
>
> I didn't realize I wasn't in the top 5 posters for 2008! I guess I
> have a new year's resolution to be more helpful.

Or ask more questions, that works too!

Kent
Re: [Tutor] Top posters to tutor list for 2008
"Kent Johnson" wrote

> that generates it. The lists for previous years (back to 2003) are at
> the end so everyone on the list doesn't hit the archives to find out

> Alan, I thought you might have passed me this year but we are both off
> a little :-)

I think the figures reflect the general level of activity on the list. We seem to have peaked in 2005...

Statistics, don't you love 'em :-)

2003
Danny Yoo 617
Alan Gauld 421
Jeff Shannon 283
Magnus Lycka 242
Total =~ 1500

2004
Alan Gauld 699
Danny Yoo 530
Kent Johnson 451
Lloyd Kvam 146
Total =~ 1800

2005
Kent Johnson 1189
Danny Yoo 767
Alan Gauld 565
Alan G 317
Total =~ 2800
(If you count both of my totals I get back into 2nd place :-)

2006
Kent Johnson 913
Alan Gauld 815
Danny Yoo 448
Luke Paireepinart 242
Total =~ 2400

2007
Kent Johnson 1052
Alan Gauld 938
Luke Paireepinart 260
Dick Moores 203
Total =~ 2400

2008
Kent Johnson 931
Alan Gauld 820
bob gailer 247
Dick Moores 191
Total =~ 2200

Alan G
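Alan's aside about counting both of his 2005 totals is easy to check with a small alias table. A minimal Python 3 sketch, using the four 2005 figures quoted above; the `ALIASES` table and the `merge_aliases` helper are assumptions made for illustration and are not part of Kent's script:

```python
from collections import Counter

# Hypothetical alias table: variant spelling -> canonical name.
ALIASES = {'Alan G': 'Alan Gauld'}


def merge_aliases(counts, aliases):
    """Add each author's count to the canonical name from the alias table."""
    merged = Counter()
    for name, n in counts.items():
        merged[aliases.get(name, name)] += n
    return merged


# The 2005 top four as posted in this thread.
top_2005 = {'Kent Johnson': 1189, 'Danny Yoo': 767,
            'Alan Gauld': 565, 'Alan G': 317}

for name, n in merge_aliases(top_2005, ALIASES).most_common():
    print(name, n)
```

With the two entries combined (565 + 317 = 882), Alan Gauld moves ahead of Danny Yoo's 767, which matches the remark in the message.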
Re: [Tutor] Top posters to tutor list for 2008
Yeah, I agree. Interesting script, Kent. Surprisingly short.

I didn't realize I wasn't in the top 5 posters for 2008! I guess I have a new year's resolution to be more helpful.

Happy New Year, everyone!

On Thu, Jan 1, 2009 at 9:23 AM, jadrifter wrote:
> On Thu, 2009-01-01 at 09:34 -0500, Kent Johnson wrote:
>> For several years I have been using a simple script to find the top 20
>> posters to the tutor list by web-scraping the archive pages. I thought
>> others might be interested so here is the list for 2008 and the script
>> that generates it. The lists for previous years (back to 2003) are at
>> the end so everyone on the list doesn't hit the archives to find out
>> :-)
>>
>> The script gives a simple example of datetime, urllib2 and
>> BeautifulSoup. It consolidates names that vary by case but other
>> variations are not detected.
>
> Kent,
>
> Thank you for this. I've been thinking about a web scraping script but
> didn't have a clue how to go about it. Seeing someone else's practical
> implementation is a huge help!
>
> A little serendipity to start 2009 off with.
>
> Happy New Year to all.
>
> John
Re: [Tutor] Top posters to tutor list for 2008
On Thu, 2009-01-01 at 09:34 -0500, Kent Johnson wrote:
> For several years I have been using a simple script to find the top 20
> posters to the tutor list by web-scraping the archive pages. I thought
> others might be interested so here is the list for 2008 and the script
> that generates it. The lists for previous years (back to 2003) are at
> the end so everyone on the list doesn't hit the archives to find out
> :-)
>
> The script gives a simple example of datetime, urllib2 and
> BeautifulSoup. It consolidates names that vary by case but other
> variations are not detected.

Kent,

Thank you for this. I've been thinking about a web scraping script but didn't have a clue how to go about it. Seeing someone else's practical implementation is a huge help!

A little serendipity to start 2009 off with.

Happy New Year to all.

John
[Tutor] Top posters to tutor list for 2008
For several years I have been using a simple script to find the top 20 posters to the tutor list by web-scraping the archive pages. I thought others might be interested so here is the list for 2008 and the script that generates it. The lists for previous years (back to 2003) are at the end so everyone on the list doesn't hit the archives to find out :-)

The script gives a simple example of datetime, urllib2 and BeautifulSoup. It consolidates names that vary by case but other variations are not detected.

Alan, I thought you might have passed me this year but we are both off a little :-) Somehow I have posted an average of 2.8 times per day for the last four years...

Happy New Year everyone!

Kent

2008

Kent Johnson 931
Alan Gauld 820
bob gailer 247
Dick Moores 191
W W 142
Wayne Watson 106
John Fouhy 97
Steve Willoughby 91
Lie Ryan 88
bhaaluu 85
Marc Tompkins 83
Michael Langford 71
Tiger12506 70
Andreas Kostyrka 64
Dinesh B Vadhia 64
wesley chun 58
Tim Golden 57
Chris Fuller 54
Ricardo Aráoz 53
spir 53

''' Counts all posts to Python-tutor by author'''
# -*- coding: latin-1 -*-

from datetime import date, timedelta
import operator, urllib2
from BeautifulSoup import BeautifulSoup

today = date.today()

for year in [2008]:
    startDate = date(year, 1, 1)
    endDate = date(year, 12, 31)
    thirtyOne = timedelta(days=31)
    counts = {}

    # Collect all the counts for a year by scraping the monthly author archive pages
    while startDate < endDate and startDate < today:
        dateString = startDate.strftime('%Y-%B')

        url = 'http://mail.python.org/pipermail/tutor/%s/author.html' % dateString
        data = urllib2.urlopen(url).read()
        soup = BeautifulSoup(data)

        li = soup.findAll('li')[2:-2]

        for l in li:
            name = l.i.string.strip()
            counts[name] = counts.get(name, 0) + 1

        startDate += thirtyOne

    # Consolidate names that vary by case under the most popular spelling
    nameMap = dict() # Map lower-case name to most popular name
    for name, count in sorted(counts.iteritems(), key=operator.itemgetter(1), reverse=True):
        lower = name.lower()
        if lower in nameMap:
            # Add counts for a name we have seen already
            counts[nameMap[lower]] += count
        else:
            nameMap[lower] = name

    print
    print year
    print ''
    for name, count in sorted(counts.iteritems(), key=operator.itemgetter(1), reverse=True)[:20]:
        print name.encode('latin-1', 'xmlcharrefreplace'), count
    print

# Results as of 12/31/2008:
'''
2003

Danny Yoo 617
Alan Gauld 421
Jeff Shannon 283
Magnus Lycka 242
Bob Gailer 195
Magnus =?iso-8859-1?Q?Lyck=E5?= 166
alan.ga...@bt.com 161
Kirk Bailey 155
Gregor Lingl 152
Lloyd Kvam 142
Andrei 118
Sean 'Shaleh' Perry 117
Magnus Lyckå 113
Michael Janssen 113
Erik Price 100
Lee Harr 88
Terry Carroll 87
Daniel Ehrenberg 78
Abel Daniel 76
Charlie Clark 74

2004

Alan Gauld 699
Danny Yoo 530
Kent Johnson 451
Lloyd Kvam 146
Dick Moores 145
Liam Clarke 140
Brian van den Broek 122
Karl Pflästerer 109
Jacob S. 101
Andrei 99
Chad Crabtree 93
Bob Gailer 91
Magnus Lycka 91
Terry Carroll 88
Marilyn Davis 84
Gregor Lingl 73
Dave S 73
Bill Mill 71
Isr Gish 71
Lee Harr 67

2005

Kent Johnson 1189
Danny Yoo 767
Alan Gauld 565
Alan G 317
Liam Clarke 298
Max Noel 203
Nathan Pinno 197
Brian van den Broek 190
Jacob S. 154
jfouhy at paradise.net.nz 135
Alberto Troiano 128
Bernard Lebel 119
Joseph Quigley 101
Terry Carroll 93
Andrei 79
D. Hartley 77
John Fouhy 73
bob 73
Hugo González Monteverde 72
Orri Ganel 69

2006

Kent Johnson 913
Alan Gauld 815
Danny Yoo 448
Luke Paireepinart 242
John Fouhy 187
Chris Hengge 166
Bob Gailer 134
Dick Moores 129
Asrarahmed Kadri 119
Terry Carroll 111
Python 94
Mike Hansen 74
Liam Clarke 72
Carroll, Barry 67
Kermit Rose 66
anil maran 66
Hugo González Monteverde 65
wesley chun 63
Christopher Spears 53
Michael Lange 51

2007

Kent Johnson 1052
Alan Gauld 938
Luke Paireepinart 260
Dick Moores 203
Eric Brunson 164
Terry Carroll 128
Tiger12506 112
John Fouhy 105
Bob Gailer 97
Ricardo Aráoz 93
Rikard Bosnjakovic 93
bhaaluu 88
elis aeris 83
Andreas Kostyrka 77
Michael Langford 68
shawn bright 63
Tim Golden 62
Dave Kuhlman 62
wormwood_3 53
wesley chun 53

2008

Kent Johnson 931
Alan Gauld 820
bob gailer 247
Dick Moores 191
W W 142
Wayne Watson 106
John Fouhy 97
Steve Willoughby 91
Lie Ryan 88
bhaaluu 85
Marc Tompkins 83
Michael Langford 71
Tiger12506 70
Andreas Kostyrka 64
Dinesh B Vadhia 64
wesley chun 58
Tim Golden 57
Chris Fuller 54
Ricardo Aráoz 53
spir 53
'''
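The script walks through a year by adding 31 days at a time from January 1, which lands once in each month. As an alternative, the twelve monthly archive URLs can be built directly. This is only a sketch in Python 3: `month_urls` is a hypothetical helper, the URL pattern is the one used in the script, and `calendar.month_name` is assumed to return English names (as it does in the default C locale) so that the paths look like `2008-January`. Unlike the script, it does not stop at today's date.

```python
import calendar

# URL pattern taken from the script above.
ARCHIVE = 'http://mail.python.org/pipermail/tutor/%s/author.html'


def month_urls(year):
    """Return the author-index URL for each of the twelve months of `year`."""
    urls = []
    for month in range(1, 13):
        # calendar.month_name[1] is 'January' in an English/C locale.
        date_string = '%d-%s' % (year, calendar.month_name[month])
        urls.append(ARCHIVE % date_string)
    return urls


for url in month_urls(2008)[:3]:
    print(url)
```

Building the list up front also makes it easy to test the URL logic without fetching anything from the archive server.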