This is very exciting and something I have been scratching at in science publishing. Now the Openbiblio team has produced BibSoup/BibServer and it seems your application could be very well suited to BibSoup.
On Tue, Jan 24, 2012 at 4:25 PM, Guo Xu <[email protected]> wrote:
> Hi folks,
>
> I have been working on visualizing the networks of academic publishing
> in economics. Here's an example for the Quarterly Journal of
> Economics:
>
> http://www.guoxu.org/econmap/map.html
>
> A link indicates that two economists have published together in the
> QJE. The strength of a link is defined by how many times they have
> published together.
>
> The size of the node indicates how many times an author has published
> in the QJE. Bigger nodes have published more often.
>
> Finally, the color indicates the ranking of the economist's alma
> mater. Blue indicates that the author obtained his/her PhD from a top
> 10 university (according to
>
> http://www.topuniversities.com/university-rankings/world-university-rankings/2011/subject-rankings/social-sciences/economics
> );
> orange indicates a top 11-20 university; green is for top 21-30 and
> red is for all universities beyond top 30.
>
> Couple of interesting points:
>
> - It seems that the core (those at the centre) are almost all made up
> by top 10 authors. They tend to be well-connected.
>

In the UK this might be called "the old boy network" - the unofficial network of (men) who have been to the same school / university. It does not necessarily indicate absolute value but it is often correlated with getting grants, etc. [I have been in both Blue and Red universities (in science)]

>
> - The hubs are: Phillipe Aghion, Daron Acemoglu, Marianne Bertrand
>
> - There are rarely authors beyond the top 30 who get published in the QJE.
>
> The visualization is done with D3. But it is very slow on older
> computers. Does anyone have ideas for optimizing this?
>

Yes. This is a dynamics exercise and (I assume) you have a pairwise repulsion term to spread the points out. Many of your points are 0-connected, so you spend a lot of time computing their repulsions for nothing.
Unless there is some other hidden coordinate, I would just separate the network into its disjoint graphs. It will be hugely faster, as instead of O[N*N] you have O[N] or less (there is a power law distribution of cluster size).

>
> Also, I have a lot more characteristics lying around that can be
> displayed (e.g. gender - btw only 10% of the authors are female), but
> I do not really know how to do it dynamically.
>
> Finally, I would ideally like to do the same visualization for the
> *entire* network of economist. I have a 300 MB dataset scraped from
> Repec that gives me information on co-authoring for virtually all
> economics journals and working paper series. But obviously this will
> be too slow to visualize so it would be great if someone had
> experience in working with such big datasets (the whole dataset has
> ~30.000 economists, which results in a 30.000 x 30.000 data matrix!!)
>

You will certainly find interest on openbiblio-dev, as we are looking for bibliographic data sets and things to do with them.

>
> Anyway, let me know what you think and looking forward to suggestions!
>
> Guo
>
> _______________________________________________
> okfn-discuss mailing list
> [email protected]
> http://lists.okfn.org/mailman/listinfo/okfn-discuss
>

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
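
P.S. In case it helps, here is a rough sketch of the split into disjoint graphs suggested above. It is in Python rather than your D3/JavaScript, purely for illustration, and every name in it (`connected_components`, `pair_count`, the toy `authors` and `coauthored` lists) is made up for the example, not taken from your data or code. It assumes every author appearing in an edge is also in the node list. It groups authors with a union-find and then compares how many pairwise repulsion terms you would evaluate on the whole graph against the total when each component is handled on its own.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into disjoint components using union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two components

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Largest component first.
    return sorted(groups.values(), key=len, reverse=True)


def pair_count(n):
    """Number of pairwise repulsion terms among n nodes."""
    return n * (n - 1) // 2


# Hypothetical toy data: "f" has no co-authors at all.
authors = ["a", "b", "c", "d", "e", "f"]
coauthored = [("a", "b"), ("b", "c"), ("d", "e")]

components = connected_components(authors, coauthored)
whole_graph_pairs = pair_count(len(authors))
split_pairs = sum(pair_count(len(c)) for c in components)
```

Each component can then be laid out independently and placed side by side, and the 0-connected authors need no dynamics at all. How much this saves on the real data depends on how large the biggest component turns out to be.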
