PS: It took 2 hours on my notebook to parse the dataset into the linearized version (stored as "parsed.json").
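
For anyone who wants a starting point without opening the notebook, here is a minimal sketch of the kind of pass involved: read the line-delimited JSON and count which packages show up in the most requirements files. It is not the code from index.ipynb. The key name "content" is an assumption (the thread only says each blob holds the repo name, the file path and the file contents), and the helper names are made up for illustration. The parsing is deliberately naive and ignores most of pip's requirements syntax.

```python
import json
import re
from collections import Counter

# Assumed key name; check one line of data.json and adjust if it differs.
CONTENT_KEY = "content"

# Leading package name of a requirement line, e.g. "Django" in "Django>=1.8".
NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def iter_package_names(contents):
    """Yield one normalized package name per plain requirement line."""
    for raw in contents.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blank lines, pip options (-r, -e, ...) and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = NAME_RE.match(line)
        if match:
            # PEP 503 style normalization: lowercase, runs of -_. become "-".
            yield re.sub(r"[-_.]+", "-", match.group(0)).lower()


def top_packages(json_lines, n=10):
    """Count each package once per requirements file; return the n most common."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        blob = json.loads(line)
        counts.update(set(iter_package_names(blob.get(CONTENT_KEY) or "")))
    return counts.most_common(n)
```

To try it, open data.json and pass the file object as json_lines; the function iterates over it one line at a time, so the whole file never has to sit in memory.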
--
Paul Eipper

On Thu, Mar 9, 2017 at 7:39 PM, Paul Eipper <[email protected]> wrote:

> I had some fun parsing and plotting the data (very simple, just the top
> packages for now). See here:
> https://github.com/lkraider/requirements-dataset/blob/master/index.ipynb
>
> Let me know if you would accept a pull request so others can use that as a
> starting point.
>
> Regards,
>
> --
> Paul Eipper
>
> On Wed, Mar 8, 2017 at 1:36 PM, Nick Timkovich <[email protected]>
> wrote:
>
>> Looks like a fun chunk of data, what's the query you used? Can you add a
>> README to the repo with some description if others want to iterate on it
>> (maybe look into setup.py's?)
>>
>> Nick
>>
>> On Tue, Mar 7, 2017 at 5:06 AM, Jannis Gebauer <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I ran a couple of queries against GitHub's public BigQuery dataset [0]
>>> last week. I’m interested in requirement files in particular, so I ran a
>>> query extracting all available requirement files.
>>>
>>> Since queries against this dataset are rather expensive ($7 on all
>>> repos), I thought I’d share the raw data here [1]. The data contains the
>>> repo name, the requirements file path and the contents of the file. Every
>>> line is a JSON blob; read it with:
>>>
>>>     import json
>>>
>>>     with open('data.json') as f:
>>>         for line in f:
>>>             data = json.loads(line)
>>>
>>> Maybe that’s of interest to some of you.
>>>
>>> If you have any ideas on what to do with the data, please let me know.
>>>
>>> --
>>> Jannis Gebauer
>>>
>>> [0]: https://cloud.google.com/bigquery/public-data/github
>>> [1]: https://github.com/jayfk/requirements-dataset
>>>
>>> _______________________________________________
>>> Distutils-SIG maillist - [email protected]
>>> https://mail.python.org/mailman/listinfo/distutils-sig
