Hey Caroline,

Tony Hirst has documented using various command line tools for working with
really large files in this post:
http://blog.ouseful.info/2011/06/04/playing-with-large-ish-csv-files-and-using-them-as-a-database-edina-openurl-logs/
and in the first post it links to.

I've found that whilst Refine can cope with really big files, it does get a
bit sluggish. A good approach is to use some of the tools Tony suggests to
extract the subsets of the data you want to play with, and then work from
there.
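
If you'd rather script the subset step than use the command line tools
Tony describes, here is a rough sketch of the same idea in Python. It is
my own illustration rather than anything from Tony's post: the function
name, the file names and the "PRACTICE" column are all made up, so check
them against the real header row. It reads the file one row at a time, so
the whole thing never has to fit in memory.

```python
import csv


def extract_subset(src_path, dst_path, column, wanted):
    """Copy rows whose `column` equals `wanted` from src_path to dst_path.

    The source file is streamed row by row, so its size does not matter.
    Returns the number of rows written (excluding the header).
    """
    kept = 0
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # extrasaction="ignore" skips stray extra fields on malformed rows.
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            # Strip whitespace as a precaution in case fields are padded
            # (an assumption, not something I have checked in the data).
            if (row.get(column) or "").strip() == wanted:
                writer.writerow(row)
                kept += 1
    return kept


# Hypothetical usage (file names, column name and code are placeholders):
# extract_subset("prescribing.csv", "one_practice.csv", "PRACTICE", "A81001")
```

The smaller output file should then load into Refine (or even a
spreadsheet) much more comfortably.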

All the best

Tim
@timdavies


> ---------- Forwarded message ----------
> From: Tom Steinberg <[email protected]>
> To: "mySociety public, general purpose discussion list" <
> [email protected]>
> Cc:
> Date: Wed, 11 Jan 2012 07:46:34 +0000
> Subject: Re: [mySociety:public] GP prescriptions
> Hi Caroline,
>
> A bit slow to reply, but I believe Google Refine is good at things too
> big for Excel:
>
> http://code.google.com/p/google-refine/
>
> best,
>
> Tom
>
> On 27 December 2011 15:55, Caroline Flyn <[email protected]> wrote:
> > Hi all,
> >
> > Does anyone know of anyone/ any organisations who are analysing the new
> > GP-practice level prescription data?
> >
> > Or could you recommend suitable software that could be used? It's all in CSV
> > format, but obviously an enormous file.
> >
> > The NHS warns:
> > "Note: Due to the large file size (over 500MB) standard spreadsheet
> > applications will not be able to handle the volumes of data contained in the
> > monthly datasets. Data users will need to analyse the information using
> > specialist data-handling software."
> >
> >
> > http://www.ic.nhs.uk/news-and-events/news/gp-practice-level-prescribing-data-now-available
> >
> > All advice welcomed...
> >
> > _______________________________________________
> > developers-public mailing list
> > [email protected]
> > https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public
> >
> > Unsubscribe:
> > https://secure.mysociety.org/admin/lists/mailman/options/developers-public/tom%40tomsteinberg.co.uk
>



-- 


+44 (0)7834 856 303
@timdavies
http://www.timdavies.org.uk