On Sep 2, 7:06 pm, Steven D'Aprano <[EMAIL PROTECTED]
cybersource.com.au> wrote:
> On Tue, 02 Sep 2008 09:48:32 -0700, cnb wrote:
> > I have a bunch of files consisting of movie reviews.
>
> > For each file I construct a list of reviews, and then for each new file I
> > merge the reviews so that in the end I have a list of reviewers and, for
> > each reviewer, all their reviews.
>
> > What is the fastest way to do this?
>
> Use the timeit module to find out.
>
> > 1. Create one file with reviews, open the next file and for each review see
> > if the reviewer exists; if so add the review, else create a new reviewer.
>
> > 2. Create all the separate files with reviews, then mergesort them?
>
> The answer will depend on whether you have three reviews or three
> million, whether each review is twenty words or twenty thousand words,
> and whether you have to do the merging once only or over and over again.
>
> --
> Steven



I merge only once. Each review has 3 fields: date, rating, and customerid. In
total I'll be parsing between 10K and 100K reviews, eventually 450K.
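
For a one-time merge at this size, option 1 can be sketched with a dict
keyed on customerid, which makes the "does the reviewer exist" check a
constant-time lookup on average. This is only a rough sketch under
assumptions: the function name merge_reviews, the tuple order (date,
rating, customerid), and the sample data are all hypothetical, and the
parsing of the actual files is left out. The timeit call at the end shows
one way to measure it, as suggested above.

```python
import timeit
from collections import defaultdict


def merge_reviews(files):
    """Group reviews by reviewer.

    `files` is assumed to be an iterable of already-parsed files, each an
    iterable of (date, rating, customerid) tuples (hypothetical layout).
    """
    by_reviewer = defaultdict(list)
    for reviews in files:
        for date, rating, customerid in reviews:
            # A missing reviewer is created automatically by defaultdict.
            by_reviewer[customerid].append((date, rating))
    return by_reviewer


# Hypothetical sample data standing in for two parsed review files.
file_a = [("2008-09-01", 4, "c1"), ("2008-09-02", 2, "c2")]
file_b = [("2008-09-03", 5, "c1")]

merged = merge_reviews([file_a, file_b])

# Time 1000 runs of the merge on the sample data.
elapsed = timeit.timeit(lambda: merge_reviews([file_a, file_b]), number=1000)
```

With real data, the same timing call could be pointed at a mergesort-based
version to compare the two approaches on the actual review counts.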
--
http://mail.python.org/mailman/listinfo/python-list
