Following is the code I use. I got it from the web, but I have forgotten the link.

    from random import shuffle

    def k_fold_cross_validation(X, K, randomise=False):
        """
        Generates K (training, validation) pairs from the items in X.

        Each pair is a partition of X, where validation is an iterable
        of length about len(X)/K, so each training iterable is of length
        about (K-1)*len(X)/K. The lengths are exact when K divides len(X).

        If randomise is true, a copy of X is shuffled before partitioning;
        otherwise its order is preserved in training and validation.
        """
        # Work on a copy: X may be any iterable, and the caller's data is never shuffled.
        X = list(X)
        if randomise:
            shuffle(X)
        # range rather than xrange, so this also runs on Python 3.
        for k in range(K):
            # Item i is held out for validation in fold i % K.
            training = [x for i, x in enumerate(X) if i % K != k]
            validation = [x for i, x in enumerate(X) if i % K == k]
            yield training, validation

Cheers,
dksr

On Feb 20, 1:15 am, Mark Livingstone <livingstonem...@gmail.com> wrote:
> Hello,
>
> I am doing research as part of a Uni research Scholarship into using
> data compression for classification. What I am looking for is python
> code to handle the crossfold validation side of things for me - that
> will take my testing / training corpus and create the testing /
> training files after asking me for number of folds and number of times
> (or maybe allow me to enter a random seed or offset instead of times.)
> I could then either hook my classifier into the program or use it in a
> separate step.
>
> Probably not very hard to write, but why reinvent the wheel ;-)
>
> Thanks in advance,
>
> MarkL
--
http://mail.python.org/mailman/listinfo/python-list
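
For the "number of folds and number of times" (or random seed) part of the original request, here is a minimal sketch of one way to repeat the split with a fixed seed. The name repeated_k_fold and its parameters (repeats, seed) are my own assumptions and not part of the posted code, and the corpus in the example is just a stand-in list. It only yields in-memory lists; writing the testing / training files is left out.

```python
import random


def repeated_k_fold(items, k, repeats=1, seed=None):
    """Yield (repeat, fold, training, validation) for `repeats` shuffled K-fold splits.

    Hypothetical helper for illustration; it reuses the i % k rule of
    k_fold_cross_validation but is not part of the posted code.
    """
    # A private generator, so a given seed reproduces the same splits
    # without touching the global random state.
    rng = random.Random(seed)
    for r in range(repeats):
        shuffled = list(items)  # fresh copy per repeat; the caller's data is untouched
        rng.shuffle(shuffled)
        for fold in range(k):
            # Item i of the shuffled copy is held out in fold i % k.
            training = [x for i, x in enumerate(shuffled) if i % k != fold]
            validation = [x for i, x in enumerate(shuffled) if i % k == fold]
            yield r, fold, training, validation


if __name__ == "__main__":
    corpus = ["doc%d" % n for n in range(10)]  # stand-in corpus of 10 items
    for r, fold, training, validation in repeated_k_fold(corpus, k=5, repeats=2, seed=42):
        print(r, fold, len(training), len(validation))
```

Passing the same seed gives the same splits again, so a run can be reproduced; the data is reshuffled before each repeat. A classifier could be hooked in inside the loop, in place of the print call.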