The following is the code I use. I got it from the web, but forgot the link.

def k_fold_cross_validation(X, K, randomise=False):
    """
    Generates K (training, validation) pairs from the items in X.

    Each pair is a partition of X, where validation is a list of
    roughly len(X)/K items, so each training list holds roughly
    (K-1)*len(X)/K items.

    If randomise is true, a copy of X is shuffled before partitioning,
    otherwise its order is preserved in training and validation.
    """
    if randomise:
        from random import shuffle
        X = list(X)  # copy, so the caller's sequence is not shuffled
        shuffle(X)
    for k in range(K):  # range works on both Python 2 and 3
        # Item i goes to the validation set of fold i % K.
        training = [x for i, x in enumerate(X) if i % K != k]
        validation = [x for i, x in enumerate(X) if i % K == k]
        yield training, validation
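
A minimal usage sketch, in case it helps. It restates the generator so the snippet runs on its own under Python 3. The ten-item corpus and the seed value 42 are made up for illustration, and seeding the random module before each call is only one possible way to get the repeatable shuffles you asked about.

```python
import random


def k_fold_cross_validation(X, K, randomise=False):
    """Yield K (training, validation) pairs from the items in X."""
    if randomise:
        X = list(X)        # copy so the caller's data is left untouched
        random.shuffle(X)
    for k in range(K):
        training = [x for i, x in enumerate(X) if i % K != k]
        validation = [x for i, x in enumerate(X) if i % K == k]
        yield training, validation


# Hypothetical corpus: ten integers standing in for documents.
corpus = list(range(10))

# Plain 5-fold split, original order preserved.
folds = list(k_fold_cross_validation(corpus, K=5))
for training, validation in folds:
    print(validation, training)

# Seeding the module-level generator makes a shuffled split repeatable.
random.seed(42)  # arbitrary seed, chosen only for this example
first = list(k_fold_cross_validation(corpus, K=5, randomise=True))
random.seed(42)
second = list(k_fold_cross_validation(corpus, K=5, randomise=True))
```

With ten items and K=5, each validation list holds two items and each training list eight. Repeating the whole procedure several times would just mean looping over different seeds.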


Cheers,
dksr

On Feb 20, 1:15 am, Mark Livingstone <livingstonem...@gmail.com>
wrote:
> Hello,
>
> I am doing research, as part of a university research scholarship, into
> using data compression for classification. What I am looking for is Python
> code to handle the cross-fold validation side of things for me - that
> will take my testing / training corpus and create the testing /
> training files after asking me for number of folds and number of times
> (or maybe allow me to enter a random seed or offset instead of times.)
> I could then either hook my classifier into the program or use it in a
> separate step.
>
> Probably not very hard to write, but why reinvent the wheel ;-)
>
> Thanks in advance,
>
> MarkL

-- 
http://mail.python.org/mailman/listinfo/python-list
