It is sometimes difficult to obtain realistic "Big" data sets. A Revolution Analytics blog post yesterday
http://blog.revolutionanalytics.com/2014/04/predict-which-shoppers-will-become-repeat-buyers.html mentioned the competition http://www.kaggle.com/c/acquire-valued-shoppers-challenge, which has a very large data set that may be useful for investigating performance bottlenecks. You need to sign up to download the data, and you must agree to use it only for the purposes of the competition and to remove it once the competition is over.