Hi, I entered Kaggle's CTR challenge using scikit-learn in Python. Although it gave me a reasonable score, I would like to explore Spark MLlib, which I haven't used before. I have also tried Vowpal Wabbit.
Can someone who has already worked with MLlib tell me whether it supports online learning or batch SGD and, if so, how well it performs? I don't have a Spark cluster, just my laptop. Any suggestions? The training data has close to 45 million rows in CSV format, and the test data has close to 4.2 million rows in the same format. Thanks, Niranjan