Howdy, learners!
I've got a quick question about Naive Bayes that I couldn't find any
answers/examples for.
I have a whole lot of data that I need to fit with an NB classifier. What's
the best way to do this - sequentially, batch by batch, or all as one massive
array? I have labeled all of my data, it's stored in a database, and it
vectorizes nicely, so no worries there.
Which of these is more appropriate (in pseudocode)?

gnb = GaussianNB()
for featureset, category in featuresets_from_database:
    gnb.fit(arrayify(featureset), [category])
gnb.predict(target)
===== OR ======
gnb = GaussianNB()
features = []
categories = []
for featureset, category in featuresets_from_database:
    features.append(arrayify(featureset))
    categories.append(category)
gnb.fit(features, categories)
gnb.predict(target)
I'd greatly prefer the first scenario for a variety of reasons, but I
thought it would be best to ask here, since I couldn't find any examples of
fitting a model over multiple calls.
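From skimming the docs, it looks like the incremental route would go through
partial_fit rather than repeated fit calls - here's a minimal sketch of what
I'm imagining (the batch data here is synthetic and just a stand-in for my
featuresets_from_database; am I on the right track?):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for batches pulled from the database: each batch is
# a 2-D array of feature vectors plus its array of category labels.
batches = [
    (np.array([[0.0, 1.0], [1.0, 1.5]]), np.array([0, 1])),
    (np.array([[0.2, 0.9], [1.1, 1.4]]), np.array([0, 1])),
]

gnb = GaussianNB()
for X_batch, y_batch in batches:
    # classes= must cover the full label set so the model knows every
    # category up front, even ones absent from an early batch.
    gnb.partial_fit(X_batch, y_batch, classes=[0, 1])

pred = gnb.predict(np.array([[1.0, 1.5]]))
```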
Which avenue should I pursue? This project is F/OSS and will ultimately be
written up as a tutorial as well, so I'd like to make sure I get all of my
practices correct.
Thanks very much!
Rich
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general