Howdy, learners!

I've got a quick question about Naive Bayes that I couldn't find any
answers or examples for.

I have a whole lot of data that I need to fit with an NB. What's the best
way to do this - sequentially, or all as one massive array? I have already
classified all of my data, it's stored in a database, and it vectorizes
nicely, so no worries there.

Which of these is more appropriate (in pseudo)?

gnb = GaussianNB()
for featureset, category in featuresets_from_database:
    # fit one sample/batch at a time
    gnb.fit(arrayify(featureset), [category])

gnb.predict(target)

===== OR ======

gnb = GaussianNB()
features = []
categories = []

for featureset, category in featuresets_from_database:
    features.append(arrayify(featureset))
    categories.append(category)

# fit once on everything
gnb.fit(features, categories)
gnb.predict(target)

I'd greatly prefer the first scenario for a variety of reasons, but I
thought it would be best to ask here since I couldn't find any examples of
multiple fittings.
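For what it's worth, here is a minimal sketch of the incremental option using scikit-learn's `partial_fit` API (which `GaussianNB` does support, unlike repeated `fit` calls, which discard previous state each time). The batch data and labels below are made up purely for illustration:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()

# All class labels must be declared up front for incremental learning.
classes = np.array([0, 1])

# Hypothetical stand-in for featuresets_from_database: (X, y) batches.
batches = [
    (np.array([[1.0, 2.0], [1.5, 1.8]]), np.array([0, 0])),
    (np.array([[5.0, 8.0], [6.0, 9.0]]), np.array([1, 1])),
]

for X_batch, y_batch in batches:
    # classes= is required on (at least) the first partial_fit call.
    gnb.partial_fit(X_batch, y_batch, classes=classes)

print(gnb.predict(np.array([[1.2, 1.9]])))
</imports>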

Which avenue should I pursue? This project is F/OSS and will ultimately be
written up as a tutorial as well, so I'd like to make sure I get all of my
practices correct.

Thanks very much!
Rich
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general