Thanks for noticing.  I will commit the patch shortly.

On Mon, Jan 24, 2011 at 1:56 PM, Chris Schilling
<[email protected]> wrote:

> Thanks a lot Ted!  I haven't had too much time to investigate.  I
> appreciate the patch.
>
> Chris
>
> On Jan 23, 2011, at 10:38 PM, Ted Dunning wrote:
>
> > Chris,
> >
> > This looks better:
> >
> > 0.58    189545.00    63692.00    3551.55    1.0000002e-08    1.0007058e-08    3000    -1.447    69.79    none
> > body=gun        1.3    talk.politics.guns         16.0    -0.2414000554631249     17.0    -0.129711508160663
> > body=windows    1.2    comp.os.ms-windows.misc    3.0     -0.19417418927173208    14.0    -0.16734214222498917
> > body=sale       1.2    misc.forsale               1.0     -0.141236520301033      4.0     -0.13078372920403203
> > body=car        1.2    rec.autos                  13.0    -0.15182947211484465    10.0    -0.12953193026882154
> > body=bike       1.1    rec.motorcycles            6.0     -0.1188138958409118     5.0     -0.10294447109840156
> > body=x          1.1    comp.windows.x             0.0     0.16208435089363293     13.0    -0.1515644646592169
> > body=israel     1.0    talk.politics.mideast      15.0    -0.14829080862283103    18.0    -0.12856657991660764
> > body=space      1.0    sci.space                  4.0     -0.13411179253228006    15.0    -0.13059284295356896
> > body=mac        0.9    comp.sys.mac.hardware      11.0    -0.15080978041551327    4.0     -0.08952300323041343
> > body=apple      0.9    comp.sys.mac.hardware      2.0     -0.08917827214495963    12.0    -0.08678780456560442
> > body=god        0.9    soc.religion.christian     16.0    -0.31170158967758943    12.0    -0.23551822033335715
> >
> >
> > There was a bug introduced into the ModelDissector that caused it to show
> > you the least interesting features rather than the most interesting.
> >
> > Patch is forthcoming.
> >
> > On Tue, Dec 28, 2010 at 1:05 PM, Chris Schilling
> > <[email protected]> wrote:
> >
> >> Hey Ted,
> >>
> >> I went back in time a bit and found a version which returned reasonable
> >> looking results (at least results which are comparable to those in the
> >> book).  I ran 'svnversion .' and the older (apparently working) version
> >> returned 1004406M whereas the trunk version I am using is 1050223M.  In
> >> any case, the files in core/o.a.m.classifier.sgd are dated October 4th.
> >> It looks like between the 4th of Oct and December 7th there was some
> >> refactoring going on.  For instance, the encoders were moved to the
> >> vectors package (as opposed to the vectorizer.encoders package).  I spent
> >> a little time comparing diffs in the core sgd package but not enough time
> >> to discover what could be causing this behavior.
> >>
> >> I hope this helps.
> >> Chris
> >>
> >>
> >>
> >> On Dec 20, 2010, at 4:25 PM, Ted Dunning wrote:
> >>
> >>> Yeah... it looks like I really need to jump into this.  These results
> >>> are not right.
> >>>
> >>> On Mon, Dec 20, 2010 at 2:11 PM, Chris Schilling <[email protected]> wrote:
> >>>
> >>>> Hey Ted,
> >>>>
> >>>> Just FYI,
> >>>>
> >>>> I changed the Weight subclass of the ModelDissector to sort by true value
> >>>> (rather than absolute value) and reran over the 20 newsgroups data.  Here
> >>>> are the results of the dissector function:
> >>>>
> >>>> body=rt 0.042   comp.sys.mac.hardware
> >>>> body=computer   0.039   sci.electronics
> >>>> body=seem       0.035   talk.religion.misc
> >>>> body=mike       0.035   misc.forsale
> >>>> body=windows    0.034   misc.forsale
> >>>> body=just       0.032   sci.crypt
> >>>> body=supports   0.032   talk.politics.mideast
> >>>> body=x  0.032   talk.religion.misc
> >>>> body=do 0.029   rec.motorcycles
> >>>> body=university 0.028   comp.sys.mac.hardware
> >>>> body=slagle     0.028   rec.sport.hockey
> >>>>
> >>>> I prefer the results from MIA :)  Anyway, I know you are busy.  If there
> >>>> is anything I can do to help, let me know.  Still getting familiar with
> >>>> the code, but could help out with some guidance.
> >>>>
> >>>> Thanks a lot,
> >>>> Chris
> >>>>
> >>>> On Dec 17, 2010, at 7:37 PM, Ted Dunning wrote:
> >>>>
> >>>>> Hard to say what changed just off hand.  I was tweaking the SGD code
> >>>>> pretty regularly as I learned from the results users were getting.  I
> >>>>> should look at the history to review what happened... some changes may
> >>>>> not have been good.
> >>>>>
> >>>>> On Fri, Dec 17, 2010 at 5:28 PM, Chris Schilling
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>>> Thanks for the answers Ted.  I'll take a look inside the dissector.  I
> >>>>>> was just wondering because the results are quite a bit different from
> >>>>>> what's in the book - Listing 15.9.  Here are those results (where words
> >>>>>> have weights > 1).
> >>>>>>
> >>>>>> body=space 2.1 sci.space
> >>>>>> body=sale 1.9 misc.forsale
> >>>>>> body=car 1.9 rec.autos
> >>>>>> body=windows 1.8 comp.os.ms-windows.misc
> >>>>>> body=mac 1.7 comp.sys.mac.hardware
> >>>>>> body=bike 1.7 rec.motorcycles
> >>>>>> body=apple 1.5 comp.sys.mac.hardware
> >>>>>> body=gun 1.5 talk.politics.guns
> >>>>>> body=baseball 1.5 rec.sport.baseball
> >>>>>> body=graphics 1.5 comp.graphics
> >>>>>>
> >>>>>>
> >>>>>> I guess I mostly want to understand what changed.  Again, I'll take a
> >>>>>> look at the dissector, because the results of the training look pretty
> >>>>>> good.
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
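
For anyone following along: the ordering problem discussed in the quoted
thread (the ModelDissector listing the least interesting features instead of
the most interesting, and the experiment with sorting by true value rather
than absolute value) can be illustrated with a small Python sketch.  This is
not Mahout code and not the patch itself.  The Weight tuple and the function
names are hypothetical, and the body=example entry is made up so that the
list contains a large negative weight; the other weights are copied from the
listings quoted in the thread.

```python
from typing import List, NamedTuple


class Weight(NamedTuple):
    """Hypothetical stand-in for a (feature, category, weight) summary row."""
    feature: str
    category: str
    value: float


# Weights copied from the quoted listings, plus one made-up negative entry.
WEIGHTS = [
    Weight("body=just", "sci.crypt", 0.032),
    Weight("body=space", "sci.space", 2.1),
    Weight("body=do", "rec.motorcycles", 0.029),
    Weight("body=example", "example.group", -1.2),  # hypothetical entry
    Weight("body=sale", "misc.forsale", 1.9),
]


def most_interesting(weights: List[Weight], n: int) -> List[Weight]:
    """Largest absolute weights first: the intended ordering."""
    return sorted(weights, key=lambda w: abs(w.value), reverse=True)[:n]


def by_true_value(weights: List[Weight], n: int) -> List[Weight]:
    """Largest signed weights first: strongly negative features drop out."""
    return sorted(weights, key=lambda w: w.value, reverse=True)[:n]


def least_interesting(weights: List[Weight], n: int) -> List[Weight]:
    """Smallest absolute weights first: what an inverted comparison yields."""
    return sorted(weights, key=lambda w: abs(w.value))[:n]


if __name__ == "__main__":
    for label, fn in [
        ("most interesting", most_interesting),
        ("by true value", by_true_value),
        ("least interesting", least_interesting),
    ]:
        print(label, [w.feature for w in fn(WEIGHTS, 3)])
```

An inverted comparison still returns a well-formed list of features, which is
why the symptom shows up as implausible words with tiny weights rather than as
an error.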
