Thanks for noticing. I will commit the patch shortly.
On Mon, Jan 24, 2011 at 1:56 PM, Chris Schilling <[email protected]> wrote:

> Thanks a lot Ted!  I haven't had too much time to investigate.  I
> appreciate the patch.
>
> Chris
>
> On Jan 23, 2011, at 10:38 PM, Ted Dunning wrote:
>
> > Chris,
> >
> > This looks better:
> >
> > 0.58 189545.00 63692.00 3551.55 1.0000002e-08 1.0007058e-08 3000 -1.447 69.79 none
> > body=gun 1.3 talk.politics.guns 16.0 -0.2414000554631249 17.0 -0.129711508160663
> > body=windows 1.2 comp.os.ms-windows.misc 3.0 -0.19417418927173208 14.0 -0.16734214222498917
> > body=sale 1.2 misc.forsale 1.0 -0.141236520301033 4.0 -0.13078372920403203
> > body=car 1.2 rec.autos 13.0 -0.15182947211484465 10.0 -0.12953193026882154
> > body=bike 1.1 rec.motorcycles 6.0 -0.1188138958409118 5.0 -0.10294447109840156
> > body=x 1.1 comp.windows.x 0.0 0.16208435089363293 13.0 -0.1515644646592169
> > body=israel 1.0 talk.politics.mideast 15.0 -0.14829080862283103 18.0 -0.12856657991660764
> > body=space 1.0 sci.space 4.0 -0.13411179253228006 15.0 -0.13059284295356896
> > body=mac 0.9 comp.sys.mac.hardware 11.0 -0.15080978041551327 4.0 -0.08952300323041343
> > body=apple 0.9 comp.sys.mac.hardware 2.0 -0.08917827214495963 12.0 -0.08678780456560442
> > body=god 0.9 soc.religion.christian 16.0 -0.31170158967758943 12.0 -0.23551822033335715
> >
> > There was a bug introduced into the ModelDissector that caused it to
> > show you the least interesting features rather than the most
> > interesting.
> >
> > Patch is forthcoming.
> >
> > On Tue, Dec 28, 2010 at 1:05 PM, Chris Schilling <[email protected]> wrote:
> >
> >> Hey Ted,
> >>
> >> I went back in time a bit and found a version which returned
> >> reasonable looking results (at least results which are comparable to
> >> those in the book).  I ran 'svnversion .' and the older (apparently
> >> working) version returned 1004406M whereas the trunk version I am
> >> using is 1050223M.  In any case, the files in
> >> core/o.a.m.classifier.sgd are dated October 4th.  It looks like
> >> between the 4th of Oct and December 7th there was some refactoring
> >> going on.  For instance, the encoders were moved to the vectors
> >> package (as opposed to the vectorizer.encoders package).  I spent a
> >> little time comparing diffs in the core sgd package but not enough
> >> time to discover what could be causing this behavior.
> >>
> >> I hope this helps.
> >> Chris
> >>
> >> On Dec 20, 2010, at 4:25 PM, Ted Dunning wrote:
> >>
> >>> Yeah... it looks like I really need to jump into this.  These
> >>> results are not right.
> >>>
> >>> On Mon, Dec 20, 2010 at 2:11 PM, Chris Schilling <[email protected]> wrote:
> >>>
> >>>> Hey Ted,
> >>>>
> >>>> Just FYI,
> >>>>
> >>>> I changed the Weight subclass of the ModelDissector to sort by
> >>>> true value (rather than absolute value) and reran over the 20
> >>>> newsgroups data.  Here are the results of the dissector function:
> >>>>
> >>>> body=rt 0.042 comp.sys.mac.hardware
> >>>> body=computer 0.039 sci.electronics
> >>>> body=seem 0.035 talk.religion.misc
> >>>> body=mike 0.035 misc.forsale
> >>>> body=windows 0.034 misc.forsale
> >>>> body=just 0.032 sci.crypt
> >>>> body=supports 0.032 talk.politics.mideast
> >>>> body=x 0.032 talk.religion.misc
> >>>> body=do 0.029 rec.motorcycles
> >>>> body=university 0.028 comp.sys.mac.hardware
> >>>> body=slagle 0.028 rec.sport.hockey
> >>>>
> >>>> I prefer the results from MIA :)  Anyway, I know you are busy.  If
> >>>> there is anything I can do to help, let me know.  Still getting
> >>>> familiar with the code, but could help out with some guidance.
> >>>>
> >>>> Thanks a lot,
> >>>> Chris
> >>>>
> >>>> On Dec 17, 2010, at 7:37 PM, Ted Dunning wrote:
> >>>>
> >>>>> Hard to say what changed just off hand.  I was tweaking the SGD
> >>>>> code pretty regularly as I learned from the results users were
> >>>>> getting.  I should look at the history to review what
> >>>>> happened... some changes may not have been good.
> >>>>>
> >>>>> On Fri, Dec 17, 2010 at 5:28 PM, Chris Schilling <[email protected]> wrote:
> >>>>>
> >>>>>> Thanks for the answers Ted.  I'll take a look inside the
> >>>>>> dissector.  I was just wondering because the results are quite
> >>>>>> a bit different from what's in the book - Listing 15.9.  Here
> >>>>>> are those results (where words have weights > 1).
> >>>>>>
> >>>>>> body=space 2.1 sci.space
> >>>>>> body=sale 1.9 misc.forsale
> >>>>>> body=car 1.9 rec.autos
> >>>>>> body=windows 1.8 comp.os.ms-windows.misc
> >>>>>> body=mac 1.7 comp.sys.mac.hardware
> >>>>>> body=bike 1.7 rec.motorcycles
> >>>>>> body=apple 1.5 comp.sys.mac.hardware
> >>>>>> body=gun 1.5 talk.politics.guns
> >>>>>> body=baseball 1.5 rec.sport.baseball
> >>>>>> body=graphics 1.5 comp.graphics
> >>>>>>
> >>>>>> I guess I mostly want to understand what changed.  Again, I'll
> >>>>>> take a look at the dissector, because the results of the
> >>>>>> training look pretty good.
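
For anyone reading this thread later: the ranking question discussed above (largest absolute weight versus signed weight, and most versus least interesting features) can be illustrated with a small sketch. This is Python, not Mahout's Java ModelDissector code. The helper `top_features` and the weights are hypothetical, made up for illustration. An inverted sort order is one way a report could end up listing the least interesting features; the thread does not say that this was the actual bug.

```python
import heapq

# Hypothetical feature weights, for illustration only (not from a real model).
weights = {
    "body=gun": -1.3,
    "body=windows": 1.2,
    "body=the": 0.01,
    "body=rt": 0.042,
    "body=space": -1.0,
}

def top_features(weights, n, key=abs, largest=True):
    """Return n feature names ranked by key(weight).

    key=abs ranks by magnitude; largest=False inverts the order.
    """
    pick = heapq.nlargest if largest else heapq.nsmallest
    ranked = pick(n, weights.items(), key=lambda kv: key(kv[1]))
    return [name for name, _ in ranked]

# Largest magnitude first: the features that move the model the most.
most_interesting = top_features(weights, 3)

# Inverted order: near-zero weights float to the top of the report.
least_interesting = top_features(weights, 3, largest=False)

# Signed ordering: strongly negative weights drop out of the top of the list.
by_signed_value = top_features(weights, 3, key=lambda w: w)

print(most_interesting)
print(least_interesting)
print(by_signed_value)
```

In this toy data, ranking by signed value or in inverted order both push tiny weights such as `body=the` and `body=rt` into the report, which resembles the small 0.03 to 0.04 weights in the December listing.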
