I should rephrase one thing. Our current product *started out* a lot like that. It wasn't good enough for the Googles of the world, so it started to grow hair. We're looking at a statistical retread because the hair gets harder and harder to comb.
On Sun, Feb 14, 2010 at 4:47 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Benson,
>
> One more thing. I forget the actual reference, but the best Chinese
> segmenter that I have seen in practice (whose name I forget) was able to get
> away with a simple unweighted lexicon and 2-3 word look-ahead + average word
> length for score. This indicates to me that you can depth bound your beam
> search and turn it into an exhaustive search. The lesson of their success
> is that garden path sentences (with regard to segmentation) are rare in
> Chinese.
>
> On Sun, Feb 14, 2010 at 10:29 AM, Benson Margulies
> <bimargul...@gmail.com> wrote:
>
>> Ted, thanks very much.
>>
>> Thoughts in response to both of your messages:
>>
>> 1: alpha-beta is being used here in the sense of E+M. Or, to be
>> specific, alpha is the path sum from the beginning to the current
>> 'time', and beta is the path sum from the current 'time' to the end.
>>
>> 2: I had read about that 'at the margin' idea and completely forgotten
>> it. My starting point here is Miller and Guiness (one of whom used to
>> work with me and the other of whom still does). They didn't report,
>> and perhaps didn't measure, whether the examples selected via that
>> 'gamma' calculation had high error rates (far from the margin?) or low
>> error rates (close to the margin). They just observed that
>>
>> 3: A scatter-plot looks like what the doctor ordered.
>>
>> 4: That paper is new to me. My stack of papers in this neighborhood is
>> Collins, Miller + Guiness, Crammer (on Passive-Aggressive) and the
>> Oxford paper on segmentation. Thanks for the pointer.
>>
>> --benson
>>
>> On Sat, Feb 13, 2010 at 11:18 PM, Ted Dunning <ted.dunn...@gmail.com>
>> wrote:
>> > Benson,
>> >
>> > Are you using techniques related to this:
>> > http://www.it.usyd.edu.au/~james/pubs/pdf/dlp07perc.pdf ?
>> >
>> > On Sat, Feb 13, 2010 at 9:38 AM, Benson Margulies
>> > <bimargul...@gmail.com> wrote:
>> >
>> >> Folks,
>> >>
>> >> Here's one of my occasional questions in which I am, in essence,
>> >> bartering my code wrangling efforts for expertise on hard stuff.
>> >>
>> >> Consider a sequence problem addressed with a perceptron model with an
>> >> ordinary Viterbi decoder. There's a standard confidence estimation
>> >> technique borrowed from HMMs: calculate gamma = alpha + beta for each
>> >> state, take the difference of the gammas for the best and second best
>> >> hypothesis for each column of the trellis, and take argmin of them as
>> >> the overall confidence of the decode. (+, of course, because in a
>> >> perceptron we're summing feature weights, not multiplying
>> >> probabilities.)
>
> --
> Ted Dunning, CTO
> DeepDyve
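
[Editor's note: a minimal sketch of the gamma-based confidence estimate Benson describes, assuming an additive perceptron-scored linear-chain trellis. The emit and trans score arrays and the function name gamma_confidence are illustrative placeholders, not part of the project under discussion.]

    # Gamma-margin confidence for an additively scored trellis (perceptron-style).
    import numpy as np

    def gamma_confidence(emit, trans):
        """Return the minimum over trellis columns of the margin between the
        best and second-best complete-path scores (gamma) through each state."""
        T, S = emit.shape

        # alpha[t, s]: best path score from the start up to and including (t, s)
        alpha = np.full((T, S), -np.inf)
        alpha[0] = emit[0]
        for t in range(1, T):
            alpha[t] = np.max(alpha[t - 1][:, None] + trans, axis=0) + emit[t]

        # beta[t, s]: best path score from (t, s) to the end, excluding emit[t]
        beta = np.full((T, S), -np.inf)
        beta[T - 1] = 0.0
        for t in range(T - 2, -1, -1):
            beta[t] = np.max(trans + (emit[t + 1] + beta[t + 1])[None, :], axis=1)

        # gamma[t, s]: score of the best complete path forced through state s
        # at time t (sums rather than products, because these are feature weights)
        gamma = alpha + beta

        # Per-column margin: best gamma minus second-best gamma
        top2 = np.sort(gamma, axis=1)[:, -2:]
        margins = top2[:, 1] - top2[:, 0]

        # Overall confidence of the decode: the smallest per-column margin
        return margins.min()

    # Toy usage with random scores: 3 time steps, 4 states
    rng = np.random.default_rng(0)
    print(gamma_confidence(rng.normal(size=(3, 4)), rng.normal(size=(4, 4))))

The per-column margin is the drop in total path score incurred by forcing the decoder off its best state at that position; the smallest such drop over the sequence marks the weakest point of the decode and serves as its overall confidence.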