Frank, did you publish it in the PetitParser repo? We should not lose that.
Stef

>>>>
>>> Didn't check the code, just the tally, and I think that
>>> PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>digitsBase: is begging
>>> for optimization. It's probably also the cause of the high amount of garbage
>>> which causes a significant amount of time spent with garbage collection.
>>> It's also interesting that the finalization process does so much work,
>>> there may be something wrong with your image.
>>
>> Thanks for taking a look, Levente.
>>
>> I'd expect digitsBase: to dominate the running costs, given that we're
>> parsing numbers.
>>
>> I do make a large number of throwaway "immutable" values with a
>> Builder-like pattern... in
>> PPSmalltalkNumberParser >> #makeNumberFrom:base:. That, I would imagine,
>> could explain the garbage?
>>
>> If I may, what do you look for when reading the MessageTally? How do
>> you tell, for instance, that there's excessive garbage production?
>> That the incremental GCs take 7ms? (I'm reading Andreas' comments on
>> http://wiki.squeak.org/squeak/4210 again.)
>
> Levente, you're quite right: #digitsBase: has now been optimised even
> more, reducing the time taken to run my benchmark
>
> MessageTally spyOn: [Time millisecondsToRun: [100000 timesRepeat:
> [PPSmalltalkNumberParser parse: '1234567890']]]
>
> from ~32 seconds to ~16 seconds. (Memoising was the answer:
> #digitsBase: is effectively a higher-order production and, like OMeta,
> PPCompositeParser doesn't memoise those. A simple class var dictionary
> solves that problem.)
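A rough illustration of the memoisation described above, written in Python rather than Smalltalk so it can be run standalone. It assumes the expensive step is building a digit parser for a given base each time the production is requested. The names `NumberGrammar`, `digits_base` and `builds` are hypothetical and are not taken from the PetitParser code; the class-level dictionary stands in for the "class var dictionary".

```python
import string


class NumberGrammar:
    # Class-level cache keyed by base: the analogue of a class var dictionary.
    _digits_cache = {}
    # Counts how many parsers were actually constructed (for illustration only).
    builds = 0

    @classmethod
    def digits_base(cls, base):
        """Return the digit parser for `base`, building it at most once."""
        if base not in cls._digits_cache:
            cls._digits_cache[base] = cls._build_digits_parser(base)
        return cls._digits_cache[base]

    @classmethod
    def _build_digits_parser(cls, base):
        cls.builds += 1
        alphabet = (string.digits + string.ascii_uppercase)[:base]
        value_of = {ch: i for i, ch in enumerate(alphabet)}

        def parse(text):
            """Parse leading base-`base` digits; return (value, consumed) or None."""
            value = 0
            consumed = 0
            for ch in text:
                if ch not in value_of:
                    break
                value = value * base + value_of[ch]
                consumed += 1
            if consumed == 0:
                return None
            return value, consumed

        return parse
```

Requesting `NumberGrammar.digits_base(10)` repeatedly returns the same parser object, so the construction cost (and the garbage it produces) is paid once per base instead of once per parse.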
>
> frank
>
>> frank
>>
>>> Levente
>>>
>>>>
>>>> frank
>>>>
>>>> On 14 September 2011 20:26, Frank Shearar <frank.shea...@gmail.com> wrote:
>>>>>
>>>>> On 3 September 2011 19:35, Nicolas Cellier
>>>>> <nicolas.cellier.aka.n...@gmail.com> wrote:
>>>>>>
>>>>>> 2011/9/3 Frank Shearar <frank.shea...@gmail.com>:
>>>>>>>
>>>>>>> On 3 September 2011 18:50, Lukas Renggli <reng...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I think it is a good idea to have the number parser separate, after
>>>>>>>> all it might also make sense to use it separately.
>>>>>>>>
>>>>>>>> It seems that the new Smalltalk grammar is significantly slower. The
>>>>>>>> benchmark PPSmalltalkClassesTests class>>#benchmark: that uses the
>>>>>>>> source code of the collection hierarchy and does not especially target
>>>>>>>> number literals runs 30% slower.
>>>>>>>>
>>>>>>>> Also I see that "Number readFrom: ..." is still used within the
>>>>>>>> grammar. This seems to be a bit strange, no?
>>>>>>>
>>>>>>> Yes: it's a double-parse, which is a bit lame. First, we parse the
>>>>>>> literal with PPSmalltalkNumberParser, which ensures that the thing
>>>>>>> given to Number class >> #readFrom: is a well-formed token (so, in
>>>>>>> particular, Squeak's Number doesn't get to see anything other than a
>>>>>>> well-formed token).
>>>>>>>
>>>>>>> It sounds like you're happy with the basic concept, so maybe I should
>>>>>>> remove the Number class >> #readFrom: stuff, see if I can't remove the
>>>>>>> performance issues, and resubmit the patch.
>>>>>>>
>>>>>>> frank
>>>>>>>
>>>>>>
>>>>>> Yes, a NumberParser is essentially parsing, and this duplication sounds
>>>>>> useless.
>>>>>> The main feature of interest in NumberParser that I consider a
>>>>>> requirement and should find its equivalent in a PetitNumberParser is:
>>>>>> - round a decimal representation to nearest Float
>>>>>> It's simple, just convert a Fraction asFloat in a single final step to
>>>>>> avoid accumulating round-off errors - see
>>>>>> #makeFloatFromMantissa:exponent:base:
>>>>>>
>>>>>> The second feature of interest in NumberParser is the ability to
>>>>>> parse LargeInteger efficiently by avoiding (10 * largeValue +
>>>>>> digitValue) loops, and replacing them with a log(n) cost.
>>>>>> This would be a simple thing to implement in a functional language.
>>>>>
>>>>> Hopefully this won't offend your sensibilities too much :). It does,
>>>>> in fact, use 10* loops - I wrote an experimental "front half * rear
>>>>> half" recursion, which was slower in my benchmarks.
>>>>>
>>>>> This version has the grammar and parser doing no string->number
>>>>> conversion at all. PPSmalltalkNumberMaker supplies a number of utility
>>>>> methods designed to stop one from making malformed numbers. It also
>>>>> supplies a builder interface that the parser uses to construct
>>>>> numbers.
>>>>>
>>>>> frank
>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>>>> Lukas
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3 September 2011 17:18, Frank Shearar <frank.shea...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On 3 September 2011 15:56, Lukas Renggli <reng...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On 3 September 2011 16:51, Frank Shearar <frank.shea...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Lukas,
>>>>>>>>>>>
>>>>>>>>>>> I haven't :) mainly because I'm unsure where to put it - is there
>>>>>>>>>>> perhaps a PP Inbox, or shall I just post the merged version, or
>>>>>>>>>>> what's your preference? (How about an mcd between my merge and
>>>>>>>>>>> PP's head?)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just put the .mcz at some public URL (dropbox, squeak source, ...)
>>>>>>>>>> or attach it to a mail.
>>>>>>>>>
>>>>>>>>> Ah, great - here it is. You'll see I've written the grammar as a
>>>>>>>>> separate class. That was really more to make what I'd done more
>>>>>>>>> obvious and to minimise the change to PPSmalltalkGrammar, but perhaps
>>>>>>>>> it's not a bad idea anyway: it's easy to see the number literal
>>>>>>>>> subgrammar.
>>>>>>>>>
>>>>>>>>> frank
>>>>>>>>>
>>>>>>>>>> Lukas
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Lukas Renggli
>>>>>>>>>> www.lukas-renggli.ch
>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Lukas Renggli
>>>>>>>> www.lukas-renggli.ch
>>>>>>>>
>
<PetitSmalltalk-fbs.63.mcz>
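
The first requirement raised in the thread (round a decimal representation to the nearest Float by converting an exact Fraction in a single final step) can be sketched as follows. This is a Python analogue for illustration only, not the implementation of #makeFloatFromMantissa:exponent:base:; the function name `make_float` and its parameters are assumptions. It relies on Python's `Fraction` staying exact until the one conversion to `float` at the end.

```python
from fractions import Fraction


def make_float(mantissa: int, exponent: int, base: int = 10) -> float:
    """Return the float nearest to mantissa * base**exponent.

    All arithmetic is done on exact rationals; rounding happens only once,
    in the final conversion, so no intermediate round-off can accumulate.
    """
    exact = Fraction(mantissa) * Fraction(base) ** exponent
    return float(exact)
```

For example, `make_float(12345, -9)` builds the exact rational 12345/1000000000 before rounding, whereas repeatedly multiplying a float by 0.1 would round at every step.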