Re: [fpc-pascal] How to split file of whitespace separated numbers?
2016-12-23 15:27 GMT-03:00 Marco van de Voort:
> In our previous episode, Graeme Geldenhuys said:
> > For many other things, plain code could be faster, but often a lot more
> > effort and time consuming to implement. Whereas you could have written
> > a regex expression in under 10 seconds and accomplished the same task in 8
> > lines of code or less - very little effort required.
>
> Writing or even worse, reading/debugging regex is about the most intensive
> effort there is IMHO.

I agree that regex carries extra mental overhead. This is why I kept away from it for a long time. Early this year I needed to use it in one of my projects, so I decided to bite the bullet and read the book Mastering Regular Expressions. Once you understand the reasoning behind regex, it's a lot less intimidating. These days I use it occasionally.

By coincidence, yesterday I was writing code to parse raw text and extract some data. Initially I did it by hand, but when I needed to extract a new field I realized things would get even worse, so I rewrote it with regex. See the diff here: https://www.diffchecker.com/NDDa9gpH

IMO it is much better. I am not saying that it is easy or that it should be used everywhere, but once you learn the basics, regex is a valuable tool. For debugging I use http://regexr.com/ and rely on unit tests to ensure correctness.

Luiz
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 24.12.2016 12:53, "Mark Morgan Lloyd" <markmll.fpc-pas...@telemetry.co.uk> wrote:
> On 24/12/16 11:30, Lars wrote:
>> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>>> Writing or even worse, reading/debugging regex is about the most
>>>> intensive effort there is IMHO.
>>>
>>> So is standard programming code - if you don't know the syntax or how it
>>> works. ;-) Also the reason why I posted a couple of links to regex sites
>>> to get the original poster started (in case he doesn't know regex). Here
>>> is another link (by the author of EditPad Pro), who really knows his
>>> regex!
>>>
>>> http://www.regular-expressions.info/tutorial.html
>>
>> Next thing to do: implement PERL inside pascal programs, compiled in perl.
>> Then, realize, why you didn't originally want to go there ;-)
>
> Or even allow FPC to call Lua.

You realize that we already have language bindings for Lua somewhere? ;)

Regards,
Sven
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 24/12/16 11:30, Lars wrote:
> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>> Writing or even worse, reading/debugging regex is about the most
>>> intensive effort there is IMHO.
>>
>> So is standard programming code - if you don't know the syntax or how it
>> works. ;-) Also the reason why I posted a couple of links to regex sites
>> to get the original poster started (in case he doesn't know regex). Here
>> is another link (by the author of EditPad Pro), who really knows his
>> regex!
>>
>> http://www.regular-expressions.info/tutorial.html
>
> Next thing to do: implement PERL inside pascal programs, compiled in perl.
> Then, realize, why you didn't originally want to go there ;-)

Or even allow FPC to call Lua.

I know this is rare and probably wouldn't happen outside the "season of goodwill", but I actually agree with Graeme here: regexes are useful.

BUT FFS DOCUMENT WHAT YOU'RE DOING FOR PEOPLE WHO DON'T UNDERSTAND THEM!

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
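As an illustration of the point about documenting regexes, here is a minimal sketch of how the original splitting problem might look with the TRegExpr class from FPC's RegExpr unit, with the expression explained in comments. It was not posted in the thread; the program name, `SplitOnWhitespace`, and the sample line are hypothetical.

```pascal
program SplitWithRegExpr;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, RegExpr;

// Hypothetical helper: collects every whitespace-delimited token
// of ALine into ATokens.
procedure SplitOnWhitespace(const ALine: string; ATokens: TStrings);
var
  re: TRegExpr;
begin
  ATokens.Clear;
  re := TRegExpr.Create;
  try
    // \S+  = one or more characters that are NOT whitespace,
    //        i.e. one complete number such as -14.1483.
    // Matching the tokens rather than the gaps means leading blanks,
    // TABs and separators of any width need no special handling.
    re.Expression := '\S+';
    if re.Exec(ALine) then
      repeat
        ATokens.Add(re.Match[0]);   // Match[0] is the whole current match
      until not re.ExecNext;
  finally
    re.Free;
  end;
end;

var
  tokens: TStringList;
  i: Integer;
begin
  tokens := TStringList.Create;
  try
    // Sample line with leading spaces, a TAB and uneven gaps (made up).
    SplitOnWhitespace('   0.000  0.000' + #9 + '7.000    29.6628', tokens);
    for i := 0 to tokens.Count - 1 do
      WriteLn(tokens[i]);
  finally
    tokens.Free;
  end;
end.
```

For a file of half a million lines it would probably be better to create the TRegExpr instance once outside the loop instead of once per line, as this sketch does for brevity.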
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
> On 2016-12-23 18:27, Marco van de Voort wrote:
>
>> Writing or even worse, reading/debugging regex is about the most
>> intensive effort there is IMHO.
>
> So is standard programming code - if you don't know the syntax or how it
> works. ;-) Also the reason why I posted a couple of links to regex sites
> to get the original poster started (in case he doesn't know regex). Here
> is another link (by the author of EditPad Pro), who really knows his
> regex!
>
> http://www.regular-expressions.info/tutorial.html

Next thing to do: implement PERL inside pascal programs, compiled in perl.
Then, realize, why you didn't originally want to go there ;-)
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 23 Dec 2016 05:15, "Bo Berglund" wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?

Unit strutils: WordCount + ExtractWord, or ExtractSubstr in a loop.

http://www.freepascal.org/docs-html/rtl/strutils/extractsubstr.html

Luiz

> I have to create a function to process a number of big data files
> where numbers are stored in lines of 4-6 values using whitespace
> in between. First I got a sample looking like this:
> {code}
> 0.4167    0.3636    -14.1483    227.2260
> {code}
> Here the separators were 4 spaces, so on each line I used (slDecode is
> a TStringList):
> {code}
> sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
> slDecode.Text := sLine;
> {code}
> Worked fine, if a bit slow...
> The stringlist items are then passed to a string to float function and
> stored into a dynamic array.
>
> But then it failed on a file containing lines like this:
> {code}
>    0.000    0.000    7.000    0.000  29.6628
> {code}
> Here there are 3 leading spaces, plus one separator is only 2 spaces
> wide. So I had to modify the code:
> {code}
> sLine := Trim(sLine);
> sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
> sLine := StringReplace(sLine, '  ', #13, [rfReplaceAll]);
> slDecode.Text := sLine;
> {code}
> This works in this case, but now I realize I need something better,
> which can deal with a varying number of whitespace chars in between
> the numbers.
> The test files are very big, like half a million lines and up, so I
> cannot introduce a lot of code in the loop since processing time will
> increase.
>
> Is there any good and quick way to extract real data from a space
> separated list without knowing beforehand the size of the whitespace
> separators?
> I guess that my next sample problem will be a file with TAB rather
> than space, or even mixed TAB and space...
>
> --
> Bo Berglund
> Developer in Sweden
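The WordCount + ExtractWord suggestion above might look roughly like this. It is a minimal sketch rather than code from the thread: the program name, `LineToDoubles`, the `Whitespace` constant, and the sample line are assumptions made for illustration, and it assumes the data always uses '.' as the decimal separator.

```pascal
program SplitWithStrUtils;
{$mode objfpc}{$H+}

uses
  SysUtils, StrUtils;

type
  TDoubleArray = array of Double;

const
  // Space and TAB both count as separators; a run of them acts as one.
  Whitespace: TSysCharSet = [' ', #9];

// Hypothetical helper: converts one line of whitespace-separated
// numbers into a dynamic array of Double.
function LineToDoubles(const ALine: string): TDoubleArray;
var
  i, n: Integer;
  fs: TFormatSettings;
begin
  Result := nil;
  fs := DefaultFormatSettings;
  fs.DecimalSeparator := '.';   // assume '.' in the data, whatever the locale
  n := WordCount(ALine, Whitespace);
  SetLength(Result, n);
  // ExtractWord numbers the words starting at 1.
  for i := 1 to n do
    Result[i - 1] := StrToFloat(ExtractWord(i, ALine, Whitespace), fs);
end;

var
  values: TDoubleArray;
  i: Integer;
begin
  // Leading spaces, separators of different widths and one TAB.
  values := LineToDoubles('   0.000  0.000    7.000' + #9 + '0.000   29.6628');
  for i := 0 to High(values) do
    WriteLn(values[i]:0:4);
end.
```

ExtractWord rescans the line from its start on every call, which is acceptable for lines of 4-6 values but worth measuring on files with half a million lines.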
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, December 23, 2016 4:49 am, Howard Page-Clark wrote:
> On 23/12/16 08:14, Bo Berglund wrote:
>
>> Is there a quick way to split a string of whitespace separated values
>> into the separate members?
> It is possible that a custom string parser (something along these lines)
> might improve your processing speed:
>
> type TDoubleArray = array of Double;
>
>
> function StrToDblArray(const aString: string): TDoubleArray;
> var c: Char;

And as soon as Char is involved, Unicode gets screwed up.

Am I right, am I right...

But if he is not dealing with any Unicode data and it is all simple chars, it should be okay.
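The quoted parser is cut off in this reply, so the following is only a sketch of what a single-pass scanner along those lines could look like, not a reconstruction of the original `StrToDblArray`. The name `ParseDoubles` and the initial capacity of 8 are assumptions. On the Char concern: the scanner only compares against space, TAB, CR and LF, and in UTF-8 those byte values never occur inside a multi-byte sequence, so byte-wise scanning of UTF-8 input should be safe for this purpose.

```pascal
program SplitByScanning;
{$mode objfpc}{$H+}

type
  TDoubleArray = array of Double;

// Hypothetical helper, not the function from the quoted message.
// Tokens that are not valid numbers are silently skipped.
function ParseDoubles(const ALine: string): TDoubleArray;
var
  p, start, len, count, code: Integer;
  value: Double;
begin
  Result := nil;
  SetLength(Result, 8);          // room for a typical line; grows if needed
  count := 0;
  len := Length(ALine);
  p := 1;
  while p <= len do
  begin
    // Skip a run of separators (space, TAB, LF, CR).
    while (p <= len) and (ALine[p] in [' ', #9, #10, #13]) do
      Inc(p);
    if p > len then
      Break;
    // Advance to the end of the current token.
    start := p;
    while (p <= len) and not (ALine[p] in [' ', #9, #10, #13]) do
      Inc(p);
    // Val always expects '.' as the decimal separator, regardless of
    // locale; code <> 0 signals a conversion error.
    Val(Copy(ALine, start, p - start), value, code);
    if code = 0 then
    begin
      if count = Length(Result) then
        SetLength(Result, count * 2);
      Result[count] := value;
      Inc(count);
    end;
  end;
  SetLength(Result, count);      // trim to the number of values found
end;

var
  values: TDoubleArray;
  i: Integer;
begin
  values := ParseDoubles('   0.000  0.000    7.000' + #9 + '0.000   29.6628');
  for i := 0 to High(values) do
    WriteLn(values[i]:0:4);
end.
```

Whether this is actually faster than the StringReplace/TStringList approach from the original question would need to be measured on the real files.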