Re: [fpc-pascal] How to split file of whitespace separated numbers?
Hello Bo,

You are right. If you want to keep the numbers of one line together, then READ doesn't work for you.

Markus

On 31.12.2016 at 12:18, Bo Berglund wrote:
> On Sat, 31 Dec 2016 11:27:53 +0100, greim wrote:
>> Hello Bo,
>> please try the simple
>> READ(myfile, x);
>> as mentioned in my posting from 23 Dec.
>> It works w/o any postprocessing.
>
> I saw that but the problem is that there are varying number of items
> on each line and a line is considered a record. So I would anyway have
> to keep track of the lines.
> With Read I guess it skips all whitespace including line endings and
> this causes problems in identifying the content because of varying
> number of items on each line.
> The Readln approach followed by splitting in a stringlist is enough of
> an improvement that I can use it.

___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Sat, 31 Dec 2016 11:30:26 -0300, Luiz Americo Pereira Camara wrote:
>2016-12-31 8:18 GMT-03:00 Bo Berglund:
>
>> On Sat, 31 Dec 2016 11:27:53 +0100, greim
>>
>> The Readln approach followed by splitting in a stringlist is enough of
>> an improvement that I can use it.
>
>Did you look at StrUtils.ExtractSubStr?
>
>Should be faster than TStrings;
>
>PosWord := 1;
>Word := ExtractSubstr(Line, PosWord, [' ']);
>while (Word <> '') do
>begin
>  //do work
>  Word := ExtractSubstr(Line, PosWord, [' ']);
>end;

Thanks, I did not know about this...

Unfortunately the function is not part of the Delphi StrUtils unit, and I need my solution to be usable in Delphi since the bulk of the program is probably too difficult to port to FPC. It contains a lot of graphics code that uses an outdated version of GLScene, which I have had to make some minimal patches to in order to port the program up to XE5. But I feel a port to FPC would tax my abilities too much because of the 3rd party stuff that has been used...

It was a good suggestion, though! And I see that the delimiter argument can be set to StdWordDelims, which takes care of virtually all whitespace stuff. I will use it whenever I code in FPC and have this problem.

--
Bo Berglund
Developer in Sweden
Re: [fpc-pascal] How to split file of whitespace separated numbers?
2016-12-31 8:18 GMT-03:00 Bo Berglund:
> On Sat, 31 Dec 2016 11:27:53 +0100, greim
>
> The Readln approach followed by splitting in a stringlist is enough of
> an improvement that I can use it.

Did you look at StrUtils.ExtractSubStr?

Should be faster than TStrings;

PosWord := 1;
Word := ExtractSubstr(Line, PosWord, [' ']);
while (Word <> '') do
begin
  //do work
  Word := ExtractSubstr(Line, PosWord, [' ']);
end;

Luiz
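A note on the loop above: it stops as soon as ExtractSubstr returns an empty string, which may happen for a line that starts with blanks (as in Bo's second sample) before any number has been read. The following is only a sketch of one way around that, looping on the position instead of on the token. The program name, the Blanks constant and the sample line are made up for illustration, and the exact handling of repeated delimiters inside ExtractSubstr is an assumption, so empty tokens are simply skipped.

```pascal
program SplitTokens;
{$mode objfpc}{$H+}
uses
  SysUtils, StrUtils;

const
  // Assumption: only spaces and TABs separate the values.
  Blanks = [' ', #9];

var
  Line, Token: string;
  PosWord: Integer;
begin
  // Hypothetical sample line with leading blanks and uneven gaps.
  Line := '   0.000    0.000    7.000    0.000  29.6628';
  PosWord := 1;
  // Loop on the position, not on the token, so that an empty token
  // (leading or repeated blanks) does not end the scan early.
  while PosWord <= Length(Line) do
  begin
    Token := ExtractSubstr(Line, PosWord, Blanks);
    if Token <> '' then
      WriteLn('*', Token, '*');
  end;
end.
```

Since ExtractSubstr is FPC-only (as Bo notes for Delphi), this variant would not help on the Delphi side.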
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Sat, 31 Dec 2016 11:27:53 +0100, greim wrote:
>Hallo Bo,
>
>please try the simple
>
>READ(myfile, x);
>
>as mentioned in my posting from 23.dec.
>It works w/o any postprocessing.

I saw that but the problem is that there are varying number of items on each line and a line is considered a record. So I would anyway have to keep track of the lines.

With Read I guess it skips all whitespace including line endings and this causes problems in identifying the content because of varying number of items on each line.

The Readln approach followed by splitting in a stringlist is enough of an improvement that I can use it.

--
Bo Berglund
Developer in Sweden
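For what it is worth, plain Read can probably still be used when a line is a record, if the standard SeekEoln and SeekEof functions are used to detect the line boundaries. This is only a sketch under assumptions: the file name infile.txt is borrowed from Markus's earlier example, the program name and variables are invented, and blank lines are silently skipped because SeekEof jumps over them, so the counter numbers non-blank lines only.

```pascal
program ReadPerLine;
{$mode objfpc}{$H+}
var
  infile: Text;
  x: Double;
  lineNo, count: Integer;
begin
  Assign(infile, 'infile.txt');   // hypothetical input file
  Reset(infile);
  lineNo := 0;
  // SeekEof skips blanks and line endings before testing for end of file.
  while not SeekEof(infile) do
  begin
    Inc(lineNo);
    count := 0;
    // SeekEoln skips blanks and TABs, then reports whether the line has ended.
    while not SeekEoln(infile) do
    begin
      Read(infile, x);            // one value of the current record
      Inc(count);
    end;
    ReadLn(infile);               // consume the line ending
    WriteLn('Line ', lineNo, ': ', count, ' values');
  end;
  Close(infile);
end.
```

Whether this is faster than ReadLn plus a stringlist on files of several hundred MB has not been measured here.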
Re: [fpc-pascal] How to split file of whitespace separated numbers?
Hello Bo,

please try the simple

READ(myfile, x);

as mentioned in my posting from 23 Dec.
It works w/o any postprocessing.

Markus
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, 23 Dec 2016 11:53:58 +, Graeme Geldenhuys wrote:
>That problem is perfectly suited for regular expressions. And a rather
>simple one at that. The FPC's FCL packages include a regex unit too
>which should suit your needs.

I was away over Xmas so I have not seen all this regexp discussion for my problem until now.

In past times I have come across solutions using regular expressions, for example in shell scripts or similar. In most cases I saw that they worked but had a hard time understanding *how* they worked; the syntax is too dense for me.

The actual problem I had was that a data processing program, which I did not write myself, was taking an extremely long time just loading the input data file, so I was looking at alternate ways to read that data in.

The program was originally written in RAD Studio 2007 by someone else and I "ported" it to RAD Studio XE5 a few years ago. But I did not get into the working code, I just made the transfer to Unicode and updated the GUI. All processing code was untouched (except for changing string to ansistring where needed). It is a very math intensive processing package and I am no mathematician...

Anyway, the original author was no real coder but a scientist, so things like file I/O were not optimal. This shows up when reading the large actual data files, which can be hundreds of Mbytes in length. In his code it takes minutes to do! And this was the cause of my original question. Since it seemed rather general in nature I posted both here and in the Embarcadero forum...

Now I am down to just seconds using the ReadLn + StringList.DelimitedText way to parse the data.

My goal now is to simply create a utility that transforms these files into binary format instead and add code to load the data into dynamic float arrays. In the tests I did, the conversion took a few seconds, and once the binary files are created, loading the resulting binary data is done in fractions of a second.

So I am pretty much done with this problem (without resorting to regexp usage). Thanks anyway for pointing out an alternate path!

--
Bo Berglund
Developer in Sweden
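The binary round trip described above could look roughly like the sketch below. It is not Bo's actual utility: the file layout (an Int64 count followed by raw Double values), the routine names SaveDoubles and LoadDoubles, and the file name data.bin are all assumptions made for illustration. A flat layout like this also drops the per-line record boundaries, so a real converter would presumably store a count per record as well.

```pascal
program BinRoundTrip;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

type
  TDoubleArray = array of Double;

// Writes the element count, then the raw values (hypothetical layout).
procedure SaveDoubles(const FileName: string; const Data: TDoubleArray);
var
  fs: TFileStream;
  n: Int64;
begin
  fs := TFileStream.Create(FileName, fmCreate);
  try
    n := Length(Data);
    fs.WriteBuffer(n, SizeOf(n));
    if n > 0 then
      fs.WriteBuffer(Data[0], n * SizeOf(Double));
  finally
    fs.Free;
  end;
end;

// Reads the count back and loads all values with a single block read.
function LoadDoubles(const FileName: string): TDoubleArray;
var
  fs: TFileStream;
  n: Int64;
begin
  Result := nil;
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    fs.ReadBuffer(n, SizeOf(n));
    SetLength(Result, n);
    if n > 0 then
      fs.ReadBuffer(Result[0], n * SizeOf(Double));
  finally
    fs.Free;
  end;
end;

var
  a, b: TDoubleArray;
begin
  SetLength(a, 3);
  a[0] := 0.4167; a[1] := 0.3636; a[2] := -14.1483;
  SaveDoubles('data.bin', a);
  b := LoadDoubles('data.bin');
  WriteLn(Length(b), ' values loaded, last = ', b[High(b)]:0:4);
end.
```

The single block read is what makes loading fast: no text parsing happens at all, at the cost of a file that is only portable between machines with the same byte order.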
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-24 11:52, Mark Morgan Lloyd wrote:
> regexes are useful.

Very much so, and I’m far from being an expert. But I am finding more and more uses for them, the more I use them.

> BUT FFS DOCUMENT WHAT YOU'RE DOING FOR PEOPLE WHO DON'T UNDERSTAND THEM!

The same can be said for standard code too. At least with regex, there are plenty of tools that explain what they do, and what each part means.

For example:

http://rick.measham.id.au/paste/explain.pl?regex=%5Cb%5B-%2B%5D%3F%5B0-9%5D%2B%28%3F%3A%5C.%5B0-9%5D%2B%29%3F%5Cb

NODE        EXPLANATION
\b          the boundary between a word char (\w) and something
            that is not a word char
[-+]?       any character of: '-', '+' (optional (matching the
            most amount possible))
[0-9]+      any character of: '0' to '9' (1 or more times
            (matching the most amount possible))
(?:         group, but do not capture (optional (matching the
            most amount possible)):
  \.          '.'
  [0-9]+      any character of: '0' to '9' (1 or more times
              (matching the most amount possible))
)?          end of grouping
\b          the boundary between a word char (\w) and something
            that is not a word char

That tells you exactly what each part of the following regex means:

⌜\b[-+]?[0-9]+(?:\.[0-9]+)?\b⌟

Note: I wrap the regex with ⌜ and ⌟ characters to denote the start and end of the regex. Some people get confused when you use double quotes.

I believe Perl or egrep or something can output the exact same “regex explanation” information from the command line.

Regards,
Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key: http://tinyurl.com/graeme-pgp
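To connect the explanation above to Pascal code, here is a small sketch using the TRegExpr class from FPC's RegExpr unit. Note the assumptions: the pattern is a simplified variant of the one explained above (no \b anchors and an ordinary capturing group instead of (?:...), since support for those may depend on the TRegExpr version), and the input string is a made-up sample line.

```pascal
program RegexNumbers;
{$mode objfpc}{$H+}
uses
  RegExpr;  // TRegExpr, part of FPC's FCL packages

var
  re: TRegExpr;
begin
  re := TRegExpr.Create;
  try
    // Simplified pattern (assumption): optional sign, digits, optional fraction.
    re.Expression := '[-+]?[0-9]+(\.[0-9]+)?';
    // Hypothetical input line with uneven whitespace.
    if re.Exec('   0.4167    0.3636   -14.1483  227.2260') then
      repeat
        WriteLn(re.Match[0]);   // whole text of the current match
      until not re.ExecNext;
  finally
    re.Free;
  end;
end.
```

Each match would still have to be converted with a string-to-float routine, so this only replaces the splitting step.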
Re: [fpc-pascal] How to split file of whitespace separated numbers?
2016-12-23 15:27 GMT-03:00 Marco van de Voort:
> In our previous episode, Graeme Geldenhuys said:
> > For many other things, plain code could be faster, but often a lot more
> > effort and time consuming to implement. Where as you could have written
> > a regex expression in under 10 seconds and accomplish the same task 8
> > lines of code or less - very little effort required.
>
> Writing or even worse, reading/debugging regex is about the most intensive
> effort there is IMHO.

I agree that regex carries an extra mental overhead. This is why I kept away from it for a long time.

Early this year I needed to use it in one of my projects, so I decided to bite the bullet and read the Mastering Regular Expressions book. Once you understand the reasoning behind regex, it's a lot less intimidating. These days I use it occasionally.

By coincidence, yesterday I was writing code to parse raw text to extract some data. Initially I did it manually, but when I needed to extract a new field I realized things would get even worse. Then I rewrote it with regex. See the diff here: https://www.diffchecker.com/NDDa9gpH

IMO much better.

I am not saying that it is easy or should be used at will. But once you learn the basics, regex is a valuable tool.

For debugging I use http://regexr.com/ and rely on unit tests to ensure correctness.

Luiz
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 24.12.2016 at 12:53, "Mark Morgan Lloyd" <markmll.fpc-pas...@telemetry.co.uk> wrote:
>
> On 24/12/16 11:30, Lars wrote:
>>
>> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>>>
>>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>>> Writing or even worse, reading/debugging regex is about the most
>>>> intensive effort there is IMHO.
>>>
>>> So is standard programming code - if you don't know the syntax or how it
>>> works. ;-) Also the reason why I posted a couple of links to regex sites
>>> to get the original poster started (in case he doesn't know regex). Here
>>> is another link (by the author of EditPad Pro), who really knows his
>>> regex!
>>>
>>> http://www.regular-expressions.info/tutorial.html
>>
>> Next thing todo: implement PERL inside pascal programs, compiled in perl.
>> Then, realize, why you didn't originally want to go there ;-)
>
> Or even allow FPC to call Lua.

You realize that we already have language bindings for Lua somewhere? ;)

Regards,
Sven
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 24/12/16 11:30, Lars wrote:
> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>> Writing or even worse, reading/debugging regex is about the most
>>> intensive effort there is IMHO.
>>
>> So is standard programming code - if you don't know the syntax or how it
>> works. ;-) Also the reason why I posted a couple of links to regex sites
>> to get the original poster started (in case he doesn't know regex). Here
>> is another link (by the author of EditPad Pro), who really knows his
>> regex!
>>
>> http://www.regular-expressions.info/tutorial.html
>
> Next thing todo: implement PERL inside pascal programs, compiled in perl.
> Then, realize, why you didn't originally want to go there ;-)

Or even allow FPC to call Lua.

I know this is rare and probably wouldn't happen outside the "season of goodwill", but I actually agree with Graeme here: regexes are useful.

BUT FFS DOCUMENT WHAT YOU'RE DOING FOR PEOPLE WHO DON'T UNDERSTAND THEM!

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
> On 2016-12-23 18:27, Marco van de Voort wrote:
>
>> Writing or even worse, reading/debugging regex is about the most
>> intensive effort there is IMHO.
>
> So is standard programming code - if you don't know the syntax or how it
> works. ;-) Also the reason why I posted a couple of links to regex sites
> to get the original poster started (in case he doesn't know regex). Here
> is another link (by the author of EditPad Pro), who really knows his
> regex!
>
> http://www.regular-expressions.info/tutorial.html

Next thing todo: implement PERL inside pascal programs, compiled in perl.
Then, realize, why you didn't originally want to go there ;-)
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 23 Dec 2016 at 05:15, "Bo Berglund" wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?

Unit strutils:

WordCount + ExtractWord

or

ExtractSubstr in a loop

http://www.freepascal.org/docs-html/rtl/strutils/extractsubstr.html

Luiz

> I have to create a function to process a number of big data files
> where numbers are stored in lines of 4-6 values using whitespace
> inbetween.
> First I got a sample looking like this:
> {code}
> 0.4167    0.3636    -14.1483    227.2260
> {code}
> Here the separators were 4 spaces so on each line I used (slDecode is
> a TStringList):
> {code}
> sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
> slDecode.Text := sLine;
> {code}
> Worked fine if a bit slow...
> The stringlist items are then passed to a string to float function and
> stored into a dynamic array.
>
> But then it failed on a file containing lines like this:
> {code}
>    0.000    0.000    7.000    0.000  29.6628
> {code}
> Here there are 3 leading spaces plus one separator is only 2 spaces
> wide. So I had to modify the code:
> {code}
> sLine := Trim(sLine);
> sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
> sLine := StringReplace(sLine, '  ', #13, [rfReplaceAll]);
> slDecode.Text := sLine;
> {code}
> This works in this case but now I realize I need something better,
> which can deal with varying number of whitespace chars inbetween
> numbers.
> The test files are very big, like half a million lines and up, so I
> cannot introduce a lot of code in the loop since processing time will
> increase.
>
> Is there any good and quick way to extract real data from a space
> separated list without knowing beforehand the size of the whitespace
> separators?
> I guess that my next sample problem will be a file with TAB rather
> than space or even mixed TAB and space...
>
> --
> Bo Berglund
> Developer in Sweden
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, December 23, 2016 4:49 am, Howard Page-Clark wrote:
> On 23/12/16 08:14, Bo Berglund wrote:
>
>> Is there a quick way to split a string of whitespace separated values
>> into the separate members?
> It is possible that a custom string parser (something along these lines)
> might improve your processing speed:
>
> type TDoubleArray = array of Double;
>
> function StrToDblArray(const aString: string): TDoubleArray;
> var c: Char;

And as soon as char is involved, unicode gets screwed up.

Am I right, am I right...

But if he is not dealing with any unicode data and it is all simple chars, it should be okay.
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-23 18:27, Marco van de Voort wrote:
> Writing or even worse, reading/debugging regex is about the most intensive
> effort there is IMHO.

So is standard programming code - if you don't know the syntax or how it works. ;-) Also the reason why I posted a couple of links to regex sites to get the original poster started (in case he doesn't know regex). Here is another link (by the author of EditPad Pro), who really knows his regex!

http://www.regular-expressions.info/tutorial.html

Regards,
Graeme
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-23 18:04, Sven Barth wrote:
> E.g.
> opening a log file in 10 seconds vs nearly none make a difference

Again, it depends on the tool (editor) you use. Both jEdit and EditPad Pro (implemented in Delphi) use regex for syntax highlighting. EditPad Pro also uses it for file navigation, syntax highlighting of tool output, output/code navigation etc. Both can handle massive text files, both open them instantly, and everything is highlighted from the word go. No idea how they accomplish that, but that's another story. ;-)

Regards,
Graeme
Re: [fpc-pascal] How to split file of whitespace separated numbers?
In our previous episode, Graeme Geldenhuys said:
> For many other things, plain code could be faster, but often a lot more
> effort and time consuming to implement. Where as you could have written
> a regex expression in under 10 seconds and accomplish the same task 8
> lines of code or less - very little effort required.

Writing or even worse, reading/debugging regex is about the most intensive effort there is IMHO.
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 23.12.2016 18:46, Graeme Geldenhuys wrote:
> On 2016-12-23 13:06, Sven Barth wrote:
>> Regular expressions usually have a higher overhead
>
> That is not always a given.

You are aware that I wrote "usually" there?

> For many other things, plain code could be faster, but often a lot more
> effort and time consuming to implement. Where as you could have written
> a regex expression in under 10 seconds and accomplish the same task 8
> lines of code or less - very little effort required.

But sometimes the effort vs performance trade-off is worth it. E.g. opening a log file in 10 seconds vs nearly none makes a difference (as I said, I don't remember the exact speedup anymore, but it was significant; it was also not the only problematic point, as originally the opening of a large enough log file took minutes :P ).

Regards,
Sven
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-23 13:06, Sven Barth wrote:
> Regular expressions usually have a higher overhead

That is not always a given. I remember years back we had a similar discussion, but then about syntax highlighting large code units, e.g. the large OSX related unit in FPC (I can't remember how many MB in size it was). Lazarus performed okay syntax highlighting that, but other editors didn't. Everybody was told that Lazarus did so well because it "understood the code and syntax".

jEdit, a Java based editor, implements all its syntax highlighting (hundreds of them) via regex. jEdit was extremely fast, even on that very large OSX related unit. Even when you jump from the beginning of the file straight to the end.

For many other things, plain code could be faster, but often a lot more effort and time consuming to implement. Whereas you could have written a regex expression in under 10 seconds and accomplished the same task in 8 lines of code or less - very little effort required.

Regards,
Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key: http://tinyurl.com/graeme-pgp
Re: [fpc-pascal] How to split file of whitespace separated numbers?
Hey Kids,

why so complicated?

Good old Niklaus Wirth has already done everything for you. I have to cite one sentence from the last slide at his birthday colloquium:

"Reducing size and complexity is the triumph"

So READ is already quite clever, it doesn't care about whitespace, carriage returns and linefeeds:

PROGRAM readline;
VAR
  a : ARRAY[0..100] OF double;
  infile : TEXT;
  lauf, lauf2 : longint;
BEGIN
  lauf := 0;
  assign(infile, 'infile.txt');
  reset(infile);
  WHILE NOT(eof(infile)) DO
  BEGIN
    read(infile, a[lauf]);
    inc(lauf);
  END;
  close(infile);
  FOR lauf2 := 0 TO pred(lauf) DO
  BEGIN
    writeln('Index : ', lauf2, ' Value : ', a[lauf2]);
  END;
END.

And here infile.txt:

123.4 55.2 33.1 4 12.1 1.1 1 2 3 4 333.888 444.555

Regards
Markus

On 23.12.2016 at 14:06, Sven Barth wrote:
> On 23.12.2016 at 12:54, "Graeme Geldenhuys" <mailingli...@geldenhuys.co.uk> wrote:
>>
>> On 2016-12-23 08:14, Bo Berglund wrote:
>> > Is there a quick way to split a string of whitespace separated values
>> > into the separate members?
>>
>> That problem is perfectly suited for regular expressions. And a rather
>> simple one at that. The FPC's FCL packages include a regex unit too
>> which should suit your needs.
>>
>> http://www.regex101.com/
>> http://www.regexplained.co.uk/
>> http://regex.info/
>>
>> Even the trial book (first chapter only) of "Mastering Regular
>> Expressions" is invaluable for users new to regex. And will
>> explain all you need to know to solve your problem.
>
> Regular expressions usually have a higher overhead however (as you might
> have noticed, Bo timed his code later on). For example at work I changed
> a regular expression based parser for the lines of a log file to a
> simpler one and the speedup was noticeable (I don't have exact numbers
> anymore however).
>
> Regards,
> Sven
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 23.12.2016 at 12:54, "Graeme Geldenhuys" <mailingli...@geldenhuys.co.uk> wrote:
>
> On 2016-12-23 08:14, Bo Berglund wrote:
> > Is there a quick way to split a string of whitespace separated values
> > into the separate members?
>
> That problem is perfectly suited for regular expressions. And a rather
> simple one at that. The FPC's FCL packages include a regex unit too
> which should suit your needs.
>
> http://www.regex101.com/
> http://www.regexplained.co.uk/
> http://regex.info/
>
> Even the trial book (first chapter only) of "Mastering Regular
> Expressions" is invaluable for users new to regex. And will
> explain all you need to know to solve your problem.

Regular expressions usually have a higher overhead however (as you might have noticed, Bo timed his code later on). For example at work I changed a regular expression based parser for the lines of a log file to a simpler one and the speedup was noticeable (I don't have exact numbers anymore however).

Regards,
Sven
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-23 08:14, Bo Berglund wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?

That problem is perfectly suited for regular expressions. And a rather simple one at that. The FPC's FCL packages include a regex unit too which should suit your needs.

http://www.regex101.com/
http://www.regexplained.co.uk/
http://regex.info/

Even the trial book (first chapter only) of "Mastering Regular Expressions" is invaluable for users new to regex. And will explain all you need to know to solve your problem.

Regards,
Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key: http://tinyurl.com/graeme-pgp
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 23/12/16 08:14, Bo Berglund wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?

It is possible that a custom string parser (something along these lines) might improve your processing speed:

type
  TDoubleArray = array of Double;

// Needs SysUtils (TryStrToFloat) and {$mode objfpc} (for-in, Exit(value)).
function StrToDblArray(const aString: string): TDoubleArray;
var
  c: Char;
  prevNumeric: boolean = False;
  sNum: string = '';
  number: double;

  function IsNumeric: boolean; inline;
  begin
    // Sign and exponent characters count as numeric too, so that
    // values such as -14.1483 are not stripped of their sign.
    Exit(c in ['+', '-', '.', '0'..'9', 'E', 'e']);
  end;

begin
  SetLength(Result, 0);
  for c in aString do
  begin
    case IsNumeric of
      False:
        // A separator after a number: convert and store what was collected.
        if prevNumeric then
        begin
          if TryStrToFloat(sNum, number) then
          begin
            SetLength(Result, Length(Result) + 1);
            Result[High(Result)] := number;
          end;
          sNum := '';
          prevNumeric := False;
        end;
      True:
        begin
          sNum := sNum + c;
          if not prevNumeric then
            prevNumeric := True;
        end;
    end;
  end;
  // Store a trailing number that is not followed by a separator.
  if prevNumeric and TryStrToFloat(sNum, number) then
  begin
    SetLength(Result, Length(Result) + 1);
    Result[High(Result)] := number;
  end;
end;
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On Fri, 23 Dec 2016 10:04:09 +0100, Gabor Boros wrote:
>2016. 12. 23. 9:14 keltezéssel, Bo Berglund írta:
>> Is there a quick way to split a string of whitespace separated values
>> into the separate members?

>   SL.DelimitedText:='   0.000    0.000    7.000    0.000  29.6628';

Thanks, I did not know that one could do this and get away with it. I believed one had to set the delimiter first, and since it is a varying number of spaces it would not work. But it seems like it does work!

I applied your method by removing all the code for handling this and used only the following:

{code}
ReadLn(F, sLine);
slDecode.DelimitedText := sLine;
{code}

I timed my original code for a file of some 66+ lines to 9.9s.

Result:
Original code takes 9.9 s to process the file.
Modified code takes 4.4 s

And I checked with the file containing the extra spaces and varying size of whitespace. It too was processed correctly.

Thanks again!

--
Bo Berglund
Developer in Sweden
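The ReadLn + DelimitedText approach, followed by the string to float step, might be sketched as below. Several things are assumptions rather than Bo's code: the program name, the sample line, setting Delimiter to a space explicitly, and forcing '.' as the decimal separator through a TFormatSettings copy so the result does not depend on the locale. DefaultFormatSettings is the FPC name; Delphi XE5 spells this differently.

```pascal
program SplitLine;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

var
  sl: TStringList;
  fs: TFormatSettings;
  i: Integer;
  v: Double;
begin
  fs := DefaultFormatSettings;
  fs.DecimalSeparator := '.';      // data files use a dot regardless of locale
  sl := TStringList.Create;
  try
    sl.Delimiter := ' ';
    // With StrictDelimiter off, runs of blanks act as a single separator.
    sl.StrictDelimiter := False;
    // Hypothetical input line; in the real program it comes from ReadLn.
    sl.DelimitedText := '   0.000    0.000    7.000    0.000  29.6628';
    for i := 0 to sl.Count - 1 do
      if TryStrToFloat(sl[i], v, fs) then
        WriteLn(i, ': ', v:0:4)
      else
        WriteLn(i, ': not a number: "', sl[i], '"');
  finally
    sl.Free;
  end;
end.
```

Reusing one TStringList for all lines, as Bo's slDecode does, avoids creating and freeing an object per line in a half-million-line loop.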
Re: [fpc-pascal] How to split file of whitespace separated numbers?
On 2016-12-23 at 9:14, Bo Berglund wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?

Hi,

I don't know if it is quick or not...

program Project1;

uses
  Classes;

var
  SL: TStringList;
  i: Integer;

begin
  SL := TStringList.Create;
  SL.DelimitedText := '   0.000    0.000    7.000    0.000  29.6628';
  // Print each item between asterisks to show that no blanks remain.
  for i := 0 to SL.Count - 1 do
  begin
    WriteLn('*' + SL.Strings[i] + '*');
  end;
  ReadLn;
end.

Gabor
[fpc-pascal] How to split file of whitespace separated numbers?
Is there a quick way to split a string of whitespace separated values into the separate members?

I have to create a function to process a number of big data files where numbers are stored in lines of 4-6 values using whitespace inbetween.

First I got a sample looking like this:

{code}
0.4167    0.3636    -14.1483    227.2260
{code}

Here the separators were 4 spaces so on each line I used (slDecode is a TStringList):

{code}
sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
slDecode.Text := sLine;
{code}

Worked fine if a bit slow... The stringlist items are then passed to a string to float function and stored into a dynamic array.

But then it failed on a file containing lines like this:

{code}
   0.000    0.000    7.000    0.000  29.6628
{code}

Here there are 3 leading spaces plus one separator is only 2 spaces wide. So I had to modify the code:

{code}
sLine := Trim(sLine);
sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
sLine := StringReplace(sLine, '  ', #13, [rfReplaceAll]);
slDecode.Text := sLine;
{code}

This works in this case but now I realize I need something better, which can deal with varying number of whitespace chars inbetween numbers.

The test files are very big, like half a million lines and up, so I cannot introduce a lot of code in the loop since processing time will increase.

Is there any good and quick way to extract real data from a space separated list without knowing beforehand the size of the whitespace separators?

I guess that my next sample problem will be a file with TAB rather than space or even mixed TAB and space...

--
Bo Berglund
Developer in Sweden