Re: [fpc-pascal] How to split file of whitespace separated numbers?

2017-01-01 Thread greim

Hello Bo,

You are right. If you want to keep the numbers of one line together, 
then READ doesn't work for you.


Markus


On 31.12.2016 at 12:18, Bo Berglund wrote:

On Sat, 31 Dec 2016 11:27:53 +0100, greim
 wrote:


Hello Bo,

please try the simple

READ(myfile, x);

as mentioned in my posting from 23.dec.
It works w/o any postprocessing.


I saw that but the problem is that there are varying number of items
on each line and a line is considered a record.
So I would anyway have to keep track of the lines.
With Read I guess it skips all whitespace including line endings and
this causes problems in identifying the content because of varying
number of items on each line.
The Readln approach followed by splitting in a stringlist is enough of
an improvement that I can use it.




___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-31 Thread Bo Berglund
On Sat, 31 Dec 2016 11:30:26 -0300, Luiz Americo Pereira Camara
 wrote:

>2016-12-31 8:18 GMT-03:00 Bo Berglund :
>
>> On Sat, 31 Dec 2016 11:27:53 +0100, greim
>>
>> The Readln approach followed by splitting in a stringlist is enough of
>> an improvement that I can use it.
>>
>>
>Did you look at StrUtils.ExtractSubStr ?
>
>Should be faster than TStrings;
>
>PosWord := 1;
>Word := ExtractSubstr(Line, PosWord, [' ']);
>while (Word <> '') do
>begin
>  //do work
>  Word := ExtractSubstr(Line, PosWord, [' ']);
>end;

Thanks, I did not know about this...
Unfortunately the function is not part of the Delphi StrUtils unit
and I need my solution to be usable on Delphi since the bulk of the
program is probably too difficult to port to FPC. It contains a lot of
graphics that uses an outdated version of GLScene, which I have had to
make some minimal patches to in order to port the program up to XE5.
But I feel a port to FPC would tax my abilities too much because of
the 3rd party stuff that has been used...

It was a good suggestion, though!
And I see that the delimiter argument can be set to StdWordDelims,
which takes care of virtually all whitespace stuff. Will use it
whenever I code in FPC and have this problem.


-- 
Bo Berglund
Developer in Sweden



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-31 Thread Luiz Americo Pereira Camara
2016-12-31 8:18 GMT-03:00 Bo Berglund :

> On Sat, 31 Dec 2016 11:27:53 +0100, greim
>
> The Readln approach followed by splitting in a stringlist is enough of
> an improvement that I can use it.
>
>
Did you look at StrUtils.ExtractSubStr ?

Should be faster than TStrings;

PosWord := 1;
Word := ExtractSubstr(Line, PosWord, [' ']);
while (Word <> '') do
begin
  //do work
  Word := ExtractSubstr(Line, PosWord, [' ']);
end;
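
A more complete sketch of the same loop, for illustration only. It assumes the input may start with blanks or mix spaces and TABs, so it loops on the position rather than on an empty result and skips empty tokens. The names SplitFloats, Blanks and TDoubleArray are made up for this example; ExtractSubstr and TryStrToFloat are the StrUtils/SysUtils routines.

```pascal
program SplitWithExtractSubstr;
{$mode objfpc}{$H+}

uses
  SysUtils, StrUtils;

type
  TDoubleArray = array of Double;

const
  Blanks = [' ', #9];   { space and TAB; an assumption about the input }

{ Hypothetical helper: returns all numbers found on one line. }
function SplitFloats(const Line: string): TDoubleArray;
var
  Tok: string;
  P: Integer;
  V: Double;
  FS: TFormatSettings;
begin
  Result := nil;
  FS := DefaultFormatSettings;
  FS.DecimalSeparator := '.';      { parse '.' regardless of locale }
  P := 1;
  while P <= Length(Line) do
  begin
    Tok := ExtractSubstr(Line, P, Blanks);
    { Leading or repeated delimiters may yield empty tokens; skip them. }
    if (Tok <> '') and TryStrToFloat(Tok, V, FS) then
    begin
      SetLength(Result, Length(Result) + 1);
      Result[High(Result)] := V;
    end;
  end;
end;

var
  Values: TDoubleArray;
  i: Integer;
begin
  Values := SplitFloats('   0.000    0.000'#9'7.000  29.6628');
  for i := 0 to High(Values) do
    WriteLn(Values[i]:0:4);
end.
```

Looping on the position instead of on `Word <> ''` avoids stopping early when the line begins with a delimiter.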

Luiz

Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-31 Thread Bo Berglund
On Sat, 31 Dec 2016 11:27:53 +0100, greim
 wrote:

>Hello Bo,
>
>please try the simple
>
>READ(myfile, x);
>
>as mentioned in my posting from 23.dec.
>It works w/o any postprocessing.
>
I saw that but the problem is that there are varying number of items
on each line and a line is considered a record.
So I would anyway have to keep track of the lines.
With Read I guess it skips all whitespace including line endings and
this causes problems in identifying the content because of varying
number of items on each line.
The Readln approach followed by splitting in a stringlist is enough of
an improvement that I can use it.
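
For completeness, a sketch of how Read could still be combined with line tracking, assuming the standard SeekEoln and SeekEof routines for text files are acceptable. The file name sample.txt and the names ReadRows, TRow and TRows are invented for this example, and the program writes its own small test file first.

```pascal
program ReadByLine;
{$mode objfpc}{$H+}

type
  TRow = array of Double;
  TRows = array of TRow;

{ Hypothetical helper: one TRow per non-empty line of the text file. }
function ReadRows(const FileName: string): TRows;
var
  F: Text;
  X: Double;
  R: Integer;
begin
  Result := nil;
  Assign(F, FileName);
  Reset(F);
  while not SeekEof(F) do          { skips whitespace and blank lines }
  begin
    SetLength(Result, Length(Result) + 1);
    R := High(Result);
    while not SeekEoln(F) do       { skips blanks/TABs, stops at line end }
    begin
      Read(F, X);
      SetLength(Result[R], Length(Result[R]) + 1);
      Result[R][High(Result[R])] := X;
    end;
    ReadLn(F);                     { consume the line ending }
  end;
  Close(F);
end;

var
  F: Text;
  Rows: TRows;
  R, C: Integer;
begin
  { Write a small sample file so the sketch is self-contained. }
  Assign(F, 'sample.txt');
  Rewrite(F);
  WriteLn(F, '   0.000    0.000    7.000    0.000  29.6628');
  WriteLn(F, '0.4167    0.3636    -14.1483    227.2260');
  Close(F);

  Rows := ReadRows('sample.txt');
  for R := 0 to High(Rows) do
  begin
    Write('Line ', R + 1, ':');
    for C := 0 to High(Rows[R]) do
      Write(' ', Rows[R][C]:0:4);
    WriteLn;
  end;
end.
```

Each line still ends up as its own record, whatever the number of items on it.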


-- 
Bo Berglund
Developer in Sweden



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-31 Thread greim

Hello Bo,

please try the simple

READ(myfile, x);

as mentioned in my posting from 23.dec.
It works w/o any postprocessing.

Markus



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-30 Thread Bo Berglund
On Fri, 23 Dec 2016 11:53:58 +, Graeme Geldenhuys
 wrote:

>That problem is perfectly suited for regular expressions. And a rather
>simple one at that. The FPC's FCL packages include a regex unit too
>which should suit your needs.

I was away over Xmas so I have not seen all this regexp discussion for
my problem until now.

In past times I have come across solutions using regular expressions
for example in shellscripts or similar. In most cases I saw that they
worked but had a hard time understanding *how* they worked, the syntax
is too dense for me.

The actual problem I had was that a data processing program, which I did
not write myself, took an extremely long time just to load the input
data file, so I was looking at alternative ways to read that data in.

The program was written originally using RAD Studio 2007 by someone
else and I "ported" it to RAD Studio XE5 a few years ago. But I did
not get into the working code, just making the transfer to Unicode and
updating the GUI. All processing code was untouched (except for
changing string to ansistring where needed).
It is a very math intensive processing package and I am no
mathematician...

Anyway, the original author was no real coder but a scientist, so
things like file I/O were not optimal. This shows up when reading the
large actual data files, which could be hundreds of Mbytes in length.
In his code it takes minutes to do!
And this was the cause of my original question. Since it seemed rather
general in nature I posted both here and in the Embarcadero forum...

Now I am down to just seconds using the ReadLn +
StringList.DelimitedText way to parse the data.

My goal now is to simply create a utility that transforms these files
into binary format instead and add code to load the data into dynamic
float arrays.
The tests I did timed the conversion at some few seconds and once the
binary files are created the load of these resulting binary data is
done in fractions of a second.
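
A minimal sketch of such a binary round trip, under the assumption that a simple private layout is enough: an Int64 element count followed by the raw Double values. The layout, the file name data.bin and the names SaveDoubles and LoadDoubles are made up for this example and are not the actual utility.

```pascal
program BinaryRoundTrip;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  TDoubleArray = array of Double;

{ Hypothetical layout: an Int64 element count followed by raw Doubles. }
procedure SaveDoubles(const FileName: string; const Data: TDoubleArray);
var
  S: TFileStream;
  N: Int64;
begin
  S := TFileStream.Create(FileName, fmCreate);
  try
    N := Length(Data);
    S.WriteBuffer(N, SizeOf(N));
    if N > 0 then
      S.WriteBuffer(Data[0], N * SizeOf(Double));
  finally
    S.Free;
  end;
end;

function LoadDoubles(const FileName: string): TDoubleArray;
var
  S: TFileStream;
  N: Int64;
begin
  Result := nil;
  S := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    S.ReadBuffer(N, SizeOf(N));
    SetLength(Result, N);
    if N > 0 then
      S.ReadBuffer(Result[0], N * SizeOf(Double));   { one block read }
  finally
    S.Free;
  end;
end;

var
  A, B: TDoubleArray;
begin
  SetLength(A, 3);
  A[0] := 0.4167; A[1] := -14.1483; A[2] := 227.2260;
  SaveDoubles('data.bin', A);
  B := LoadDoubles('data.bin');
  WriteLn(Length(B), ' values, last = ', B[High(B)]:0:4);
end.
```

The load is a single block read into the dynamic array, which is why it can be so much faster than parsing text. Such a file is only portable between machines with the same byte order.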

So I am pretty much done with this problem (without resorting to
regexp usage).

Thanks anyway for pointing out an alternate path!

-- 
Bo Berglund
Developer in Sweden



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-26 Thread Graeme Geldenhuys
On 2016-12-24 11:52, Mark Morgan Lloyd wrote:
> regexes are useful.

Very much so, and I’m far from being an expert. But I am finding more
and more uses for them, the more I use them.


> BUT FFS DOCUMENT WHAT YOU'RE DOING FOR PEOPLE WHO DON'T UNDERSTAND THEM!

The same can be said for standard code too. At least with regex, there
are plenty of tools that explain what they do, and what each part means.

For example:

http://rick.measham.id.au/paste/explain.pl?regex=%5Cb%5B-%2B%5D%3F%5B0-9%5D%2B%28%3F%3A%5C.%5B0-9%5D%2B%29%3F%5Cb


NODE                     EXPLANATION

  \b                       the boundary between a word char (\w) and
                           something that is not a word char

  [-+]?                    any character of: '-', '+' (optional
                           (matching the most amount possible))

  [0-9]+                   any character of: '0' to '9' (1 or more
                           times (matching the most amount possible))

  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):

    \.                       '.'

    [0-9]+                   any character of: '0' to '9' (1 or more
                             times (matching the most amount
                             possible))

  )?                       end of grouping

  \b                       the boundary between a word char (\w) and
                           something that is not a word char


That tells you exactly what each part of the following regex means:

   ⌜\b[-+]?[0-9]+(?:\.[0-9]+)?\b⌟

Note:
  I wrap the regex with ⌜ and ⌟ character to denote the start and
  end of the regex. Some people get confused when you use double
  quotes.


I believe Perl or egrep or something can output the exact same “regex
explanation” information from the command line.
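
As a rough sketch of how a similar pattern might be used from Free Pascal, assuming the RegExpr unit (class TRegExpr) that ships in FPC's packages. The pattern here deliberately drops the \b anchors and uses a plain capturing group: as far as I can tell, a \b in front of the optional sign would not match between a space and '-', so the sign of a negative number would be lost. The function name CollectNumbers is made up for this example.

```pascal
program RegexNumbers;
{$mode objfpc}{$H+}

uses
  Classes, RegExpr;

{ Hypothetical helper: appends every number found in S to List. }
procedure CollectNumbers(const S: string; List: TStrings);
var
  RE: TRegExpr;
begin
  RE := TRegExpr.Create;
  try
    { Optional sign, digits, optional fraction. }
    RE.Expression := '[-+]?[0-9]+(\.[0-9]+)?';
    if RE.Exec(S) then
      repeat
        List.Add(RE.Match[0]);     { Match[0] is the whole match }
      until not RE.ExecNext;
  finally
    RE.Free;
  end;
end;

var
  Found: TStringList;
  i: Integer;
begin
  Found := TStringList.Create;
  try
    CollectNumbers('   0.4167    0.3636  -14.1483'#9'227.2260', Found);
    for i := 0 to Found.Count - 1 do
      WriteLn(Found[i]);
  finally
    Found.Free;
  end;
end.
```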


Regards,
  Graeme

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp

Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Luiz Americo Pereira Camara
2016-12-23 15:27 GMT-03:00 Marco van de Voort :

> In our previous episode, Graeme Geldenhuys said:
> > For many other things, plain code could be faster, but often a lot more
> > effort and time consuming to implement. Whereas you could have written
> > a regex expression in under 10 seconds and accomplished the same task in
> > 8 lines of code or less - very little effort required.
>
> Writing or even worse, reading/debugging regex is about the most intensive
> effort there is IMHO.
>


Agree that Regex carries an extra mental overhead. This is why i kept away
from it for a long time.

Early this year i needed to use it in one of my projects, so i decided to
bite the bullet and read Mastering Regular Expressions book.

Once you understand the reasoning behind regex, it's a lot less
intimidating.

These days i use it occasionally.

By coincidence, yesterday i was writing code to parse raw text to extract
some data.

Initially i did it manually, but when i needed to extract a new field i
realized things would get even worse. Then i rewrote it with regex.

See diff here: https://www.diffchecker.com/NDDa9gpH

IMO much better.

Not saying that is easy or should be used at will. But once you learn the
basics, regex is a valuable tool.

For debugging i use http://regexr.com/ and rely on unit tests to ensure
correctness

Luiz

Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Sven Barth
On 24.12.2016 12:53, "Mark Morgan Lloyd" <markmll.fpc-pas...@telemetry.co.uk> wrote:
>
> On 24/12/16 11:30, Lars wrote:
>>
>> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>>>
>>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>>
>>>> Writing or even worse, reading/debugging regex is about the most
>>>> intensive effort there is IMHO.
>>>
>>>
>>> So is standard programming code - if you don't know the syntax or how it
>>> works. ;-)  Also the reason why I posted a couple of links to regex sites
>>> to get the original poster started (in case he doesn't know regex). Here
>>> is another link (by the author of EditPad Pro), who really knows his
>>> regex!
>>>
>>> http://www.regular-expressions.info/tutorial.html
>>>
>>
>> Next thing to do: implement PERL inside pascal programs, compiled in perl.
>> Then, realize, why you didn't originally want to go there ;-)
>
>
> Or even allow FPC to call Lua.

You realize that we already have language bindings for Lua somewhere? ;)

Regards,
Sven

Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Mark Morgan Lloyd

On 24/12/16 11:30, Lars wrote:
> On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
>> On 2016-12-23 18:27, Marco van de Voort wrote:
>>
>>> Writing or even worse, reading/debugging regex is about the most
>>> intensive effort there is IMHO.
>>
>> So is standard programming code - if you don't know the syntax or how it
>> works. ;-)  Also the reason why I posted a couple of links to regex sites
>> to get the original poster started (in case he doesn't know regex). Here
>> is another link (by the author of EditPad Pro), who really knows his
>> regex!
>>
>> http://www.regular-expressions.info/tutorial.html
>
> Next thing to do: implement PERL inside pascal programs, compiled in perl.
> Then, realize, why you didn't originally want to go there ;-)

Or even allow FPC to call Lua.

I know this is rare and probably wouldn't happen outside the "season of 
goodwill", but I actually agree with Graeme here: regexes are useful. 
BUT FFS DOCUMENT WHAT YOU'RE DOING FOR PEOPLE WHO DON'T UNDERSTAND THEM!



--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Lars
On Fri, December 23, 2016 12:54 pm, Graeme Geldenhuys wrote:
> On 2016-12-23 18:27, Marco van de Voort wrote:
>
>> Writing or even worse, reading/debugging regex is about the most
>> intensive effort there is IMHO.
>
> So is standard programming code - if you don't know the syntax or how it
> works. ;-)  Also the reason why I posted a couple of links to regex sites
> to get the original poster started (in case he doesn't know regex). Here
> is another link (by the author of EditPad Pro), who really knows his
> regex!
>
> http://www.regular-expressions.info/tutorial.html
>

Next thing to do: implement PERL inside pascal programs, compiled in perl.
Then, realize, why you didn't originally want to go there ;-)


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Luiz Americo Pereira Camara
On 23 Dec 2016 05:15, "Bo Berglund" wrote:

Is there a quick way to split a string of whitespace separated values
into the separate members?



Unit StrUtils:

WordCount + ExtractWord

or

ExtractSubstr in a loop


http://www.freepascal.org/docs-html/rtl/strutils/extractsubstr.html

Luiz

I have to create a function to process a number of big data files
where
numbers are stored in lines of 4-6 values using whitespace inbetween.
First I got a sample looking like this:
{code}
0.4167    0.3636    -14.1483    227.2260
{code}
Here the separators were 4 spaces so on each line I used (slDecode is
a TStringList):
{code}
  sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
  slDecode.Text := sLine;
{code}
Worked fine if a bit slow...
The stringlist items are then passed to a string to float function and
stored into a dynamic array.

But then it failed on a file containing lines like this:
{code}
   0.000    0.000    7.000    0.000  29.6628
{code}
Here there are 3 leading spaces plus one separator is only 2 spaces
wide. So I had to modify the code:
{code}
  sLine := Trim(sLine);
  sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
  sLine := StringReplace(sLine, '  ', #13, [rfReplaceAll]);
  slDecode.Text := sLine;
{code}

This works in this case but now I realize I need something better,
which can deal with varying number of whitespace chars inbetween
numbers.
The test files are very big, like half a million lines and up, so I
cannot introduce a lot of code in the loop since processing time will
increase.

Is there any good and quick way to extract real data from a space
separated list without knowing beforehand the size of the whitespace
separators?

I guess that my next sample problem will be a file with TAB rather
than space or even mixed TAB and space...

--
Bo Berglund
Developer in Sweden


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-24 Thread Lars
On Fri, December 23, 2016 4:49 am, Howard Page-Clark wrote:
> On 23/12/16 08:14, Bo Berglund wrote:
>
>> Is there a quick way to split a string of whitespace separated values
>> into the separate members?
> It is possible that a custom string parser (something along these lines)
> might improve your processing speed:
>
> type TDoubleArray = array of Double;
>
>
> function StrToDblArray(const aString: string): TDoubleArray;
> var c: Char;

And as soon as char is involved, unicode gets screwed up

Am I right, am I right...

But if he is not dealing with any unicode data and it is all simple chars,
should be okay.



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Graeme Geldenhuys
On 2016-12-23 18:27, Marco van de Voort wrote:
> Writing or even worse, reading/debugging regex is about the most intensive
> effort there is IMHO.

So is standard programming code - if you don't know the syntax or how it
works. ;-)  Also the reason why I posted a couple of links to regex
sites to get the original poster started (in case he doesn't know
regex). Here is another link (by the author of EditPad Pro), who really
knows his regex!

  http://www.regular-expressions.info/tutorial.html

Regards,
  Graeme



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Graeme Geldenhuys
On 2016-12-23 18:04, Sven Barth wrote:
> E.g.
> opening a log file in 10 seconds vs nearly none make a difference 

Again, it depends on the tool (editor) you use. Both jEdit and EditPad
Pro (implemented in Delphi) use regex for syntax highlighting. EditPad
Pro also uses it for file navigation, syntax highlighting tools output,
output/code navigation etc. Both can handle massive text files and both
open them instantly and everything is highlighted from the word go. No
idea how they accomplish that, but that's another story. ;-)

Regards,
  Graeme



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Marco van de Voort
In our previous episode, Graeme Geldenhuys said:
> For many other things, plain code could be faster, but often a lot more
> effort and time consuming to implement. Whereas you could have written
> a regex expression in under 10 seconds and accomplished the same task in
> 8 lines of code or less - very little effort required.

Writing or even worse, reading/debugging regex is about the most intensive
effort there is IMHO.


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Sven Barth
On 23.12.2016 18:46, Graeme Geldenhuys wrote:
> On 2016-12-23 13:06, Sven Barth wrote:
>> Regular expressions usually have a higher overhead 
> 
> That is not always a given.

You are aware that I wrote "usually" there?

> For many other things, plain code could be faster, but often a lot more
> effort and time consuming to implement. Whereas you could have written
> a regex expression in under 10 seconds and accomplished the same task in
> 8 lines of code or less - very little effort required.

But sometimes the effort vs performance trade-off is worth it. E.g.
opening a log file in 10 seconds vs nearly none makes a difference (as I
said, I don't remember the exact speed up anymore, but it was
significant; but also not the only problematic point as originally the
opening of a large enough log file took minutes :P ).

Regards,
Sven



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Graeme Geldenhuys
On 2016-12-23 13:06, Sven Barth wrote:
> Regular expressions usually have a higher overhead 

That is not always a given.

I remember years back we had a similar discussion, but then about syntax
highlighting large code units, e.g. the large OSX related unit in FPC
(can't remember how many MB's in size it was). Lazarus performed okay
syntax highlighting that, but other editors didn't. Everybody was told
that Lazarus did so well, because it "understood the code and syntax".
jEdit, a Java based editor, implements all its syntax highlighting
(100's of them) via regex. jEdit was extremely fast, even on that
very large OSX related unit. Even when you jump from the beginning of
the file straight to the end.

For many other things, plain code could be faster, but often a lot more
effort and time consuming to implement. Whereas you could have written
a regex expression in under 10 seconds and accomplished the same task in
8 lines of code or less - very little effort required.

Regards,
  Graeme

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread greim

Hey Kids,

why so complicated?

Good old Niklaus Wirth has already done everything for you.
I have to cite one sentence from the last slide of his birthday colloquium:

"Reducing size and complexity is the triumph"

So READ is already quite clever: it doesn't care about whitespace,
carriage returns and linefeeds:



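PROGRAM readline;

CONST maxitems = 100;

VAR a  : ARRAY[0..maxitems] OF double;
    infile : TEXT;
    lauf, lauf2 : longint;

BEGIN
  lauf := 0;
  assign(infile, 'infile.txt');
  reset(infile);
  { SeekEof skips trailing whitespace and line ends, so the loop does
    not attempt one extra read after the last number. The bound check
    keeps the index inside the fixed-size array. }
  WHILE (NOT seekeof(infile)) AND (lauf <= maxitems) DO
  BEGIN
    read(infile, a[lauf]);
    inc(lauf);
  END;
  close(infile);

  FOR lauf2 := 0 TO pred(lauf) DO
    writeln('Index : ', lauf2, ' Value : ', a[lauf2]);
END.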


And here infile.txt:

 123.4   55.2 33.1 4
 12.1 1.1
1 2 3 4
333.888 444.555

Regards

Markus


On 23.12.2016 at 14:06, Sven Barth wrote:

On 23.12.2016 12:54, "Graeme Geldenhuys" <mailingli...@geldenhuys.co.uk> wrote:


On 2016-12-23 08:14, Bo Berglund wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?


That problem is perfectly suited for regular expressions. And a rather
simple one at that. The FPC's FCL packages include a regex unit too
which should suit your needs.


http://www.regex101.com/

http://www.regexplained.co.uk/

http://regex.info/
  Even the trial book (first chapter only) of "Mastering Regular
  Expressions" is invaluable for users new to regex. And will
  explain all you need to know to solve your problem.



Regular expressions usually have a higher overhead however (as you might
have noticed, Bo timed his code later on).
For example at work I changed a regular expression based parser for the
lines of a log file to a simpler one and the speedup was noticeable (I
don't have exact numbers anymore however).

Regards,
Sven





Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Sven Barth
On 23.12.2016 12:54, "Graeme Geldenhuys" <mailingli...@geldenhuys.co.uk> wrote:
>
> On 2016-12-23 08:14, Bo Berglund wrote:
> > Is there a quick way to split a string of whitespace separated values
> > into the separate members?
>
>
> That problem is perfectly suited for regular expressions. And a rather
> simple one at that. The FPC's FCL packages include a regex unit too
> which should suit your needs.
>
>
> http://www.regex101.com/
>
> http://www.regexplained.co.uk/
>
> http://regex.info/
>   Even the trial book (first chapter only) of "Mastering Regular
>   Expressions" is invaluable for users new to regex. And will
>   explain all you need to know to solve your problem.
>

Regular expressions usually have a higher overhead however (as you might
have noticed, Bo timed his code later on).
For example at work I changed a regular expression based parser for the
lines of a log file to a simpler one and the speedup was noticeable (I
don't have exact numbers anymore however).

Regards,
Sven

Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Graeme Geldenhuys
On 2016-12-23 08:14, Bo Berglund wrote:
> Is there a quick way to split a string of whitespace separated values
> into the separate members?


That problem is perfectly suited for regular expressions. And a rather
simple one at that. The FPC's FCL packages include a regex unit too
which should suit your needs.


http://www.regex101.com/

http://www.regexplained.co.uk/

http://regex.info/
  Even the trial book (first chapter only) of "Mastering Regular
  Expressions" is invaluable for users new to regex. And will
  explain all you need to know to solve your problem.


Regards,
  Graeme

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Howard Page-Clark

On 23/12/16 08:14, Bo Berglund wrote:

> Is there a quick way to split a string of whitespace separated values
> into the separate members?

It is possible that a custom string parser (something along these lines) 
might improve your processing speed:


uses
  SysUtils;   { TryStrToFloat }

type
  TDoubleArray = array of Double;

function StrToDblArray(const aString: string): TDoubleArray;
var
  c: Char;
  prevNumeric: boolean = False;
  sNum: string = '';
  number: double;

  function IsNumeric: boolean; inline;
  begin
    { Sign and exponent characters are included so that values such as
      -14.1483 or 1e-3 are kept whole. }
    Exit(c in ['+', '-', '.', '0'..'9', 'E', 'e']);
  end;

begin
  SetLength(Result, 0);
  for c in aString do begin
    case IsNumeric of
      False: if prevNumeric then begin
               { A separator ends the current number: convert and store it. }
               if TryStrToFloat(sNum, number) then begin
                 SetLength(Result, Length(Result) + 1);
                 Result[High(Result)]:=number;
               end;
               sNum:='';
               prevNumeric:=False;
             end;
      True: begin
              sNum:=sNum + c;
              if not prevNumeric then
                prevNumeric:=True;
            end;
    end;
  end;
  { Flush the last number when the string does not end in a separator. }
  if prevNumeric and TryStrToFloat(sNum, number) then begin
    SetLength(Result, Length(Result) + 1);
    Result[High(Result)]:=number;
  end;
end;


Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Bo Berglund
On Fri, 23 Dec 2016 10:04:09 +0100, Gabor Boros
 wrote:

>On 2016-12-23 at 9:14, Bo Berglund wrote:
>> Is there a quick way to split a string of whitespace separated values
>> into the separate members?

>   SL.DelimitedText:='   0.000    0.000    7.000    0.000  29.6628';

Thanks,
I did not know that one could do this and get away with it. Believed
one had to set the delimiter first and since it is varying number of
spaces it would not work.
But it seems like it does work!

I applied your method by removing all the code for handling this and
used only the following:
{code}
   ReadLn(F, sLine);
   slDecode.DelimitedText := sLine;
   

{code}

I timed my original code for a file of some 66+ lines to 9.9s.
Result:
Original code takes 9.9 s to process the file.
Modified code takes 4.4 s

And I checked with the file containing the extra spaces and varying
size of whitespace. It too was processed correctly.
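
Spelled out as a complete program, the loop might look roughly like the sketch below. It assumes the default TStringList settings (StrictDelimiter = False, so runs of blanks act as separators) and a '.' decimal point in the data. The file name sample2.txt and the names ParseLine and TDoubleArray are made up for this example, which writes its own test file first.

```pascal
program ReadLnDelimited;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  TDoubleArray = array of Double;

{ Hypothetical helper: splits one line on whitespace and converts it. }
function ParseLine(const sLine: string; slDecode: TStringList): TDoubleArray;
var
  FS: TFormatSettings;
  i: Integer;
begin
  Result := nil;
  FS := DefaultFormatSettings;
  FS.DecimalSeparator := '.';          { parse '.' regardless of locale }
  slDecode.DelimitedText := sLine;     { blanks separate the items }
  SetLength(Result, slDecode.Count);
  for i := 0 to slDecode.Count - 1 do
    Result[i] := StrToFloat(slDecode[i], FS);
end;

var
  F: Text;
  sLine: string;
  slDecode: TStringList;
  Row: TDoubleArray;
  LineNo: Integer;
begin
  { Write a small sample file so the sketch is self-contained. }
  Assign(F, 'sample2.txt');
  Rewrite(F);
  WriteLn(F, '   0.000    0.000    7.000    0.000  29.6628');
  WriteLn(F, '0.4167    0.3636    -14.1483    227.2260');
  Close(F);

  slDecode := TStringList.Create;
  try
    Reset(F);
    LineNo := 0;
    while not Eof(F) do
    begin
      ReadLn(F, sLine);
      Inc(LineNo);
      Row := ParseLine(sLine, slDecode);   { one record per line }
      WriteLn('Line ', LineNo, ': ', Length(Row), ' values');
    end;
    Close(F);
  finally
    slDecode.Free;
  end;
end.
```

Reusing one TStringList for every line avoids creating and freeing an object half a million times.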

Thanks again!

-- 
Bo Berglund
Developer in Sweden



Re: [fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Gabor Boros

On 2016-12-23 at 9:14, Bo Berglund wrote:

Is there a quick way to split a string of whitespace separated values
into the separate members?


Hi,

I don't know whether it is quick or not...

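program Project1;
{$mode objfpc}{$H+}

uses Classes;

var
  SL:TStringList;
  i:Integer;

begin
  SL:=TStringList.Create;
  try
    // With the default StrictDelimiter=False, runs of spaces act as separators.
    SL.DelimitedText:='   0.000    0.000    7.000    0.000  29.6628';
    for i:=0 to SL.Count-1 do
      WriteLn('*'+SL.Strings[i]+'*');
  finally
    SL.Free;
  end;
  ReadLn;
end.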

Gabor


[fpc-pascal] How to split file of whitespace separated numbers?

2016-12-23 Thread Bo Berglund
Is there a quick way to split a string of whitespace separated values
into the separate members?
I have to create a function to process a number of big data files
where
numbers are stored in lines of 4-6 values using whitespace inbetween.
First I got a sample looking like this:
{code}
0.4167    0.3636    -14.1483    227.2260
{code}
Here the separators were 4 spaces so on each line I used (slDecode is
a TStringList):
{code}
  sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
  slDecode.Text := sLine;
{code}
Worked fine if a bit slow...
The stringlist items are then passed to a string to float function and
stored into a dynamic array.

But then it failed on a file containing lines like this:
{code}
   0.000    0.000    7.000    0.000  29.6628
{code}
Here there are 3 leading spaces plus one separator is only 2 spaces
wide. So I had to modify the code:
{code}
  sLine := Trim(sLine);
  sLine := StringReplace(sLine, '    ', #13, [rfReplaceAll]);
  sLine := StringReplace(sLine, '  ', #13, [rfReplaceAll]);
  slDecode.Text := sLine;
{code}

This works in this case but now I realize I need something better,
which can deal with varying number of whitespace chars inbetween
numbers.
The test files are very big, like half a million lines and up, so I
cannot introduce a lot of code in the loop since processing time will
increase.

Is there any good and quick way to extract real data from a space
separated list without knowing beforehand the size of the whitespace
separators?

I guess that my next sample problem will be a file with TAB rather
than space or even mixed TAB and space...

-- 
Bo Berglund
Developer in Sweden
