Dear Erich,

I guess you have more pressing matters, and proper testing may need to wait 
until I have applied the fix. Just for comparison, though, I implemented the 
read method you proposed. I hope this is the way to do it:

  qfileIn = .stream~new(MyFileName)  -- MyFileName holds the path of the input file
  myArray = qfileIn~arrayIn()        -- read all lines into an array in one call
  qfileIn~close

I compared this with a loop that uses ~lineIn and a MutableBuffer (the "MB" 
lines below). What you say is correct for small and medium-sized input: 
~arrayIn() is roughly 4 to 10 times faster.

However, for larger files (>1 million lines) the timing turns to the 
disadvantage of ~arrayIn(): it is roughly three times SLOWER than my loop 
using ~lineIn.

Small InputFile ez_raw.txt
File read using MB, 5916 items read in 75 ms
File read using ~arrayIn(), 5916 items read in 7 ms

Medium InputFile FR-DE_dict.txt
File read using MB, 68859 items read in 808 ms
File read using ~arrayIn(), 68859 items read in 183 ms

Large InputFile ab_raw.txt
File read using MB, 1135791 items read in 13093 ms
File read using ~arrayIn(), 1135791 items read in 39235 ms

Only a rough test on a single machine, not a scientific study :-)
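
For anyone who wants to repeat the comparison, here is a minimal sketch of the 
two read variants with timing. It is not the exact script behind the numbers 
above: the loop condition (~lines > 0), the line-feed separator appended to the 
MutableBuffer, and all variable names are assumptions made for illustration.

```rexx
/* Sketch only: times a lineIn loop against arrayIn on the same file. */
/* The file name comes from the command line; all names are made up.  */
parse arg fileName
fileName = fileName~strip
if fileName = "" then do
  say "usage: readtest fileName"
  exit 1
end

-- Variant 1: read line by line and collect the lines in a MutableBuffer
call time "R"                            -- reset the elapsed-time clock
inFile = .stream~new(fileName)
mb = .MutableBuffer~new
count = 0
do while inFile~lines > 0                -- assumed loop condition
  mb~append(inFile~lineIn || "0a"x)      -- assumed separator: line feed
  count = count + 1
end
inFile~close
ms = format(time("E") * 1000, , 0)       -- elapsed seconds -> whole milliseconds
say "File read using MB," count "items read in" ms "ms"

-- Variant 2: let the stream build the array in a single call
call time "R"
inFile = .stream~new(fileName)
lineArray = inFile~arrayIn
inFile~close
ms = format(time("E") * 1000, , 0)
say "File read using ~arrayIn()," lineArray~items "items read in" ms "ms"
```

The loop condition alone can dominate the first variant's run time, so results 
from this sketch will not necessarily match the figures above.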

PS: Reading char by char is what my C program does, so the charin method would 
come close to my values, but beating C will be difficult for simple file reads. 
That is also not the point: I need robust and predictable reading. I will 
compare with 4.2 when I have some more free time.
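
The charin approach mentioned in the quoted message below (read the whole file 
into one string, then call makeArray on it) could look roughly like this. It is 
a sketch under assumptions: the variable names are invented, and it relies on 
makeArray's default separator matching the line ends used in the file.

```rexx
/* Sketch only: read a whole file with charIn, then split it into lines. */
parse arg fileName
fileName = fileName~strip
if fileName = "" then do
  say "usage: charread fileName"
  exit 1
end

inFile = .stream~new(fileName)
content = inFile~charIn(1, inFile~chars)   -- whole file as a single string
inFile~close

lineArray = content~makeArray              -- split the string at line ends
say lineArray~items "items read"
```

The whole file is held in memory as a string while the array is built, which is 
the memory concern raised in the quoted reply.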

Hälsningar/Regards/Grüsse,
P.O. Jonsson
[email protected]




> On 06.07.2017 at 18:59, Erich Steinböck <[email protected]> wrote:
> 
> > For best performance, use stream~arrayIn(), which is by far preferable
> That was meant in comparison with DO WHILE qfileIn~lines <> 0 and 
> stream~lines("normal") or DO stream~lines() or SIGNAL ON NOTREADY
> 
> > using charin to read the entire file into a string followed by a makearray on 
> > the string offered the best performance
> Yes, that's still true, though if we are talking about conserving memory, it 
> might use more memory than arrayIn (untested). Also, as we were discussing 
> how to reduce read times from 50 minutes or 50 seconds down to 1 second 
> (arrayIn), the additional 0.5 seconds you can save with charIn probably 
> won't really matter so much anymore :-)
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! 
> http://sdm.link/slashdot
> _______________________________________________
> Oorexx-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/oorexx-devel
