Hälsningar/Regards/Grüsse,
P.O. Jonsson
oor...@jonases.se

Hello again Erich,

I know executing other people's code can be a p.i.t.a. but please give it a try. 
See it as a golden opportunity to stress test ooRexx :-)

> On 28.06.2017 at 21:21, Erich Steinböck <erich.steinbo...@gmail.com> wrote:
> 
> Please download the complete test set and let it run and
> I neither have a Mac nor do I have 50 GB of memory

I can share my machine over remote logon if that would help or we can try to 
look at it using a shared screen. You do not need much memory to run the 
program, 5 GB is more than sufficient for ONE instance of the program, and that 
is enough to simulate the problem.

> 
> I had a REPRODUCIBLE scenario where this problem occurs
> Out of 1200 or so runs it was only this single run that produced memory 
> bloating 
> see if you can reproduce the memory problem
> Can you explain the problem in more detail? What exactly happens when you run 
> which command with what arguments? What are you expecting to happen instead 
> and why?

The problem is that the program tr.rex gets stuck in the routine split_data (in 
the main loop, which is where it is when I break it) or in sort_data 
(presumably on the ~StableSort call), taking 1000 times longer for certain 
intermediate data (read below) than for others. These runs do not involve so 
much more data than other runs that I would expect this memory load. While the 
program is in one of these routines, the memory allocated to the rexx process 
goes up and up until no memory is left (and the machine starts to swap). At 
the beginning the memory allocated to the rexx process is negligible, so you 
can try it on a machine with any amount of memory.
 
> 
> it finished in 7 hours 1200 individual ooRexx processes
> What does "1200 individual ooRexx processes" mean? Are you starting your 
> program with 1200 different sets of arguments? Sequentially or in parallel? 
> Which one of the programs shows the issue? Is it always the same one?

In order to use all cores/threads on my machine I use a bash shell script to 
launch/spawn up to at most 24 instances of the same program in parallel, 
running on the same data but with different parameters, producing different 
intermediate data files (_RAW files) that are read and processed in Split_data 
and handed over to Sort_data. When one chunk of data is processed that process 
finishes (tr.rex exits) and another one is started to do the same over and over 
again up to around 1200 individual runs for one batch. There is only one rexx 
program and the problem only arises for specific parameters in combination with 
specific input data. I have provided you with two examples, one that runs like a 
charm and another one that never finishes.

> why is the interpreter not warning me when I overwrite an object with a 
> string?
> You're not overwriting an object with a string, you're changing a variable 
> from referring to one object to referring to another one.  That's totally 
> normal .. similar to coding a = 1; a = 2;

I don't think this is normal but never mind, I never liked objects anyway :-) 
When I started using Rexx the credo was  „Everything is a string“. And I am 
still in the habit of programming like that, hence the code you see before you.
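
To make the quoted point about variables and objects concrete, here is a 
minimal sketch. The variable names are made up for illustration and are not 
taken from tr.rex. It shows that an assignment only changes what a variable 
refers to, while ~append changes the MutableBuffer object itself:

```rexx
/* Illustrative sketch only; names are hypothetical, not from tr.rex */
a = .MutableBuffer~new('kept')  -- a refers to a MutableBuffer object
b = a                           -- b refers to the very same object

a = 'just a string'             -- a now refers to a String instead;
                                -- the buffer itself is untouched
say b~string                    -- displays: kept

b~append(' and extended')       -- modifies the buffer in place
say b~string                    -- displays: kept and extended
```

Nothing is destroyed by the assignment; the buffer lives on for as long as 
some variable (here b) still refers to it.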

In the past (4.1, 4.2?), if I did a say myMutableBuffer it reported „A Mutable 
Buffer“ or something; nowadays I get the value stored in the MB. Is there a way 
to check what kind of object you are referring to, something like a 
~whatAreYou method? That would be useful when you look for mistakes in your 
code (I occasionally write imperfect code, unfortunately).
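
For reference, every ooRexx object responds to ~class, and ~isA tests 
membership in a class. A minimal sketch of such a check follows, with made-up 
variable names; it is one possible way to catch a variable that no longer 
refers to a MutableBuffer, not a description of how tr.rex works:

```rexx
/* Illustrative sketch only; names are hypothetical */
mb = .MutableBuffer~new('abc')
s  = 'abc'

say mb~class~id                 -- displays: MutableBuffer
say s~class~id                  -- displays: String

say mb~isA(.MutableBuffer)      -- displays: 1
say s~isA(.MutableBuffer)       -- displays: 0

-- a guard that can be placed before code that relies on ~append
if \s~isA(.MutableBuffer) then
  say 's is a' s~class~id', not a MutableBuffer'
```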

> 
> On Tue, Jun 27, 2017 at 10:26 PM, P.O. Jonsson <oor...@jonases.se> wrote:
> "maybe it is just bad programming"
> 
> I guess I had it coming…
> 
> Thanks Erich for your advice, I will consider it all, but my intention with 
> this report was another one; for the first time I had a REPRODUCIBLE scenario 
> where this problem occurs. Out of 1200 or so runs it was only this single run 
> that produced memory bloating so my assumption was that it was not ONLY :-) 
> bad programming.
> 
> Please download the complete test set and let it run and see if you can 
> reproduce the memory problem I have. If so it is easy for you to just improve 
> the code and see where the problem goes away. I have a feeling I am stuck at 
> 
> a = a~StableSort
> 
> for quite some time, maybe because of unfavorable data. But I can't tell for 
> sure.
> 
> PS I had the program run again overnight; it finished 1200 individual 
> ooRexx processes in 7 hours with no problem. In another run I am now at 53 
> GB in a single process running at 100% CPU for 10 hours.
> 
> Question on Mutable Buffers (there is a lot of *NEW* there): I understand I 
> need to ~append or ~insert for the MB but why is the interpreter not warning 
> me when I overwrite an object with a string? Why is that not an error? Is 
> there a reason why it should be allowed to destroy an object like I did?
> 
> Hälsningar/Regards/Grüsse,
> P.O. Jonsson
> oor...@jonases.se
> 
> 
> 
> 
>> On 27.06.2017 at 17:15, Erich Steinböck <erich.steinbo...@gmail.com> wrote:
>> 
>> maybe it is just bad programming
>> Hi P.O.,
>> I had a look at Split_data and as far as I can see there are a lot of things 
>> which can be improved.
>> 
>> 1)
>> 
>> You may want to re-read how to work with a MutableBuffer.  E. g.
>> 
>>   tempMB          = .mutablebuffer~new('')
>>   do while ..
>>     tempMB = qfileIn~linein
>> 
>> Initializing a variable with a MutableBuffer instance, and afterwards 
>> assigning it a String (linein() returns a String) doesn't make sense.
>> 
>> I can see quite a few instances of this, e. g.
>> 
>>   TranslatedMB    = .mutablebuffer~new('')
>>   do while ..
>>     DO i=1 TO i_End
>>       DO j=1 TO j_End
>> 
>>           TranslatedMB = TranslatedMB TranslateWordMB
>> 
>> Again, the final TranslatedMB assignment is not what the ..MB ending of the 
>> variable names suggests.
>> 
>> 2) 
>> 
>> You might move invariant stuff (here: LeftWordsMB~Word(i) || '-') out of 
>> the inner loop, e.g.
>> 
>>       DO j=1 TO j_End
>>         TranslateWordMB = LeftWordsMB~Word(i) || '-' || RightWordsMB~Word(j)
>> 
>> 
>> 3)
>> 
>> Consider using a single startsWith() instead of the code between lines 
>> 448 and 485
>> 
>> 4)
>> 
>>         IF TranslatedMB~WordPos(TranslateWordMB) > 0 THEN
>>         ..
>>         ELSE
>>         DO
>>           TranslatedMB = TranslatedMB TranslateWordMB
>> 
>> Instead of building a long string of all things seen before, and checking 
>> with wordPos(), you might instead put all things seen into a Set and check 
>> with hasIndex()
>> 
>> 5)
>> 
>> Generally, using Arrays may be more efficient if you can save the Stem.0 
>> handling
>> But then, using the proper type of Collection and appropriate algorithm may 
>> help much more
>> To give suggestions for that, I'd need more detail on what exactly you 
>> would like to achieve 
>> 
>> On Tue, Jun 27, 2017 at 7:55 AM, P.O. Jonsson <oor...@jonases.se> wrote:
>> Dear developers,
>> 
>> I have had the memory bloating problem again, this time I reached 48 GB (the 
>> maximum for one CPU in my machine) and the process only ended after some 13 
>> CPU hours with 100% CPU the whole time.
>> 
>> 
>> 
>> 
>> From the logging info I could confirm that the program was stuck somewhere 
>> here most of the time. Here are the rough steps:
>> 
>> Language pairs detected in C routine -> External call, no memory bloating
>> Data processing finished after 2107 Seconds 00:58:12
>> Splitting finished after 49487 Seconds 14:42:59      -> Routine Split_data
>> Sorting finished after 16527 Seconds 19:18:27        -> Routine Sort_data
>> Processing of Data file finished after 68123 Seconds
>> Writing the Logfile TR_DE-EN-eu_logfile.txt 26 Jun 2017 19:18:28
>> 
>> I have enclosed the Routines in question.
>> 
>> In my dropbox I have stored the complete program with some test data to 
>> replicate the processing, the problem is reproducible. Just put the folder 
>> somewhere, move there and perform the command indicated.
>> 
>> https://www.dropbox.com/sh/vettlcb4f8ae3cw/AACWIQivo_F2KhhytJ6izkbFa?dl=0 
>> 
>> I run Open Object Rexx Version 5.0.0, Build date: May 20 2017, Addressing 
>> mode: 64
>> Hardware Mac Pro with dual-CPU Xeon Processors running Mac OS Sierra 10.12.5
>> 
>> PS as I was making the screenshot the process finished nicely, no crash or 
>> anything and the memory was released. So maybe it is just bad programming, 
>> but at least you can confirm that then :-)
>> 
>> 
>> 
>> 
>> Hälsningar/Regards/Grüsse,
>> P.O. Jonsson
>> oor...@jonases.se
>> 
>> 
>> 
>> 
>> 
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org <http://slashdot.org/>! 
>> http://sdm.link/slashdot <http://sdm.link/slashdot>
>> _______________________________________________
>> Oorexx-devel mailing list
>> Oorexx-devel@lists.sourceforge.net 
>> <mailto:Oorexx-devel@lists.sourceforge.net>
>> https://lists.sourceforge.net/lists/listinfo/oorexx-devel 
>> <https://lists.sourceforge.net/lists/listinfo/oorexx-devel>
