On Saturday, September 16, 2017 at 6:18:17 AM UTC-7, Simon King wrote:
>
> Hi Thierry, 
>
> On 2017-09-16, Thierry <sage-goo...@lma.metelu.net> wrote: 
> > How does raw "exec" behave with your large file? 
> > 
> > sage: with open('your_file.txt') as f: 
> > ....:     exec(preparse(f.read())) 
>
> Time for my preparser that translates the gap readable into 
> Python readable data is about 3 minutes. The result is a string 
> s that I am using in the following. 
>
> Time for Sage's preparser: 
> sage: %time sp = preparse("D = "+s) 
> CPU times: user 4min 33s, sys: 1.08 s, total: 4min 34s 
> Wall time: 4min 34s 
>   
> The attempt to exec(sp) soon exhausted 15 GB, Ctrl-C didn't work, 
> and my laptop swapped so badly that I couldn't even open a new 
> terminal to do "pkill python". 
>
(If you're in a terminal, there is a chance that Ctrl-Z still works; it 
suspends the process. You can then "kill %1" or something like that.)

I've tried some benchmarks along the lines of

sage: %time s=preparse("s="+(dict((i,i) for i in range(100000))).__str__())
CPU times: user 1.13 s, sys: 6.23 ms, total: 1.14 s
Wall time: 1.13 s
sage: %time exec(s)
CPU times: user 773 ms, sys: 98.1 ms, total: 871 ms
Wall time: 864 ms

The main thing I noticed is that the time for both steps (preparsing and 
exec) scales roughly linearly with the length of the input, so the Python 
parser seems to be using an algorithm of the right order.
Note that "exec" first *compiles* the expression (i.e., it writes a 
straight-line program for the CPython virtual machine that assembles the 
data structure) and then executes it, so it needs at least twice the memory.
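
To look at the two stages separately outside Sage, here is a minimal sketch in plain Python. It assumes the built-in compile() as a stand-in for the parsing step (Sage's preparse is not involved, so the numbers are not comparable to the ones above), and the helper name time_stages is made up for illustration.

```python
import time

def time_stages(n):
    """Time compiling and executing a dict literal with n entries.

    Plain Python only: compile() stands in for the parsing step,
    Sage's preparse() is not used here.
    """
    source = "D = " + repr({i: i for i in range(n)})
    t0 = time.perf_counter()
    code = compile(source, "<literal>", "exec")  # parse and emit bytecode
    t1 = time.perf_counter()
    namespace = {}
    exec(code, namespace)                        # run the bytecode, build the dict
    t2 = time.perf_counter()
    return len(source), t1 - t0, t2 - t1, namespace["D"]

# Doubling the input size should roughly double both timings
# if the behaviour is linear.
for n in (10000, 20000, 40000):
    length, t_compile, t_exec, _ = time_stages(n)
    print(f"{length:>8} chars  compile {t_compile:.3f}s  exec {t_exec:.3f}s")
```

If the compile column grows much faster than the input length, the parser is the problem; if only the exec column does, the data construction is.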

Running the code doesn't seem to be the bottleneck, though:

sage: %time s=preparse("def L(): return "+(dict((i,i) for i in range(100000))).__str__())
CPU times: user 1.15 s, sys: 20.3 ms, total: 1.17 s
Wall time: 1.16 s
sage: %time exec(s)
CPU times: user 772 ms, sys: 80.7 ms, total: 853 ms
Wall time: 857 ms
sage: %time v=L()
CPU times: user 47.4 ms, sys: 5.4 ms, total: 52.8 ms
Wall time: 45.9 ms
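
The same separation can be sketched in plain Python, without preparse. This assumes nothing beyond the standard library; the variable names are only for illustration. Wrapping the literal in a function means that exec only compiles and defines L, while the dict itself is built when L is called.

```python
import time

n = 100000
source = "def L(): return " + repr({i: i for i in range(n)})

namespace = {}
t0 = time.perf_counter()
exec(source, namespace)      # compiles the source and defines L; no dict is built yet
t1 = time.perf_counter()
value = namespace["L"]()     # runs the already compiled body and builds the dict
t2 = time.perf_counter()

print(f"define L: {t1 - t0:.3f}s   call L: {t2 - t1:.3f}s")
```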

I don't know if the performance above indicates acceptable behaviour for 
your use case. If it doesn't, then it would seem that the Python parsing 
strategy is just not suitable for your application. If it does, then 
something else is slow. Perhaps running the code exposes some 
inefficient constructors in Sage?
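
One way to check for slow constructors would be to profile only the execution stage. The following is a rough sketch using the standard cProfile module; SlowElement is a hypothetical stand-in for an expensive constructor and is not a Sage class, and the data string is generated here rather than read from a real file.

```python
import cProfile
import io
import pstats

class SlowElement:
    """Hypothetical stand-in for an expensive constructor; not a Sage class."""
    def __init__(self, value):
        # Deliberately wasteful work so the constructor shows up in the profile.
        self.value = sum(range(200)) * 0 + value

# Build a source string whose values all go through the constructor.
source = "D = {" + ", ".join(f"{i}: SlowElement({i})" for i in range(2000)) + "}"
code = compile(source, "<data>", "exec")   # compile first, outside the profile

profiler = cProfile.Profile()
namespace = {"SlowElement": SlowElement}
profiler.enable()
exec(code, namespace)                      # only the execution stage is profiled
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

If a constructor dominates the cumulative time in such a report, the parser is not the place to look.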

-- 
You received this message because you are subscribed to the Google Groups 
"sage-devel" group.
