Re: FRE may run out of memory
Richard Biener-2 wrote:
> On Fri, Feb 14, 2014 at 3:50 AM, dxq <ziyan01@> wrote:
>
> What compiler version did you check? I think that 4.8 has improvements for 1. and 2. (SMS is unmaintained). Note that we only spent time to make -O1 behave sanely with extremely large functions.
>
> Finally I'd suggest you open a bugreport and attach a testcase to it that exposes the issues you list.
>
> Richard.

Hi Richard,

We are working on gcc-4.7.1. It seems that these issues have been fixed in gcc-4.8.3.

BTW, we noticed that an obstack is used in LIM for memory management, which is a really nice approach. We also used an obstack to solve the GGC problem in our SMS-UNROLL framework: we put all the backup data into the obstack, so it is safe now.
http://gcc.1065356.n5.nabble.com/A-GGC-related-question-td988400.html

Thanks!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/FRE-may-run-out-of-memory-tp1009578p1012489.html
Sent from the gcc - patches mailing list archive at Nabble.com.
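
[Editor's sketch] The obstack approach mentioned in the message above can be illustrated roughly as follows. This is a minimal sketch, assuming glibc's <obstack.h> rather than GCC's internal copy; struct backup_insn, backup_chain and backup_roundtrip are hypothetical names made up for illustration and are not GCC types or functions. It only shows the allocation pattern: every backup node is carved out of one obstack, and the whole backup is dropped with a single call.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

/* obstack.h expects the client to name its chunk allocator.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical stand-in for one backed-up insn; not a GCC type.  */
struct backup_insn
{
  int uid;
  struct backup_insn *next;
};

/* Copy N uids into a linked list whose nodes all live in OB.  */
static struct backup_insn *
backup_chain (struct obstack *ob, const int *uids, int n)
{
  struct backup_insn *head = NULL;

  assert (n >= 0);
  for (int i = n - 1; i >= 0; i--)
    {
      struct backup_insn *node = obstack_alloc (ob, sizeof *node);
      node->uid = uids[i];
      node->next = head;
      head = node;
    }
  return head;
}

/* Build a backup, sum its uids, then drop the whole backup at once.  */
static int
backup_roundtrip (const int *uids, int n)
{
  struct obstack ob;
  int sum = 0;

  obstack_init (&ob);
  for (struct backup_insn *p = backup_chain (&ob, uids, n); p; p = p->next)
    sum += p->uid;
  obstack_free (&ob, NULL);   /* Releases every node in one call.  */
  return sum;
}
```

Memory obtained this way comes from malloc, so a garbage collector that only manages its own heap has nothing to reclaim here; the cost is that the owner must release the obstack explicitly.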
Re: FRE may run out of memory
Richard Biener-2 wrote:
> On Sat, Feb 8, 2014 at 8:29 AM, dxq <ziyan01@> wrote:
>> Hi all,
>>
>> We found that gcc would run out of memory on Windows when compiling a *big* function (10 lines). More investigation shows that gcc crashes in the function *compute_avail*, in the tree-fre pass. *compute_avail* collects information from basic blocks, so memory is allocated to record that information. However, if there is a huge number of basic blocks, the memory is exhausted and gcc crashes, especially on Windows PCs, which generally have only 2G or 4G of memory. It's OK on Linux, where *compute_avail* allocates *2.4G* of memory.
>>
>> I guess some optimization passes in gcc, like FRE, didn't consider the extreme case.
>
> This was fixed for GCC 4.8, FRE no longer uses compute_avail (but PRE still does). Basically GCC 4.8 should (at -O1) compile most extreme cases just fine.
>
> Richard.

Hi Richard,

More investigation shows that:

1. Loop-related passes take more compile time and memory, especially pass_rtl_move_loop_invariants and lim; at least lim on trees has a large impact on the passes that follow.
2. IRA takes more than 20 GB of memory in the function *create_loop_tree_nodes*, because IRA chooses the 'mixed' or 'all' region setting based on the optimization level.
3. The SMS pass always creates DDGs for all loops in the compiled function, then does SMS optimization for all loops, and finally frees the DDGs. If there is a huge number of loops, SMS may crash while creating the DDGs because it runs out of memory.

Could someone confirm the memory pressure problems in the passes above?

Thanks for your reply!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/FRE-may-run-out-of-memory-tp1009578p1011035.html
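
[Editor's sketch] Point 3 above describes an allocation pattern whose peak memory grows with the number of loops. The toy program below contrasts it with building and freeing one graph at a time. Everything here is hypothetical (struct toy_ddg and the toy_* and peak_* functions are made up, not GCC code), and the thread does not establish whether SMS could actually be restructured this way; the sketch only shows why the peak differs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-loop dependence graph.  */
struct toy_ddg { int loop_id; };

static int live_ddgs;   /* Graphs currently allocated.  */
static int peak_ddgs;   /* High-water mark of live_ddgs.  */

static struct toy_ddg *
toy_create_ddg (int loop_id)
{
  struct toy_ddg *g = malloc (sizeof *g);
  assert (g != NULL);
  g->loop_id = loop_id;
  if (++live_ddgs > peak_ddgs)
    peak_ddgs = live_ddgs;
  return g;
}

static void
toy_free_ddg (struct toy_ddg *g)
{
  free (g);
  live_ddgs--;
}

/* Create every graph, then process, then free: the pattern described.  */
static int
peak_all_at_once (int n_loops)
{
  assert (n_loops > 0);
  struct toy_ddg **all = malloc (n_loops * sizeof *all);
  assert (all != NULL);

  live_ddgs = peak_ddgs = 0;
  for (int i = 0; i < n_loops; i++)
    all[i] = toy_create_ddg (i);
  /* ... the optimization would run over all graphs here ...  */
  for (int i = 0; i < n_loops; i++)
    toy_free_ddg (all[i]);
  free (all);
  return peak_ddgs;
}

/* Create, process and free one graph per loop.  */
static int
peak_one_at_a_time (int n_loops)
{
  live_ddgs = peak_ddgs = 0;
  for (int i = 0; i < n_loops; i++)
    {
      struct toy_ddg *g = toy_create_ddg (i);
      /* ... the optimization would run on this one graph here ...  */
      toy_free_ddg (g);
    }
  return peak_ddgs;
}
```

With 1000 loops the first variant peaks at 1000 live graphs and the second at 1.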
FRE may run out of memory
Hi all,

We found that gcc would run out of memory on Windows when compiling a *big* function (10 lines). More investigation shows that gcc crashes in the function *compute_avail*, in the tree-fre pass. *compute_avail* collects information from basic blocks, so memory is allocated to record that information. However, if there is a huge number of basic blocks, the memory is exhausted and gcc crashes, especially on Windows PCs, which generally have only 2G or 4G of memory. It's OK on Linux, where *compute_avail* allocates *2.4G* of memory.

I guess some optimization passes in gcc, like FRE, didn't consider the extreme case. When the tree-fre pass is disabled, gcc crashes in the IRA pass instead. I will investigate that further.

Any suggestions?

Thanks!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/FRE-may-run-out-of-memory-tp1009578.html
memory leak in reorg_loops
Hi,

In hw-doloop.c, is there a memory leak?

    void
    reorg_loops (bool do_reorder, struct hw_doloop_hooks *hooks)
    {
      hwloop_info loops = NULL;
      hwloop_info loop;
      bitmap_obstack loop_stack;

      df_live_add_problem ();
      df_live_set_all_dirty ();
      df_analyze ();

      bitmap_obstack_initialize (&loop_stack);

      if (dump_file)
        fprintf (dump_file, ";; Find loops, first pass\n\n");

      loops = discover_loops (&loop_stack, hooks);

      if (do_reorder)
        {
          reorder_loops (loops);
          free_loops (loops);

          if (dump_file)
            fprintf (dump_file, ";; Find loops, second pass\n\n");

          loops = discover_loops (&loop_stack, hooks);
        }

      for (loop = loops; loop; loop = loop->next)
        scan_loop (loop);

      /* Now apply the optimizations.  */
      for (loop = loops; loop; loop = loop->next)
        optimize_loop (loop, hooks);

      if (dump_file)
        {
          fprintf (dump_file, ";; After hardware loops optimization:\n\n");
          dump_hwloops (loops);
        }

      free_loops (loops);

      if (dump_file)
        print_rtl (dump_file, get_insns ());
    }

Checking with valgrind shows:

    ==18622== 1,479,296 bytes in 364 blocks are definitely lost in loss record 559 of 559
    ==18622==    at 0x4006ADD: malloc (vg_replace_malloc.c:291)
    ==18622==    by 0x8C0A9D5: xmalloc (xmalloc.c:147)
    ==18622==    by 0x910457: _obstack_begin (in /lib/libc-2.5.so)
    ==18622==    by 0x81EDD24: bitmap_obstack_initialize (bitmap.c:318)
    ==18622==    by 0x8B22BBE: reorg_loops (hw-doloop.c:635)
    ... ...
    ==18622==    by 0x8688B3E: rest_of_handle_machine_reorg (reorg.c:4183)
    ==18622==    by 0x861D2A6: execute_one_pass (passes.c:2097)
    ==18622==    by 0x861D6A0: execute_pass_list (passes.c:2152)
    ==18622==    by 0x861D6BC: execute_pass_list (passes.c:2153)
    ==18622==    by 0x861D6BC: execute_pass_list (passes.c:2153)

Should loop_stack be released at the end of reorg_loops? Please confirm. If so, please commit the following change for me:

       free_loops (loops);
    +  bitmap_obstack_release (&loop_stack);

       if (dump_file)
         print_rtl (dump_file, get_insns ());

Thanks!
dxq

--
View this message in context: http://gcc.1065356.n5.nabble.com/memory-leak-in-reorg-loops-tp999219.html
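
[Editor's sketch] The leak reported above comes down to an initialize call with no matching release. The toy below mimics only that shape. struct toy_obstack, the toy_obstack_* functions and toy_pass are hypothetical stand-ins, not GCC's bitmap_obstack API, and the "chunk" is just a counter so that the example itself does not leak.

```c
#include <assert.h>

static int live_chunks;   /* Chunks handed out and not yet released.  */

/* Hypothetical stand-in for bitmap_obstack; not the GCC type.  */
struct toy_obstack { int has_chunk; };

static void
toy_obstack_initialize (struct toy_obstack *ob)
{
  ob->has_chunk = 1;
  live_chunks++;
}

static void
toy_obstack_release (struct toy_obstack *ob)
{
  assert (ob->has_chunk);
  ob->has_chunk = 0;
  live_chunks--;
}

/* Mimics the shape of reorg_loops: a local obstack is set up on entry.
   Returns the number of chunks still live when the pass exits.  */
static int
toy_pass (int release_at_end)
{
  struct toy_obstack loop_stack;

  toy_obstack_initialize (&loop_stack);
  /* ... discover, scan and optimize loops here ...  */
  if (release_at_end)
    toy_obstack_release (&loop_stack);
  return live_chunks;
}
```

Each call without the release leaves one more chunk behind, which matches the way the valgrind report counts many lost blocks: one per function compiled.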
questions about COND_EXEC and SMS
Hi all,

* We found that COND_EXEC is better than IF_THEN_ELSE for expressing conditional move insns, because in the scheduler an IF_THEN_ELSE insn has a dependence on itself and a COND_EXEC insn does not.
* Besides, IF_THEN_ELSE is not good for SMS. Some backends (frv) expand a conditional move as an IF_THEN_ELSE pattern before reload and split IF_THEN_ELSE into COND_EXEC in the split2 pass, which is a post-reload pass. However, in the SMS pass (pre-reload), we can't get accurate information about IF_THEN_ELSE insns.
* However, as far as I know, COND_EXEC is not supported in pre-reload passes.

So I'm asking for suggestions on supporting COND_EXEC in pre-reload passes; for me, support from split1 to the reload pass may be good enough. What work needs to be done to support that, and has anyone already done any work on it?

Thanks!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/questions-about-COND-EXEC-and-SMS-tp992591.html
A GGC related question
Hi,

I'm working on making the unroll, doloop, and SMS passes work together in the following way:

* Before the first unroll pass, duplicate all global information, such as the insn chain and the CFG, as a backup.
* Unroll with factor = 1, go on to finish SMS, and record the results: swp, ii, loop count, etc.
* Go back to the unroll pass, discard all the global information, use the backup to rerun up to SMS with unroll factor = 2, and record the swp result.
* Repeat the above steps with unroll factor = 4.
* Now decide which factor is the best one, and rerun with it. Done.

We have implemented this, and it works well. But if the compiled file is too big, gcc crashes with an ICE. We found that the copying consumes so much memory that GGC purges the backup.

We have tried the following:

* Adjusting ggc_min_heap and ggc_min_expand, and disabling GGC collection while doing SMS unroll, but this cannot be the optimal way.
* Putting the backup in a pool, but it doesn't work. Is a pool OK for GTY data structures like rtx, basic_block, and edge? If not, why? Is there any way to keep the backup insn chain from being touched by GGC?
* Forcing a GGC collection at the end of tree_rest_of_compilation, but if inlining happens, it crashes. If the functions are independent, is it safe to force a GGC collection after tree_rest_of_compilation?

Brs, Thanks!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/A-GGC-related-question-tp988400.html
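
[Editor's sketch] The backup-and-retry scheme described in the message above can be outlined as below. All names are hypothetical (struct toy_state, struct trial_result, choose_unroll_factor, toy_trial); nothing here is GCC code. The message does not say how "best" is decided, so this sketch assumes the lowest II per original iteration (ii divided by the unroll factor); the II values returned by toy_trial are made-up numbers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the function's global state (insn chain, CFG).  */
struct toy_state { int insns[8]; int n_insns; };

/* Result of one trial; II is the initiation interval reported by SMS.  */
struct trial_result { int ok; int ii; };

typedef struct trial_result (*trial_fn) (struct toy_state *state, int factor);

/* Try each factor on a fresh copy of BACKUP and return the factor with the
   lowest II per original iteration, or 0 if no trial succeeded.  */
static int
choose_unroll_factor (const struct toy_state *backup, trial_fn run_trial,
                      const int *factors, int n_factors)
{
  int best = 0;
  int best_ii = 0;   /* II of the best trial so far.  */

  for (int i = 0; i < n_factors; i++)
    {
      struct toy_state work;
      memcpy (&work, backup, sizeof work);   /* Restore from the backup.  */
      struct trial_result r = run_trial (&work, factors[i]);
      if (!r.ok)
        continue;
      /* Compare ii/factor without division: a/b < c/d iff a*d < c*b.  */
      if (best == 0 || r.ii * best < best_ii * factors[i])
        {
          best = factors[i];
          best_ii = r.ii;
        }
    }
  return best;
}

/* Toy trial with made-up II values: 6, 10 and 24 for factors 1, 2 and 4.  */
static struct trial_result
toy_trial (struct toy_state *state, int factor)
{
  static const int ii_for[] = { 0, 6, 10, 0, 24 };
  struct trial_result r = { 0, 0 };

  if (factor < 1 || factor > 4 || ii_for[factor] == 0)
    return r;
  state->n_insns *= factor;   /* The trial scribbles on its working copy.  */
  r.ok = 1;
  r.ii = ii_for[factor];
  return r;
}
```

With these numbers the per-iteration costs are 6, 5 and 6, so factor 2 is chosen; the backup itself is never modified because every trial works on a copy.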
Re: A GGC related question
Fixing SMS, do you mean that we only modify the SMS pass? If so, these are the problems we have to solve:

* How do we make unroll and SMS work together? We could call the unroll pass from within SMS, but that would need more passes, such as web, and ideally all the passes between unroll and SMS would be rerun.
* The unroll and web passes exist in gcc; however, gcc's passes only work on a compilation unit or function, rather than on the smaller unit we want, the loop. That's why we copy all global information and rerun the passes between unroll and SMS several times.
* If we need to try more unroll factors, copying is also needed. The backup exists within a single pass, so it would not be purged by GGC. But if memory consumption is huge, is there any risk for the other passes?

From my experience, when GGC is disabled between unroll and SMS, with ggc_min_expand = 100 and ggc_min_heap = 20480, gcc crashes when compiling a big file.

That's what I can think of. As you know, it's a very big and hard piece of work. Do you have any suggestions about our current solution?

Thanks for your reply!
danxiaoqiang

--
View this message in context: http://gcc.1065356.n5.nabble.com/A-GGC-related-question-tp988400p988645.html