[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Comment #11 from eric dot weddington at atmel dot com 2010-01-29 17:07 ---
Setting Target Milestone.

--
eric dot weddington at atmel dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.5.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Comment #9 from hutchinsonandy at gcc dot gnu dot org 2009-12-13 21:03 ---
Subject: Bug 23726

Author: hutchinsonandy
Date: Sun Dec 13 21:03:41 2009
New Revision: 155195

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155195
Log:
PR target/23726
* config/avr/predicates.md (pseudo_register_operand): New predicate for pseudos.
* config/avr/avr.md (divmodqi4): Replace with define_insn_and_split to allow div/mod optimization.
(udivmodqi4): Ditto.
(divmodhi4): Ditto.
(udivmodhi4): Ditto.
(divmodsi4): Ditto.
(udivmodsi4): Ditto.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.md
    trunk/gcc/config/avr/predicates.md

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Comment #10 from hutchinsonandy at gcc dot gnu dot org 2009-12-13 21:05 ---
Fixed 4.5

--
hutchinsonandy at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Comment #8 from bjoern dot m dot haase at web dot de 2007-12-12 18:14 ---
Created an attachment (id=14738)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14738&action=view)
patch

My new analysis leads me to conclude that the key cause of this missed optimization is a target problem: presently we expand a divmod4 pattern that implements the support function calls directly. This pattern refers directly to hard registers. The resulting RTL is too complex for CSE, so the optimization is never identified.

The attached patch uses the same RTL for the support function calls. The difference is that this RTL is now generated only after combine, by a define_insn_and_split pattern. The insn part of this pattern is valid only for pseudos, so it can match only until the split. In order to ensure this, I have added a new predicate.

I have tested the patch against the atmega128 simulation target without regressions.

2007-12-11 Bjoern Haase [EMAIL PROTECTED]

PR target/23726
* config/avr/predicates.md (pseudo_register_operand): Add new predicate for pseudos.
* config/avr/avr.md (divmodqi4,udivmodqi4): Replace define_expand by define_insn_and_split, delay expansion of call patterns to split pass.
(divmodhi4,udivmodhi4,divmodsi4,udivmodsi4): Likewise.

--
bjoern dot m dot haase at web dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #9665 is|0                           |1
           obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
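
For readers without the original test case at hand, the kind of source this report concerns can be sketched as below. It is only an illustration under assumptions: the names divmod_result and divmod_once are hypothetical and do not come from the bug report or from GCC. The point is that the quotient and the remainder of the same operands are both requested, so a target whose division support routine yields both results could in principle serve the pair with a single call.

```c
#include <assert.h>

/* Hypothetical illustration type, not taken from the bug report. */
typedef struct {
    int quot;
    int rem;
} divmod_result;

/* Requests both the quotient and the remainder of the same operands.
   Ideally the compiler notices that "/" and "%" share their inputs and
   emits one divmod support call instead of two. */
static divmod_result divmod_once(int num, int den)
{
    divmod_result r;

    assert(den != 0);      /* division by zero is undefined in C */
    r.quot = num / den;    /* first use of the divmod operation */
    r.rem = num % den;     /* second use, same operands */
    return r;
}
```

Whether the two operations are actually merged depends on the compiler version and target; comparing the generated assembly before and after the patch is the way to check.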
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Comment #7 from pinskia at gcc dot gnu dot org 2005-11-02 17:16 ---
All P1 enhancements not targeted towards 4.1, moving to P5.

--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P5

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2005-10-25 20:32:02
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From bjoern dot m dot haase at web dot de 2005-09-06 07:52 ---
Do you know of any doc on what libcall block RTL is supposed to look like (other than, e.g., reading the code in optabs.c)?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From bjoern dot m dot haase at web dot de 2005-09-06 12:59 ---
I have done some code reading now and have come to the following conclusion:

For an expanded sequence that provides one single result, libcall blocks are an appropriate method for making sure that a single-set insn that carries a REG_EQUAL note is *not* deleted too early. Libcall notes, however, so far do not provide a method for dealing with library calls or expanded sequences that yield *two* results. E.g. they are no solution for either case: divmod4 on the one hand, and arithmetic expanders doing subreg lowering and also yielding a CC on the other.

In order to keep changes as small as possible, my suggestion is to change

static void rest_of_handle_jump2 (void)

such that it no longer calls delete_trivially_dead_insns () at its very beginning. (I'd have posted a patch if savannah.gnu.org were not down right now.)

If I understand correctly, trivially dead insns are primarily a performance issue, because they need memory and handling that would otherwise not be necessary, so my suggested change would not be too serious. delete_trivially_dead_insns () would still be among the first things called in the pass that immediately follows jump2: the first cse pass.

I'd appreciate comments on whether such trivially dead insns could prevent jump2 from realizing important optimization steps.

Bjoern.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From bjoern dot m dot haase at web dot de 2005-09-05 06:59 ---
IMO, one would not be able to handle the issue by changing CSE. E.g. I presently don't see how to avoid using register notes in the following situation.

Imagine a target that has no DImode operations, so that DImode arithmetic needs to be lowered to a sequence of SImode operations. Imagine further that after the sequence the condition code contains useful information that we want to re-use. E.g. a minus:DI operation would be expanded into two parallels like

  (parallel [
    (set (subreg:SI (reg:DI operand0) 0)
         (minus:SI (subreg:SI (reg:DI operand1) 0)
                   (subreg:SI (reg:DI operand2) 0)))
    (set (reg:CCmode CC) (generate borrow))])

  (parallel [
    (set (subreg:SI (reg:DI operand0) 4)
         (minus:SI (extract_borrow:SI (reg:CCmode CC))
                   (minus:SI (subreg:SI (reg:DI operand1) 4)
                             (subreg:SI (reg:DI operand2) 4))))
    (set (reg:CCmode CC) (generate condition code))])

In order to describe what information is written to the CC register in the second parallel, one needs to refer both to the input parameters of the parallel for the lower 4 bytes and to the input parameters for the higher 4 bytes. E.g. the information that CC contains the result of a compare of operand1 and operand2 could therefore not be expressed in the RTL!

One could, however, add a third instruction to the expanded sequence, reading

  (set (reg:CCmode CC) (reg:CCmode CC))
    - WITH ATTACHED REGISTER NOTE: is equal to compare:DI (operand1) (operand2)

where the REG_EQUAL note gives the required information.

IMO this is a general issue relevant for all targets that aim to do subreg lowering of arithmetic and logic operations at expand and that wish to recycle the condition codes generated. Of course 8-bit targets like AVR, which need to use subreg lowering for almost everything, will benefit most :-).

I agree that register notes are generally kind of ugly. But for this kind of information, I think that they could be useful.
Yours, Bjoern -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
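
The lowering described in this comment can be mimicked in plain C as a rough sketch. Everything here is an assumption made for illustration: sub64_result and sub64_by_halves are hypothetical names, and the code models unsigned arithmetic only, not the RTL GCC actually emits. It shows a 64-bit subtraction carried out as two 32-bit steps, and why the final borrow flag depends on inputs of both steps: it equals the result of comparing the full-width operands.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration type: the two result halves plus the
   borrow left in the "condition code" after the second step. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
    bool borrow;   /* true exactly when a < b as unsigned 64-bit values */
} sub64_result;

/* Subtracts b from a using only 32-bit operations, in two steps that
   correspond to the two parallels of the lowered sequence. */
static sub64_result sub64_by_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
    sub64_result r;

    /* Step 1: low halves; the flag records whether a borrow occurred. */
    r.lo = a_lo - b_lo;
    bool borrow_lo = a_lo < b_lo;

    /* Step 2: high halves minus the incoming borrow. The outgoing
       borrow depends on the high inputs and on borrow_lo, i.e.
       indirectly on the low inputs as well. */
    r.hi = a_hi - b_hi - (uint32_t)borrow_lo;
    r.borrow = (a_hi < b_hi) || (a_hi == b_hi && borrow_lo);

    return r;
}
```

The last flag is equivalent to the single full-width comparison a < b, which is the fact that the proposed REG_EQUAL note would record for the optimizers.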
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-09-05 14:55 ---
You most likely want to use libcall blocks instead of the regnotes here.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From bjoern dot m dot haase at web dot de 2005-09-04 22:02 ---
Created an attachment (id=9665)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9665&action=view)
Patch adding REG_EQUAL notes to the divmod4 expanders

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726
[Bug rtl-optimization/23726] Missed optimizations for divmod
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-09-04 22:33 ---
Or even better add support for multiple sets in CSE and forget about adding notes. That will improve more targets than just AVR.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
  GCC build triplet|unknown-x86_64-linux        |
   GCC host triplet|unknown-x86_64-linux        |
           Keywords|                            |missed-optimization

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23726