Ping...
thanks, Cong On Fri, Sep 20, 2013 at 9:49 AM, Cong Hou <co...@google.com> wrote: > Any comment or more suggestions on this patch? > > > thanks, > Cong > > On Mon, Sep 9, 2013 at 7:28 PM, Cong Hou <co...@google.com> wrote: >> On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li <davi...@google.com> wrote: >>> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou <co...@google.com> wrote: >>>> First, thank you for your detailed comments again! Then I deeply >>>> apologize for not explaining my patch properly and responding to your >>>> previous comment. I didn't understand thoroughly the problem before >>>> submitting the patch. >>>> >>>> Previously I only considered the following three conversions for sqrt(): >>>> >>>> >>>> 1: (float) sqrt ((double) float_val) -> sqrtf (float_val) >>>> 2: (float) sqrtl ((long double) float_val) -> sqrtf (float_val) >>>> 3: (double) sqrtl ((long double) double_val) -> sqrt (double_val) >>>> >>>> >>>> We have four types here: >>>> >>>> TYPE is the type to which the result of the function call is converted. >>>> ITYPE is the type of the math call expression. >>>> TREE_TYPE(arg0) is the type of the function argument (before type >>>> conversion). >>>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision. >>>> It will be the type of the new math call expression after conversion. >>>> >>>> For all three cases above, TYPE is always the same as NEWTYPE. That is >>>> why I only considered TYPE during the precision comparison. ITYPE can >>>> only be double_type_node or long_double_type_node depending on the >>>> type of the math function. That is why I explicitly used those two >>>> types instead of ITYPE (no correctness issue). But you are right, >>>> ITYPE is more elegant and better here. >>>> >>>> After further analysis, I found I missed two more cases. Note that we >>>> have the following conditions according to the code in convert.c: >>>> >>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE) >>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0)) >>>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE) >>>> >>>> the last condition comes from the fact that we only consider >>>> converting a math function call into another one with narrower type. >>>> Therefore we have >>>> >>>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE) >>>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE) >>>> >>>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for >>>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with >>>> four possible combinations. Therefore we have two more conversions to >>>> consider besides the three ones I mentioned above: >>>> >>>> >>>> 4: (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) >>>> 5: (double) sqrtl ((long double) float_val) -> sqrt ((double) float_val) >>>> >>>> >>>> For the first conversion here, TYPE (float) is different from NEWTYPE >>>> (double), and my previous patch doesn't handle this case.The correct >>>> way is to compare precisions of ITYPE and NEWTYPE now. >>>> >>>> To sum up, we are converting the expression >>>> >>>> (TYPE) sqrtITYPE ((ITYPE) expr) >>>> >>>> to >>>> >>>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr) >>>> >>>> and we require >>>> >>>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2 >>>> >>>> to make it a safe conversion. >>>> >>>> >>>> The new patch is pasted below. >>>> >>>> I appreciate your detailed comments and analysis, and next time when I >>>> submit a patch I will be more carefully about the reviewer's comment. >>>> >>>> >>>> Thank you! >>>> >>>> Cong >>>> >>>> >>>> >>>> Index: gcc/convert.c >>>> =================================================================== >>>> --- gcc/convert.c (revision 201891) >>>> +++ gcc/convert.c (working copy) >>>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr) >>>> CASE_MATHFN (COS) >>>> CASE_MATHFN (ERF) >>>> CASE_MATHFN (ERFC) >>>> - CASE_MATHFN (FABS) >>>> CASE_MATHFN (LOG) >>>> CASE_MATHFN (LOG10) >>>> CASE_MATHFN (LOG2) >>>> CASE_MATHFN (LOG1P) >>>> - CASE_MATHFN (LOGB) >>>> CASE_MATHFN (SIN) >>>> - CASE_MATHFN (SQRT) >>>> CASE_MATHFN (TAN) >>>> CASE_MATHFN (TANH) >>>> + /* The above functions are not safe to do this conversion. */ >>>> + if (!flag_unsafe_math_optimizations) >>>> + break; >>>> + CASE_MATHFN (SQRT) >>>> + CASE_MATHFN (FABS) >>>> + CASE_MATHFN (LOGB) >>>> #undef CASE_MATHFN >>>> { >>>> tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0)); >>>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr) >>>> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) >>>> newtype = TREE_TYPE (arg0); >>>> >>>> + /* We consider to convert >>>> + >>>> + (T1) sqrtT2 ((T2) exprT3) >>>> + to >>>> + (T1) sqrtT4 ((T4) exprT3) >>> >>> Should this be >>> >>> (T4) sqrtT4 ((T4) exprT3) >> >> T4 is not necessarily the same as T1. For the conversion: >> >> (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) >> >> T4 is double and T1 is float. >> >> >>>> + >>>> + , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0), >>>> + and T4 is NEWTYPE. >>> >>> NEWTYPE is also the wider one between T1 and T3. >> >> Right. Actually this is easy to catch from the context just before >> this comment. >> >> tree newtype = type; >> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) >> newtype = TREE_TYPE (arg0); >> >> >> >> thanks, >> Cong >> >> >>> >>> David >>> >>>> All those types are of floating point types. >>>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion >>>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of >>>> + T2 and T4. See the following URL for a reference: >>>> + >>>> http://stackoverflow.com/questions/9235456/determining-floating-point-square-root >>>> + */ >>>> + if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL) >>>> + { >>>> + int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p; >>>> + int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p; >>>> + if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations) >>>> + break; >>>> + } >>>> + >>>> /* Be careful about integer to fp conversions. >>>> These may overflow still. */ >>>> if (FLOAT_TYPE_P (TREE_TYPE (arg0)) >>>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c >>>> =================================================================== >>>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891) >>>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy) >>>> @@ -44,11 +44,11 @@ __attribute__ ((noinline)) >>>> double >>>> sin(double a) >>>> { >>>> - abort (); >>>> + return a; >>>> } >>>> __attribute__ ((noinline)) >>>> float >>>> sinf(float a) >>>> { >>>> - return a; >>>> + abort (); >>>> } >>>> >>>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers <jos...@codesourcery.com> >>>> wrote: >>>>> On Wed, 4 Sep 2013, Xinliang David Li wrote: >>>>> >>>>>> > This patch submission still fails to pay attention to various of my >>>>>> > comments. >>>>>> > >>>>>> >>>>>> If you can provide inlined comments in the patch, that will be more >>>>>> useful and productive. >>>>> >>>>> I have explained things several times in this thread already. I see no >>>>> point in repeating things when what I would say has already been said and >>>>> ignored. As far as I can tell, this latest patch submission has taken one >>>>> line from the message it is in response to, and largely ignored the >>>>> following two paragraphs (including where I explicitly say that said line >>>>> should not appear literally in the source code at all). But, repeating >>>>> what I said before yet again: >>>>> >>>>> (but you should be referring to the relevant types >>>>> >>>>> The patch does not do properly that. It refers explicitly to >>>>> long_double_type_node and double_type_node. >>>>> >>>>> - "type", the type >>>>> being converted to, "itype", the type of the function being called in >>>>> the >>>>> source code, "TREE_TYPE (arg0)", the type of the argument after >>>>> extensions >>>>> have been removed, and "newtype", computed from those >>>>> >>>>> The patch only engages with "type". I suspect "newtype" is what it really >>>>> means there when using "type". When it uses long_double_type_node and >>>>> double_type_node, those should be "itype". >>>>> >>>>> - so you should have >>>>> expressions like the above with two or more of those four types, but not >>>>> with long_double_type_node directly). >>>>> >>>>> See above. The patch uses long_double_type_node directly, contrary to >>>>> what I explicitly said. You are free to disagree with something I say in >>>>> a review - but in that case you need to reply specifically to the review >>>>> comment explaining your rationale for disagreeing with it. Just ignoring >>>>> the comment and not mentioning the disagreement wastes the time of >>>>> reviewers. >>>>> >>>>> The patch submission will need to include a proper analysis to justify >>>>> to >>>>> the reader why the particular inequality with particular types from >>>>> those >>>>> four is correct in all cases where the relevant code may be executed. >>>>> >>>>> The comments only refer to "T1" and "T2" without explaining how they >>>>> relate to the original expression (three types) or the variables within >>>>> GCC dealing with various types (four types, because newtype gets >>>>> involved). I said back in >>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00161.html> and >>>>> <http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01384.html> that three types >>>>> are involved - when I say "the patch submission needs to include its own >>>>> analysis for the full generality of three types", again, it's >>>>> inappropriate for a patch to omit such an analysis without justification. >>>>> The patch submission needs to include an analysis that properly explains >>>>> the transformation involved and the conditions under which it is safe. >>>>> Maybe starting along the lines of: >>>>> >>>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3 >>>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point >>>>> square root function being used (ITYPE), T1 is TYPE and all these types >>>>> are binary floating-point types. We wish to optimize if possible to an >>>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is >>>>> narrower than T2. (Then explain the choice of T4 and the conditions under >>>>> which the transformation is safe, with appropriate references.) >>>>> >>>>> I suggest that for the next patch submission you (the patch submitter) >>>>> make sure you spend at least an hour properly understanding the issues and >>>>> all the previous messages in the thread and writing up the detailed, >>>>> coherent explanation of the transformation and analysis of the issues that >>>>> I asked for some time ago, as if for a reader who hasn't read any of this >>>>> thread or looked at this transformation in GCC before. I've certainly >>>>> spent longer than that on review in this thread. It's normal for a patch >>>>> involving anything at all tricky to take hours to write up (and also >>>>> normal for one to discover, in the course of writing the detailed coherent >>>>> analysis for people who aren't familiar with the code and the issues >>>>> involved, that there's in fact something wrong with the patch and it needs >>>>> revisiting before submission). >>>>> >>>>> -- >>>>> Joseph S. Myers >>>>> jos...@codesourcery.com