Re: Why not contribute? (to GCC)

2010-04-26 Thread Ross Ridge
Alfred M. Szmidt writes:
You are still open to liabilities for your own project; if you
incorporate code that you do not have copyright over, the original
copyright holder can still sue you

That's irrelevant.  By signing the FSF's document I'd be doing nothing
to reduce anyone's ability to sue me; I could only be increasing it.
And please don't try to argue that's not true, because I have no reason
to believe you.  Only a lawyer working for me would be in a position
to convince me otherwise, but if I have to go that far, it's clearly
not worth it.

The debate over legalities has already derailed this thread, so let me
try to put it another way.

Years ago, I was asked to sign one of these documents for some public
domain code I wrote that I never intended to become part of an FSF project.
Someone wanted to turn it into a regular GNU project with a GPL license,
configure scripts, a cute acronym and all that stuff.   I said no.
It's public domain, take it or leave it.  Why should I sign some
legally binding document for some code I had in effect already donated
to the public?  How would you feel if some charity you donated money to
came back with a piece of paper for you to sign?

Submitting a hypothetical patch to GCC isn't much different to me.  For
some people having their code in the GCC distribution is worth something.
For me it's not.  For them it's a fair trade.  For me it's a donation.

We are all humans, patches fall through the cracks.  Would you like to
help keep an eye out for patches that have fallen through?  Would
anyone else like to do this?

As I said, I was just listing the reasons why I don't contribute.
I'm not arguing that anything should be changed or can be changed.
However, what I do know is that excuses won't make me or anyone else
more likely to contribute to GCC.

Please refer to GCC as a free software project; it was written by the
GNU project and the free software community.

Oh, yah, forgot about that one.  Political stuff like this is another
reason not to get involved with GCC.

Ross Ridge



Re: Why not contribute? (to GCC)

2010-04-26 Thread Ross Ridge
Ross Ridge writes:
 Years ago, I was asked to sign one of these documents for some public
 domain code I wrote that I never intended to become part of an FSF project.
 Someone wanted to turn it into a regular GNU project with a GPL license,
 configure scripts, a cute acronym and all that stuff.   I said no.
 It's public domain, take it or leave it.  Why should I sign some
 legally binding document for some code I had in effect already donated
 to the public?

Richard Kenner writes:
 Because that's the only way to PUT something in the public domain!

That's absurd and beside the point.  

 How would you feel if some charity you donated money to came back
 with a piece of paper for you to sign?

A closer analogy: a charity receives an unsolicited script for a play from
you.

No, that's not a closer analogy.  As I said, I never intended for my
code to become part of an FSF project.  I didn't send them anything
unsolicited.

I'm contributing to this thread solely to answer the question asked.
Either take the time to read what I've written and use it to try to
understand why I don't, and others might not, contribute to GCC, or please
just ignore it.  Your unsubstantiated and irrelevant legal opinions
aren't helping.

Ross Ridge



Re: Why not contribute? (to GCC)

2010-04-24 Thread Ross Ridge
Manuel López-Ibáñez writes:
What reasons keep you from contributing to GCC?

The big reason is the copyright assignment.  I never even bothered to read
it, but as I don't get anything in return there's no point.  Why should
I put obligations on myself, and open myself up to even unlikely liabilities,
just so my patches can be merged into the official source distribution?
I work on software on my own time to solve my own problems.  I'm happy
enough not to hoard it and to give it away for free, but it doesn't
make much difference to me if anyone else actually ends up using it.
I can have my own patched version of GCC that does what I want without
signing anything.

Another reason is the poor patch submission process.  Why e-mail a patch
if I know, as a new contributor, there's a good chance it won't even be
looked at by anyone?  Why would I want to go through a process where I'm
expected to keep begging until my patch finally gets someone's attention?

I also just don't need the abuse.  GCC, while not the most hostile of
open source projects out there, is up there.  Manuel López-Ibáñez's
unjustified hostility towards Michael Witten in this thread is just a
small example.

Finally, it's also a lot of work.  Just building GCC can be a pain, having
to find up-to-date versions of a growing list of math libraries that
don't benefit me in the slightest way.  Running the test suite takes a
long time, so even trivial patches require a non-trivial amount of work.
Anything more serious can take a huge amount of time.  I've abandoned
projects once I realized it would be a lot quicker to find some other
solution, like using assembly, rather than trying to get GCC to do what
I wanted it to do.

Now these are just the main reasons why I don't contribute to GCC.
I'm not arguing that any of these issues need to be or can be fixed.  If I
had what I thought were good solutions that would be better overall for
GCC then I'd have suggested them long ago.

I will add that I don't think code quality is a problem with GCC.  I hate
the GNU coding style as much as anyone, but it's used consistently and
that's what matters.  Compared to other open and closed projects I've seen,
it's as easy to understand and maintain as anything.  GNU binutils is
a pile of poo, but I don't know of any codebase the size of GCC that's
as nice to work with.

Ross Ridge



Re: [PATCH][GIT PULL][v2.6.32] tracing/x86: Add check to detect GCC messing with mcount prologue

2009-11-24 Thread Ross Ridge
Andrew Haley writes:
Alright.  So, it is possible in theory for gcc to generate code that
only uses -maccumulate-outgoing-args when it needs to realign SP.
And, therefore, we could have a nice option for the kernel: one with
(mostly) good code density that never generates the bizarre code
sequence in the prologue.

The best option would be for the Linux people to fix the underlying
problem in their kernel sources.  If the code no longer requested
that certain automatic variables be aligned, then not only would this
bizarre code sequence not be emitted, the unnecessary stack alignment
would disappear as well.  The kernel would then be free to choose to use
whatever code generation options it felt were appropriate.

Ross Ridge



Re: dg-error vs. i18n?

2009-10-28 Thread Ross Ridge
Ross Ridge wrote:
 The correct fix is for GCC not to intentionally choose to rely on
 implementation-defined behaviour when using the C locale.  GCC can't
 portably assume any other locale exists, but can portably and easily
 choose to get consistent output when using the C locale.

Joseph S. Myers writes:
GCC is behaving properly according to the user's locale (representing 
English-language diagnostics as best it can - remember that ASCII does not 
allow good representation of English in all cases).  

This is an issue of style, but as far as I'm concerned using these
fancy quotes in English locales is unnecessary and unhelpful.

The problem here is not a bug in the compiler proper, it is an issue
with how to test the compiler portably - that is, how the testsuite can
portably set a locale with English language and ASCII character set in
order to test the output the compiler gives in such a locale.

It's a design flaw in GCC.  The C locale is the only locale that GCC
can use to reliably and portably get consistent output across all ASCII
systems, and so it should be the locale used to achieve consistent output.
GCC can simply choose to restrict its output to ASCII.  It's not in
any way being forced by POSIX to output non-ASCII characters, or for
that matter to treat the C locale as an English locale.

Ross Ridge



Re: dg-error vs. i18n?

2009-10-27 Thread Ross Ridge
Eric Blake writes:
The correct workaround is indeed to specify a locale with specific charset 
encodings, rather than relying on plain C (hopefully cygwin will 
support C.ASCII, if it does not already).

The correct fix is for GCC not to intentionally choose to rely on
implementation-defined behaviour when using the C locale.  GCC can't
portably assume any other locale exists, but can portably and easily
choose to get consistent output when using the C locale.

As far as I know, the hole is intentional.  But if others would like
me to, I am willing to pursue the action of raising a defect against
the POSIX standard, requesting that the next version of POSIX consider
including a standardized name for a locale with guaranteed single-byte
encoding.

I don't see how a defect in POSIX is exposed here.  Nothing in
the standard forced GCC to output multi-byte characters when
nl_langinfo(CODESET) returns something like utf-8.  GCC could just
as easily have chosen to output these quotes as single-byte characters
when nl_langinfo(CODESET) returns something like windows-1252, or some
other non-ASCII single-byte characters when it returns iso-8859-1.
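
For what it's worth, a minimal sketch of how a program queries the
locale's charset; the names printed are system dependent:

#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    setlocale(LC_CTYPE, "");    /* adopt the user's locale */
    /* CODESET names the locale's character encoding, e.g. "UTF-8",
       "ISO-8859-1" or "ANSI_X3.4-1968" (plain ASCII). */
    printf("charset: %s\n", nl_langinfo(CODESET));
    return 0;
}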

Ross Ridge



Re: Add support for the Win32 hook prologue (try 3)

2009-09-11 Thread Ross Ridge
Stefan Dösinger writes:
On a partly related topic, I think the Win64 ABI requires that the first
instruction of a function is two bytes long, and that there are at least
6 bytes of slack before the function. Does gcc implement that?

As far as I can tell the Win64 ABI doesn't have either of these
requirements.  Microsoft's compiler certainly doesn't guarantee that
functions begin with two-byte instructions, and the x64 Software
Conventions document gives examples of prologues with larger initial
instructions:

http://msdn.microsoft.com/en-us/library/tawsa7cb(VS.80).aspx

Mind you, last I checked, GCC didn't actually follow the ABI requirements
for prologues and epilogues given in the link above, but that only breaks
ABI unwinding.

Ross Ridge



Re: MSVC hook function prologue

2009-09-07 Thread Ross Ridge
Paolo Bonzini writes:
The naked attribute has been proposed and bashed to death multiple
times on the GCC list too.

No, not really.  It's been proposed a few times, but the discussion never
gets anywhere because the i386 maintainers quickly put their foot down
and end it.  That hasn't stopped other ports from implementing a naked
attribute or, for that matter, developers like me from creating their own
private implementations.

Ross Ridge



Re: MSVC hook function prologue

2009-09-05 Thread Ross Ridge
Paolo Bonzini writes:
Are there non-Microsoft DLLs that expect to be hooked this way?  If
so, I think the patch is interesting for gcc independent of whether it
is useful for Wine.

Stefan Dösinger writes:
I haven't seen any so far. ...

If this patch is essentially only for one application, maybe the idea
of implementing a more generally useful naked attribute would be the
way to go.  I implemented a naked attribute in my private sources to
do something similar, although supporting hookable prologues was just a
small part of its more general use in supporting an assembler-based API.

Ross Ridge



Re: CVS/SVN binutils and gcc on MacOS X?

2009-09-05 Thread Ross Ridge
Stefan Dösinger writes:
Unfortunately I need support for the swap suffix in as, so using the system 
binaries is not an option. Is the only thing I can do to find the source of 
the as version, backport the swap suffix and hope for the best?

Another option might be a hack like this:

(define_insn "vswapmov"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (match_operand:SI 1 "register_operand" "r"))
   (unspec_volatile [(const_int 0)] UNSPECV_VSWAPMOV)]
  ""
{
#ifdef HAVE_AS_IX86_SWAP
  return "movl.s\t{%1, %0|%0, %1}";
#else
  /* Emit the raw bytes when the assembler lacks the .s suffix.  */
  if (true_regnum (operands[0]) == DI_REG
      && true_regnum (operands[1]) == DI_REG)
    return ASM_BYTE "0x8B, 0xFF";
  if (true_regnum (operands[0]) == BP_REG
      && true_regnum (operands[1]) == SP_REG)
    return ASM_BYTE "0x8B, 0xEC";
  gcc_unreachable ();
#endif
}
  [(set_attr "length" "2")
   (set_attr "length_immediate" "0")
   (set_attr "modrm" "0")])

It's not pretty but you won't be dependent on binutils.

Ross Ridge



Re: Add crc32 function to libiberty

2009-07-24 Thread Ross Ridge
DJ Delorie writes:
I didn't reference the web site for the polynomial, just for background.
To be honest, I'm not sure what the polynomial is.  As the comments
explain, the algorithm I used is precisely taken from gdb, in remote.c,
and is intended to produce the same result.  Does anybody on the gdb
side know the polynomial or any other information?

Your code uses the (one and only) CRC-32 polynomial 0x04C11DB7, so just
describing it as the CRC-32 function should be sufficient documentation.
It's the same CRC function as used by PKZIP, Ethernet, and cksum.
It's not compatible with the Intel CRC32 instruction, which uses the
CRC-32C polynomial (0x1EDC6F41).
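
For reference, here's a bitwise sketch of that CRC; I'm assuming gdb's
MSB-first formulation with the caller supplying the initial value, so
treat the details as illustrative rather than a drop-in replacement:

#include <stddef.h>
#include <stdint.h>

uint32_t crc32_msb(const unsigned char *buf, size_t len, uint32_t crc)
{
    while (len--) {
        int bit;
        crc ^= (uint32_t)*buf++ << 24;          /* feed in the next byte */
        for (bit = 0; bit < 8; bit++)           /* one shift per bit */
            crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u
                                      : crc << 1;
    }
    return crc;
}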

Ross Ridge



Re: Ideas for Google Summer of Code

2009-03-30 Thread Ross Ridge
Paolo Bonzini writes:
Regarding the NVIDIA GPU backend, I think NVIDIA is not yet distributing
details about the instruction set unlike ATI, is it?  In this case, I
think ATI would be a better match.

I think a GPU backend would be well beyond the scope of a Summer of
Code project.  GPUs don't have normal stacks, and addressing support
is limited.

Another possibility is to analyze OpenCL C and try to integrate its
features in GCC as much as possible.  This would include

1) masking/swizzling support for GCC's generic vector language extension;

A project that started and ended here would give GCC, and in particular
GCC's Cell SPU port, the only major piece of required OpenCL language
functionality, outside the runtime, that GCC is missing.

2) half-precision floats;

Do you mean just conversion-only support, like Sandra Loosemore's
proposed ARM patch, or full arithmetic support like any other scalar or
vector type?

Ross Ridge



Re: Ideas for Google Summer of Code

2009-03-30 Thread Ross Ridge
Joe Buck writes:
I'm having trouble finding that document, I don't see a link to it
on that page.  Maybe I'm missing something obvious?

Sticking "nvidia ptx" into Google turned up this document:

http://www.nvidia.com/object/io_1195170102263.html

It's an intermediate language, so it isn't tied to any particular NVIDIA GPU.
I believe there's something similar for AMD/ATI GPUs.

btw. The computational power of Intel's integrated GPUs is pretty dismal,
so I don't think a GCC port targeting them would be very useful.

Ross Ridge



Re: GCC OpenCL ?

2009-02-03 Thread Ross Ridge
Mark Mitchell writes:
That's correct.  I was envisioning a proper compiler that would take
OpenCL input and generate binary output, for a particular target, just
as with all other GCC input languages.  That target might be a GPU, or
it might be a multi-core CPU, or it might be a single-core CPU.

I have a hard time seeing why this would be all that worthwhile.
Since the instruction sets for AMD, NVIDIA and current Intel GPUs are
trade secrets, GCC won't be able to generate binary output for them.
OpenCL is designed for heterogeneous systems; compiling for multi-core or
single-core CPUs would only be useful as a cheap fallback implementation.
This limits a GCC-based OpenCL implementation to achieving its primary
purpose with just Cell processors and maybe Intel's Larrabee.  Is that
what you envision?  Without AMD/NVIDIA GPU support it doesn't sound all
that useful to me.

Ross Ridge



Re: GCC OpenCL ?

2009-02-03 Thread Ross Ridge

Basile STARYNKEVITCH writes:
It seems to me that some specifications
seem to be available. I am not a GPU expert, but
http://developer.amd.com/documentation/guides/Pages/default.aspx
contains an R8xx Family Instruction Set Architecture document at
http://developer.amd.com/gpu_assets/r600isa.pdf and at a very quick
first glance (perhaps wrongly) I feel that it could be enough to design
and write a code generator for it.

Oh, ok, that makes a world of difference.  Even with just AMD GPU
support a GCC-based OpenCL implementation becomes a lot more practical.

Ross Ridge



Re: GCC OpenCL ?

2009-02-03 Thread Ross Ridge
Ross Ridge wrote:
 Oh, ok, that makes a world of difference.  Even with just AMD GPU
 support a GCC-based OpenCL implementation becomes a lot more practical.

Michael Meissner writes:
And bear in mind that x86's with GPUs are not the only platform of interest

I never said anything about x86's, and I already mentioned the Cell.
Regardless, I don't think a GCC-based OpenCL implementation that didn't
target GPUs would be that useful.

Ross Ridge



Re: Problem with x64 SEH macro implementation for ReactOS

2008-12-03 Thread Ross Ridge
Timo Kreuzer writes:
The problem of the missing Windows x64 unwind tables is already solved!
I generate them from DWARF2 unwind info in a postprocessing step.

Ok.  The Windows unwind opcodes seemed to be so much more limited than
DWARF2's that I wouldn't have thought this approach would work.

The problem that I have is simply how to mark the try / except / finally
blocks in the code with reference to each other, so I can also generate
the SCOPE_TABLE data in that post-processing step

You can output address pairs to a special section to get the mapping
you need.  Something like:

asm (".section .seh.data, \"n\"\n\t"
     ".quad %0, %1\n\t"
     ".text"
     : : "i" (addr1), "i" (addr2));

Unfortunately, I don't think section stack directives work on PE-COFF
targets, so you'd have to assume the function was using the .text section.
btw. don't rely on GCC putting adjacent asm statements together like
you did in your original message.  Make them a single asm statement.

Note that the SCOPE_TABLE structure is part of Microsoft's internal
private SEH implementation.  I don't think it's a good idea to use or
copy Microsoft's implementation.  Create your own handler function and
give it whatever data you need.

Ross Ridge



Re: Problem with x64 SEH macro implementation for ReactOS

2008-11-28 Thread Ross Ridge
Kai Tietz writes:
Hmm, yes and no. First the exception handler uses the .pdata and .xdata 
section for checking throws. But there is still the stack based exception 
mechanism as for 32-bit IIRC. 

No.  The mechanism is completely different.  The whole point of the unwind
tables is to remove the overhead of maintaining the linked list of records
on the stack.  It works just like DWARF2 exceptions in this respect.

No, this isn't as curious as you mean. In the link you sent me, it is
explained. The exception handler tables (.pdata/.xdata) are optional and
not necessarily required.

This is what Microsoft's documentation says:

Every function that allocates stack space, calls other functions,
saves nonvolatile registers, or uses exception handling must
have a prolog whose address limits are described in the unwind
data associated with the respective function table entry.

In this very limited case RtlUnwindEx() can indeed unwind a function
without it having any unwind info associated with it.  If RtlUnwindEx()
can't find the unwind data for a function then it assumes that the
stack pointer points directly at the return address.  To unwind through
the function it pops the top of the stack to get the next frame's RIP and
RSP values.  Otherwise RtlUnwindEx() needs the unwind information.
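
In other words, roughly this (a sketch of the fallback as I understand
it, with invented names; not Windows' actual code):

typedef unsigned long long ULONG64;

struct context { ULONG64 Rip, Rsp; };   /* stand-in for CONTEXT */

/* No-unwind-info fallback: treat the frame as a leaf whose stack
   pointer points directly at the return address. */
static void unwind_leaf(struct context *ctx)
{
    ctx->Rip = *(ULONG64 *)ctx->Rsp;   /* next RIP = saved return address */
    ctx->Rsp += 8;                     /* pop it off the stack */
}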

The restrictions on the format of the prologue and epilogue only exist
to make handling the case where the current RIP points to the prologue
or epilogue much easier.  Without the unwind info RtlUnwindEx() has no
way of knowing where the prologue is.

There's a very detailed explanation of how Windows x64 exceptions work,
including RtlUnwindEx(), on this blog:

http://www.nynaeve.net/?p=113

But in general I agree that the generation of .pdata/.xdata sections
would be a good thing for better support of MS ABIs by gcc.

I'm not advocating that they should be added to GCC now.  I'm just
pointing out that without them 64-bit SEH macros will be of limited use.

Ross Ridge



Re: Problem with x64 SEH macro implementation for ReactOS

2008-11-27 Thread Ross Ridge
Kai Tietz writes:
I am very interested to learn which parts of the calling convention
aren't implemented for w64.

Well, maybe I'm missing something, but I can't see any code in GCC for
generating prologues, epilogues and unwind tables in the format required
by the Windows x64 ABI.

http://msdn.microsoft.com/en-us/library/tawsa7cb.aspx

I am a bit curious, as I found the unwind mechanism of Windows
itself working quite well on gcc compiled code, so I assumed that the
most important parts of its calling convention are implemented.

How exactly are you testing this?  Without SEH support Windows wouldn't
ordinarily ever need to unwind through GCC compiled code.  I assumed
that's why it was never implemented.

Ross Ridge



Re: Problem with x64 SEH macro implementation for ReactOS

2008-11-27 Thread Ross Ridge
Kai Tietz writes:
Well, you mean the SEH tables on stack.

No, I mean the ABI-required unwind information.

 Well, those aren't implemented (as they aren't for 32-bit).

64-bit SEH handling is completely different from 32-bit SEH handling.
In the 64-bit Windows ABI exceptions are handled using unwind tables
similar in concept to DWARF2 exceptions.  There are no SEH tables on
the stack.  In the 32-bit ABI exceptions are handled using a linked list
of records on the stack, similar to SJLJ exceptions.

 But the unwinding via RtlUnwind and RtlUnwindEx do their job even
for gcc compiled code quite well

I don't see how that would be possible in the general case.  Without the
unwind tables Windows doesn't have the required information to unwind
through GCC compiled functions.

Ross Ridge



Re: Problem with x64 SEH macro implementation for ReactOS

2008-11-26 Thread Ross Ridge
Timo Kreuzer wrote: 
I am working on x64 SEH for ReactOS. The idea is to use .cfi_escape
codes to mark the code positions where the try block starts / ends and
of the except landing pad. The emitted .eh_frame section is parsed after
linking and converted into Windows compatible unwind info / scope tables.
This works quite well so far.

Richard Henderson writes:
I always imagined that if someone wanted to do SEH, they'd actually
implement it within GCC, rather than post-processing it like this.
Surely you're making things too hard for yourself with these escape hacks

I assume he's trying to create the equivalent of the existing macros for
handling Windows structured exceptions in 32-bit code.  The 32-bit macros
don't require any post-processing and are fairly simple.  Still, even
with the post-processing, Timo Kreuzer's solution would be a heck of a
lot easier to implement than adding SEH support to GCC.

The big problem is that the last time I checked GCC wasn't generating the
prologues, epilogues or unwind info required by the Windows x64 ABI.
Windows won't be able to unwind through GCC compiled functions whether
the macros are used or not.

I think the solution to the specific problem he mentioned, connecting
nested functions to their try blocks, would be to emit address pairs to
a special section.

Ross Ridge



Re: How to teach gcc, that registers are clobbered by api calls?

2008-04-22 Thread Ross Ridge
Kai Tietz writes:
I read that too, but how can I teach gcc that registers are
callee-saved? I tried it by use of the call_used part in regclass.c, but
this didn't work as expected.

I think you need to modify CALL_USED_REGISTERS and/or
CONDITIONAL_REGISTER_USAGE in i386.h.  Making any changes to regclass.c
is probably not the right thing to do.

Ross Ridge



Re: How to teach gcc, that registers are clobbered by api calls?

2008-04-21 Thread Ross Ridge
H.J. Lu writes:
Are r10-r15 callee-saved in w64ABI?

Here's what Microsoft's documentation says:

Caller/Callee Saved Registers  

The registers RAX, RCX, RDX, R8, R9, R10, R11 are considered
volatile and must be considered destroyed on function calls
(unless otherwise safety-provable by analysis such as whole
program optimization).

The registers RBX, RBP, RDI, RSI, R12, R13, R14, and R15 are
considered nonvolatile and must be saved and restored by a
function that uses them

Other parts of the documentation state that XMM0-XMM5 are volatile
(caller-saved), while XMM6-XMM15 are non-volatile (callee-saved).

Ross Ridge



Re: atomic accesses

2008-03-04 Thread Ross Ridge
Segher Boessenkool writes:
... People are relying on this undocumented GCC behaviour already,
and when things break, chaos ensues.

GCC has introduced many changes over the years that have broken many
programs that relied on undocumented or unspecified behaviour.
You won't find much sympathy for people who assume that GCC must behave
in some way when there is no requirement for it to do so.

If we change this to be documented behaviour, at least it is clear
where the problem lies (namely, with the compiler), and things can be
fixed easily.

I don't think you'll find any support for imposing a requirement on GCC
that would always require it to use an atomic instruction when there
is an alternative instruction or sequence of instructions that would be
faster and/or shorter.  I think your best bet along these lines would
be adding __sync_fetch() and __sync_store() builtins, but doing so would
be more difficult than a simple documentation change.
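
The proposed builtins don't exist, but as a rough sketch of the idea,
here's what they'd replace using __sync builtins GCC already has (my
approximation, not the proposal itself; note __sync_lock_test_and_set
only has acquire semantics per GCC's documentation):

static volatile int shared;

/* Atomic read: add zero and return the previous value. */
int atomic_load(void) { return __sync_fetch_and_add(&shared, 0); }

/* Atomic write: exchange in the new value, discarding the old one. */
void atomic_store(int value) { __sync_lock_test_and_set(&shared, value); }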

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-03 Thread Ross Ridge
Ross Ridge wrote:
With INTO I don't see any way to distinguish the SIGSEGV it generates on
Linux from any of the myriad other ways a SIGSEGV can be generated.

Paolo Bonzini writes:
sc.eip == 0xCE (if I remember x86 opcodes well :-) as I'm going by heart...)

The INTO instruction generates a trap exception, just like INT 4 would, so
the return address on the stack points to the instruction after the INTO.

That's similar to how Java traps SIGFPEs and transforms them into
zero-divide exceptions, IIRC.

Floating point exceptions are fault exceptions, so the return address
points to the faulting instruction.

At the risk of my curiosity getting me into more trouble, could anyone
explain to me how to access these eip and trapno members from
a signal handler on Linux?  I can't find any relevant documentation with
man or Google.
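
For what it's worth, here's a sketch of what I've pieced together for
32-bit x86 Linux with glibc; the REG_EIP and REG_TRAPNO indices come
from <sys/ucontext.h> and need _GNU_SOURCE, so treat it as unverified:

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>

static void handler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = (ucontext_t *)ctx;
    /* printf isn't async-signal-safe; fine for a demo only. */
    printf("eip=%#lx trapno=%ld\n",
           (unsigned long)uc->uc_mcontext.gregs[REG_EIP],
           (long)uc->uc_mcontext.gregs[REG_TRAPNO]);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;     /* three-argument handler */
    sa.sa_flags = SA_SIGINFO;      /* request the ucontext argument */
    sigaction(SIGSEGV, &sa, NULL);
    /* ... code that may fault ... */
    return 0;
}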

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-02 Thread Ross Ridge
Mark Mitchell writes:
 However, I don't think doing all of that work is required to make this 
 feature useful to people.  You seem to be focusing on making -ftrapv 
 capture 100% of overflows, so that people could depend on their programs 
 crashing if they had an overflow.  That might be useful in two 
 circumstances: (a) getting bugs out (though for an example like the one 
 above, I can well imagine many people not considering that a bug worth 
 fixing), and (b) in safety-critical situations where it's better to die 
 than do the wrong thing.

Richard Kenner writes:
 You forgot the third: if Ada is to use it rather than its own approach,
 it must indeed be 100% reliable.

Actually, that's a different issue than catching 100% of overflows, 
which apparently Ada doesn't require.

 Robert is correct that if it's sufficiently more efficient than Ada's
 approach, it can be made the default, so that by default range-checking
 is on in Ada, but not in a 100% reliable fashion.

On the issue of performance, out of curiosity I tried playing around
with the IA-32 INTO instruction.  I noticed two things: the first was
that the instruction isn't supported in 64-bit mode, and the second was
that on the Linux system I was using, it generated a SIGSEGV signal that
was indistinguishable from any other SIGSEGV.  If Ada needs to be able
to catch and distinguish overflow exceptions, this and possible other
cases of missing operating system support might make processor-specific
overflow support detrimental.

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-02 Thread Ross Ridge
Robert Dewar writes:
Usually there are ways of telling what is going on at a sufficiently
low level, but in any case, code using the conditional jump instruction
(jo/jno) is hugely better than what we do now (and it is often faster
to use a jo than into).

My point is that using INTO or some other processor's overflow mechanism
that requires operating system support wouldn't necessarily be better for
Ada, even if it performs better (or uses less space) than the alternatives.
Having the program crash with a vague exception would meet the
requirements of -ftrapv, but not Ada.

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-02 Thread Ross Ridge
Robert Dewar writes:
Usually there are ways of telling what is going on at a sufficiently
low level, but in any case, code using the conditional jump instruction
(jo/jno) is hugely better than what we do now (and it is often faster
to use a jo than into).

Ross Ridge wrote: 
My point is that using INTO or some other processor's overflow mechanism
that requires operating system support wouldn't necessarily be better for
Ada, even if it performs better (or uses less space) than the alternatives.
Having the program crash with a vague exception would meet the
requirements of -ftrapv, but not Ada.

Robert Dewar write: 
But, once again, using the processor specific JO instruction will be
much better for Ada than double length arithmetic, using JO does not
involve a program crash with a vague exception.

*sigh*  The possibility of using GCC's -ftrapv support to implement
overflow exceptions in Ada was mentioned in this thread.  There's no
requirement that -ftrapv do anything other than crash when overflow
occurs.  A -ftrapv that did everything you've said you wanted, performed
faster and caught 100% of overflows 100% reliably wouldn't necessarily be
better for Ada.  On the 32-bit IA-32 platform, either the JO instruction
or the INTO instruction could legitimately be used to provide a more
optimal implementation of -ftrapv.  Even the JO instruction could do
nothing more than jump to abort().

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-02 Thread Ross Ridge
Robert Dewar writes:
Yes, and that is what we would want for Ada, so I am puzzled by your
sigh. All Ada needs to do is to issue a constraint_error exception,
it does not need to know where the exception came from or why except
in very broad detail.

Unless printing "This application has requested the Runtime to terminate
it in an unusual way." counts as issuing a constraint_error in Ada,
it seems to me that -ftrapv and Ada have differing requirements.
How can you portably and correctly generate a constraint_error if
the code generated by -ftrapv calls the C runtime function abort()?
On Unix-like systems you can catch SIGABRT, but even there how do you
tell that it didn't come from CTRL-\, a signal sent from a different
process, or abort() called from some other context?  With INTO I don't
see any way to distinguish the SIGSEGV it generates on Linux from any of
the myriad other ways a SIGSEGV can be generated.

Ross Ridge



Re: [PATCH][4.3] Deprecate -ftrapv

2008-03-02 Thread Ross Ridge
Ross Ridge writes:
 On Unix-like systems you can catch SIGABRT, but even there how do you
 tell that it didn't come from CTRL-\...

Oops, I forgot that CTRL-\ has its own signal, SIGQUIT.

Ross Ridge



Re: [m32c] type precedence rules and pointer signs

2008-01-30 Thread Ross Ridge
DJ Delorie writes:
extern int U();
void *ra;
...
  foo((ra + U()) - 1)
...
1. What are the language rules controlling this expression, and do they
have any say about signed vs unsigned wrt the int-pointer promotion?

There is no integer to pointer promotion.  You're adding an integer to a
pointer and then subtracting an integer from the resulting pointer value.
If U() returns zero then the pointer passed to foo() should point to
the element before the one that ra points to.  Well, it should if ra
actually had a type that Standard C permitted using pointer arithmetic on.
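
For illustration only (this assumes foo() takes a void * argument, which
the fragment above doesn't show): the portable spelling does the
arithmetic on a char *, since arithmetic on void * is a GNU extension.

extern int U(void);
extern void foo(void *);        /* assumed signature */
extern void *ra;

void call_foo(void)
{
    /* char * arithmetic is well-defined where void * arithmetic is
       only a GNU extension. */
    foo((char *)ra + U() - 1);
}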

Ross Ridge



RE: Memory leaks in compiler

2008-01-17 Thread Ross Ridge
Diego Novillo wrote:
I agree.  Freeing memory right before we exit is a waste of time.

Dave Korn writes:
 So, no gcc without an MMU and virtual memory platform ever again?
Shame, it used to run on Amigas.

I don't know if GCC ever freed all of its memory before exiting.
An operating system doesn't need an MMU or virtual memory in order to
free all the memory used by a process when it exits.  MS-DOS did this,
and I assume AmigaOS did as well.

Ross Ridge



Re: __builtin_expect for indirect function calls

2008-01-06 Thread Ross Ridge
Mark Mitchell writes:
 What do people think?  Do we have the leeway to change this?

If it were just cases where using __builtin_expect is pointless that
would break, like function overloading and sizeof, then I don't think
it would be a problem.  However, it would change behaviour when C++
conversion operators are used, and I can see these being legitimately
used with __builtin_expect.  Something like:

struct C {
    operator long();
};

int foo() {
    if (__builtin_expect(C(), 0))
        return 1;
    return 2;
}

If cases like these are rare enough it's probably an acceptable change
if they give an error because the argument types don't match.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-19 Thread Ross Ridge
H.J. Lu writes:
 What value did you use for -mpreferred-stack-boundary? The x86 backend
 defaults to 16byte.

On Windows the 16-byte default pretty much just wastes space, so I use
-mpreferred-stack-boundary=2 where it might make a difference.  In the
case where I wanted to use SSE vector instructions, I explicitly used
-mpreferred-stack-boundary=4 (16-byte alignment).

STACK_BOUNDARY is the minimum stack boundary. MAX(STACK_BOUNDARY,
PREFERRED_STACK_BOUNDARY) == PREFERRED_STACK_BOUNDARY.  So the question is
if we should assume INCOMING == PREFERRED_STACK_BOUNDARY in all cases:

Doing this would also remove need for ABI_STACK_BOUNDARY in your proposal.

Pros:
  1. Keep the current behaviour of -mpreferred-stack-boundary.

Cons:
  1. The generated code is incompatible with the other object files.

Well, your proposal wouldn't completely solve that problem, either.
You can't guarantee compatibility with object files compiled with different
values of -mpreferred-stack-boundary, including those compiled with the
current implementation, unless you assume the incoming stack is aligned to
the lowest value the flag can have and align the outgoing stack to the
highest value that the flag can have.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-19 Thread Ross Ridge
Ross Ridge writes:
 As I mentioned later in my message, STACK_BOUNDARY shouldn't be defined in
 terms of hardware, but in terms of the ABI.  While the i386 allows the
 stack pointer to be set to any value, by convention the stack pointer
 is always kept 4-byte aligned at all times.  GCC should never generate
 code that would violate this requirement, even in leaf functions
 or transitorily during the prologue/epilogue.

H.J. Lu writes:
 From gcc internal manual

I'm suggesting a different definition of STACK_BOUNDARY which wouldn't,
if strictly followed, result in STACK_BOUNDARY being defined as 8 on
the i386.  The i386 hardware doesn't enforce a minimum alignment on the
stack pointer.

 Since x86 always push/pop stack by decrementing/incrementing address
 size, it makes sense to define STACK_BOUNDARY as address size.

The i386 PUSH and POP instructions adjust the stack pointer by the
operand size of the instruction.  The address size of the instruction
has no effect.  For example, GCC should never generate code like this:

pushw $0
pushw %ax

because the stack is temporarily misaligned.  This could result in a
signal, trap, interrupt or other asynchronous handler using a misaligned
stack.  In the context of your proposal, defining STACK_BOUNDARY this way,
as a requirement imposed on GCC by an ABI (or at least by convention),
not the hardware, is important.  Without an ABI requirement, there's
nothing that would prohibit an i386 leaf function from adjusting the
stack in a way that leaves it 1- or 2-byte aligned.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-19 Thread Ross Ridge
Andrew Pinski writes:
 Can we stop talking about x86/x86_64 specific issues here?

No.

I have a use case for the PowerPC side of the Cell BE for variables
greater than the normal stack boundary alignment of 16 bytes.  They need
to be 128-byte aligned for DMA transferring to the SPUs.

I already proposed a patch [1] to fix this use case but I have not
seen many replies yet.

Complaining about someone talking about x86/x86_64 specific issues and
then bringing up a PowerPC/Cell specific issue is probably not the best
way to go about getting your patch approved.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-18 Thread Ross Ridge
Ross Ridge writes:
 This section doesn't make sense to me.  The force_align_arg_pointer
 attribute and -mstackrealign assume that the ABI is being
 followed, while the -fpreferred-stack-boundary option effectively

H.J. Lu writes:
 According to the Apple engineer who implemented -mstackrealign,
 on MacOS/ia32 the psABI is 16-byte, but -mstackrealign will assume
 4-byte, which is STACK_BOUNDARY.

Ok.  The important thing is that for backwards compatibility it needs
to continue to assume 4-byte alignment on entry and align the stack to
a 16-byte alignment on x86 targets, so that makes more sense.

 changes the ABI.  According to your definitions, I would think
 that INCOMING should be ABI_STACK_BOUNDARY in the first case,
 and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second.

 That isn't true since some .o files may not be compiled with
 -fpreferred-stack-boundary or with a different value of
 -fpreferred-stack-boundary.

Like with any ABI-changing flag, that's not supported:

... Further, every function must be generated such that it keeps
the stack aligned.  Thus calling a function compiled with a higher
preferred stack boundary from a function compiled with a lower
preferred stack boundary will most likely misalign the stack.

The -fpreferred-stack-boundary flag currently generates code that
assumes the stack is aligned to the preferred alignment on function entry.
If you assume a worse incoming alignment you'll be aligning the stack
unnecessarily and generating code that this flag doesn't require.

 On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may
 want to use 8 byte for PREFERRED_STACK_BOUNDARY.

Ok, if people are using this flag to change the alignment to something
smaller than used by the standard ABI, then INCOMING should be
MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).

Ross Ridge



Re: A proposal to align GCC stack

2007-12-18 Thread Ross Ridge
Ye, Joey writes: 
i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
and 64 for x86_64. It is the minimum stack boundary. It is fixed.

Ross Ridge wrote:
Strictly speaking by the above definition it would be 8 for i386.
The hardware doesn't force the stack to be 32-bit aligned, it just
performs poorly if it isn't.

Robert Dewar writes:
First, although for some types, the accesses may work, the optimizer
is allowed to assume that data is properly aligned, and could possibly
generate incorrect code ...

That's not enforced by hardware.

Second, I am pretty sure there are SSE types that require
alignment at the hardware level, even on the i386

This isn't a restriction on stack alignment.  It's a restriction on what
kinds of machine types can be accessed on the stack.

As I mentioned later in my message, STACK_BOUNDARY shouldn't be defined in
terms of hardware, but in terms of the ABI.  While the i386 allows the
stack pointer to be set to any value, by convention the stack pointer
is always kept 4-byte aligned at all times.  GCC should never generate
code that would violate this requirement, even in leaf functions
or transitorily during the prologue/epilogue.

This is different than the proposed ABI_STACK_BOUNDARY macro, which defines
the possibly stricter alignment the ABI requires at function entry.  Since
most i386 ABIs don't require a stricter alignment, that has meant that
SSE types couldn't be located on the stack.  Currently you can get around
this problem by changing the ABI using -fpreferred-stack-boundary or by
forcing an SSE-compatible alignment using -mstackrealign or __attribute__
((force_align_arg_pointer)).  Joey Ye's proposal is another solution
to this problem where GCC would automatically force an SSE-compatible
alignment when SSE types are used on the stack.
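
For reference, a sketch of the existing attribute-based workaround (my
example, not from the proposal; compile for i386 with -msse):

#include <xmmintrin.h>

/* force_align_arg_pointer makes GCC realign the stack on entry, so
   16-byte-aligned SSE spill slots are safe even when the caller only
   maintained the conventional 4-byte alignment. */
void __attribute__((force_align_arg_pointer))
scale4(float *out, const float *in)
{
    __m128 v = _mm_loadu_ps(in);            /* load 4 floats */
    v = _mm_mul_ps(v, _mm_set1_ps(2.0f));   /* scale them */
    _mm_storeu_ps(out, v);                  /* store 4 floats */
}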

Ross Ridge



Re: A proposal to align GCC stack

2007-12-18 Thread Ross Ridge
Ross Ridge wrote:
 The -fpreferred-stack-boundary flag currently generates code that
 assumes the stack is aligned to the preferred alignment on function entry.
 If you assume a worse incoming alignment you'll be aligning the stack
 unnecessarily and generating code that this flag doesn't require.

H.J. Lu writes:
 That is how we get into trouble in the first place. The only place I
 think of where you can guarantee everything is compiled with the same
 -fpreferred-stack-boundary is the kernel. Our proposal will align the
 stack only when needed. PREFERRED_STACK_BOUNDARY > ABI_STACK_BOUNDARY
 will generate a larger stack unnecessarily.

I'm currently using -fpreferred-stack-boundary without any trouble.
Your proposal would in fact generate code to align the stack when it's not
necessary.  This would change the behaviour of -fpreferred-stack-boundary,
hurting performance, and that's unacceptable to me.

 Ok, if people are using this flag to change the alignment to something
 smaller than used by the standard ABI, then INCOMING should be
 MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).

 On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may
 want to use 8 byte for PREFERRED_STACK_BOUNDARY. INCOMING will
 be MIN(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8 byte.

Using MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) also equals 8 in that
case and preserves the behaviour of -fpreferred-stack-boundary in every case.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-18 Thread Ross Ridge
Robert Dewar writes:
Well if we have local variables of type float (and we have specified
use of SSE), we are in trouble, no?

Non-vector SSE instructions, like the ones that operate on scalar floats,
don't require memory operands to be aligned.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-18 Thread Ross Ridge
Ross Ridge wrote:
 I'm currently using -fpreferred-stack-boundary without any trouble.
 Your proposal would in fact generate code to align the stack when it's
 not necessary.  This would change the behaviour of
 -fpreferred-stack-boundary, hurting performance, and that's unacceptable
 to me.

Ye, Joey writes:
 This proposal values correctness in the first place. So when the compiler
 can't make sure a function is only called from functions with the same or
 bigger preferred-stack-boundary, it will conservatively align the stack. One
 optimization is to set INCOMING = PREFERRED for local functions. Do you
 think that is more acceptable?

Not really.  It might reduce the amount of unnecessary stack adjustment,
but the performance regression would remain.  Changing the behaviour of
-fpreferred-stack-boundary doesn't make it more correct.  It's supposed
to change the ABI, it works as documented and, yes, if it's misused it
will cause problems.  So will any number of GCC's ABI-changing options.

Look at it another way.  Let's say you were compiling x86_64 code with
-fpreferred-stack-boundary=3, an 8-byte PREFERRED alignment.  As you
know, this is different from the standard x86_64 ABI, which requires a
16-byte alignment.  Now with your proposal, GCC's behaviour won't
change, because it's safe to assume that the incoming stack is at least
8-byte aligned.  There should be no change in the code GCC generates,
with or without your proposal.  However, the outgoing stack won't be
16-byte aligned as the x86_64 ABI requires.  In this case, what also
doesn't change is the fact that mixing code compiled with different
-fpreferred-stack-boundary values doesn't work.  It's just as problematic
and unsafe as it was before.

So when you said this proposal values correctness in the first place,
that really isn't true.  The proposal only addresses safety when the
preferred alignment is raised above the standard ABI's alignment.  You're
conservatively aligning the incoming stack, but not the outgoing stack.
You don't seem to be concerned about the problems that can arise when
the preferred alignment is lowered below the ABI's.  Why?  My guess is
that correctness in this case would cause unacceptable regressions when
compiling the x86_64 Linux kernel.

If you can understand why it would be unacceptable to change how
-fpreferred-stack-boundary behaves when compiling the Linux kernel,
then maybe you can understand why I don't find it acceptable for it to
change when compiling my code.

Ross Ridge



Re: A proposal to align GCC stack

2007-12-17 Thread Ross Ridge
Ye, Joey writes:
i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
and 64 for x86_64. It is the minimum stack boundary. It is fixed.

Strictly speaking, by the above definition it would be 8 for i386.
The hardware doesn't force the stack to be 32-bit aligned, it just
performs poorly if it isn't.

v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary
at function entry. If a function is marked with __attribute__
((force_align_arg_pointer)) or -mstackrealign option is provided,
INCOMING = STACK_BOUNDARY.  Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY,
PREFERRED_STACK_BOUNDARY) because a function can be called via psABI
externally or called locally with PREFERRED_STACK_BOUNDARY.

This section doesn't make sense to me.  The force_align_arg_pointer
attribute and -mstackrealign assume that the ABI is being
followed, while the -fpreferred-stack-boundary option effectively
changes the ABI.  According to your definitions, I would think
that INCOMING should be ABI_STACK_BOUNDARY in the first case,
and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second.
(Or just PREFERRED_STACK_BOUNDARY, because a boundary less than the ABI's
should be rejected during command line processing.)

vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required
by local variables and calling other function. REQUIRED_STACK_ALIGNMENT
== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a
non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT ==
LOCAL_STACK_BOUNDARY.

Hmm... I think you should define STACK_BOUNDARY as the minimum
alignment that the ABI requires the stack pointer to keep at all times.
ABI_STACK_BOUNDARY should be defined as the stack alignment the
ABI requires at function entry.  In that case a leaf function's
REQUIRED_STACK_ALIGNMENT should be MAX(LOCAL_STACK_BOUNDARY,
STACK_BOUNDARY).

Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
and CX as parameter passing registers, there are limited candidates for
this proposal to choose. Current proposal suggests EDI, because it won't
conflict with i386 PIC or regparm.

Could you pick a call-clobbered register in cases where one is available?

//  Reserve two stack slots and save return address 
//  and previous frame pointer into them. By
//  pointing new ebp to them, we build a pseudo 
//  stack for unwinding

Hmmm... I don't know much about the DWARF unwind information, but
couldn't it handle this case without creating the pseudo frame?
Or at least be extended so it could?

Ross Ridge



Re: BITS_PER_UNIT less than 8

2007-12-07 Thread Ross Ridge
Boris Boesler writes:
 Ok, so what do I have to do to write a back-end where all addresses are
 given in bits? Memory is addressed in bits, not bytes. So I set:

 #define BITS_PER_UNIT 1
 #define UNITS_PER_WORD 32

I don't know if it's useful to define the size of a byte to be less than
8 bits, even if that more accurately reflects the hardware.  Standard C
requires that the char type both be at least 8 bits (UCHAR_MAX >= 255)
and the same size as a byte (sizeof(char) == 1).  You can't define any
types that are smaller than a char and have sizeof work correctly.

So, what can I do to get this running for my architecture?

If you think there's still some benefit from having GCC use a 1-bit byte,
you'll probably have to fix a number of assumptions made in the code:
things like the size of a byte being at least 8 bits and being the same
in the frontend and backend.

Ross Ridge



Re: libiberty/pex-unix vfork abuse?

2007-12-07 Thread Ross Ridge
Dave Korn writes:
  Perhaps we could work around this case by setting environ in the parent
 before the vfork call and restoring it afterward, but we'd need some kind
 of serialisation there, and I don't know how to do a critical section
 using pthreads/posix.

A simple solution would be to call fork() instead of vfork() when changing
the environment.

Ross Ridge



Re: BITS_PER_UNIT larger than 8 -- word addressing

2007-11-26 Thread Ross Ridge
Miceal Eagar writes:
I'm working with a target that has 32-bit word addressing,
so there is a define of BITS_PER_UNIT = 32.

According to the documentation, this changes the size of a byte to 32
bits, instead of the more usual 8 bits.

This causes a problem:  an error saying that there is
no emulation for 'DI'.  DImode has a precision of 128 bits,
which is clearly incorrect.  (All the other integer modes
were incorrect as well.)

DImode is defined to be 8 bytes long, so with a 32-bit byte I'd expect
it to be 256 bits.  Try using QImode and HImode for 32-bit and 64-bit
operations respectively.

Ross Ridge



Re: strict aliasing

2007-11-05 Thread Ross Ridge
Ian Lance Taylor wrote:
 Strict aliasing only refers to loads and stores using pointers.  

skaller writes:
 Ah, I see. So turning it off isn't really all that bad
 for optimisation.

One example of where it hurts on just about any platform is something
like this:

void allocate(int **p, unsigned len);

int *foo(unsigned len) {
    int *p;
    unsigned i;
    allocate(&p, len);
    for (i = 0; i < len; i++)
        p[i] = 1;
    return p;
}

Without strict aliasing enabled, the compiler can't assume that
the assignment p[i] = 1 won't change p.  This results in the value
of p being reloaded on every loop iteration, instead of just once at the
start of the loop.  It also prevents GCC from vectorizing the loop.

On Itanium CPUs speculative loads can be used instead of strict alias
analysis to avoid this problem.
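
As an aside (my addition, not something from this thread): the reload
can also be avoided without strict aliasing by copying the pointer into
a local whose address is never taken, since no store can then alias it.

void allocate(int **p, unsigned len);

int *foo(unsigned len) {
    int *p, *q;
    unsigned i;
    allocate(&p, len);
    q = p;                   /* q's address never escapes, so stores
                                through q can't be assumed to change q */
    for (i = 0; i < len; i++)
        q[i] = 1;
    return p;
}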

Ross Ridge




Re: gomp slowness

2007-10-17 Thread Ross Ridge
skaller writes:
 Unfortunately no, unless MSVC++ in VS2005 has openMP.

I don't know if Visual C++ 2005 Express supports OpenMP, but the
Professional edition should.  Alternatively, the free, as in beer,
Microsoft compiler included in the Windows SDK supports OpenMP.

Ross Ridge



Re: Preparsing sprintf format strings

2007-10-12 Thread Ross Ridge
Ross Ridge writes:
 The entire parsing of the format string is affected by the multi-byte
 character encoding.  I don't know how GCC would be able to tell that a byte
 with the same value as '%' in the middle of a string would actually be
 interpreted as a '%' character rather than as part of an extended multibyte
 character.  This can easily happen with the ISO 2022-JP encoding.

Andreas Schwab writes:
 The compiler is supposed to know the encoding of the strings.

The compiler can't in general know what encoding printf, fprintf,
and sprintf will use to parse the string.  It's locale dependent.

Ross Ridge




Re: Preparsing sprintf format strings

2007-10-12 Thread Ross Ridge
Ross Ridge writes:
The compiler can't in general know what encoding printf, fprintf,
and sprintf will use to parse the string.  It's locale dependent.

Paolo Bonzini writes:
It is undefined what happens if you run a program in a different charset
than in the one you specified for -fexec-charset. (locale != charset).

I don't think that's true, but regardless many systems have runtime
character sets that are dependent on the locale.  If GCC doesn't support
this, then GCC is broken.

A google code search for printf.*\\x1[bB][($].*%s hints that this is
not a problem in practice.

In practice, probably not.  I doubt there are any ASCII-based systems that
actually support stateful encodings like ISO 2022-JP in their C runtimes.
There is at least one EBCDIC-based system that fully supports stateful
encodings, but I don't know if in those encodings '%' byte values can
appear outside of the initial shift state.

Ross Ridge



Re: Preparsing sprintf format strings

2007-10-12 Thread Ross Ridge
Ross Ridge wrote:
The compiler can't in general know what encoding printf, fprintf,
and sprintf will use to parse the string.  It's locale dependent.

Bernd Schmidt writes:
Does this mean it can vary from one run of the program to another? 

Yes, that's the whole point of having locales: so a single program can
work with more than one language.  In fact locales can change during the
execution of a program.

 I'll admit I don't understand locales very well, but doesn't this
 sound like a recipe for security holes?

A program has to explicitly call setlocale() to change the locale to
anything other than the default C locale.
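
A minimal sketch of that call, for what it's worth:

#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* Until this call the program runs in the default "C" locale. */
    printf("%s\n", setlocale(LC_ALL, NULL));   /* prints "C" */

    /* Adopt the locale named by the user's environment (LANG etc.). */
    setlocale(LC_ALL, "");
    printf("%s\n", setlocale(LC_ALL, NULL));
    return 0;
}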

Ross Ridge



Re: Preparsing sprintf format strings

2007-10-12 Thread Ross Ridge
Ross Ridge writes:
 The entire parsing of the format string is affected by the multi-byte
 character encoding.  I don't know how GCC would be able to tell that a byte
 with the same value as '%' in the middle of a string would actually be
 interpreted as a '%' character rather than as part of an extended multibyte
 character.  This can easily happen with the ISO 2022-JP encoding.

Michael Meissner writes:
 Yes, and the ISO standard for C says that the compiler must be told what
 locale to use when parsing string constants anyway, since the compiler
 must behave as if it did a mbtowc on the source file.

The compiler needs to know the source character set both to parse the
string literal and to translate it into the execution character set.
It doesn't need to know, nor can it generally know, the locale-dependent
character set that the standard library will use when parsing printf
format strings.

Ross Ridge



Re: Preparsing sprintf format strings

2007-10-12 Thread Ross Ridge
Ross Ridge writes:
 I don't think that's true, but regardless many systems have runtime
 character sets that are dependent on locale.  If GCC doesn't support this,
 then GCC is broken.

Geoffrey Keating writes:
 I don't think it's unreasonable to insist that you tell the compiler a
 character set that matches the one you are using at execution time for
 string literals.

It's completely unreasonable.  I should be able to put whatever byte values
I want into string literals, using octal and hexadecimal escapes if
necessary, regardless of what the locale might be at runtime or what GCC
thinks the execution character set is.  It would be absurd for code like
fprintf(f, "\xFF\xFF"); to be undefined only because GCC thinks the
execution character set is UTF-8 or ASCII.

Ross Ridge



Re: Preparsing sprintf format strings

2007-10-11 Thread Ross Ridge
Heikki Linnakangas writes:
The only features in the printf-family of functions that depends on the
locale are the conversion with thousand grouping (%'d), and glibc
extension of using locale's alternative output digits (%Id). 

The entire parsing of the format string is affected by the multi-byte
character encoding.  I don't know how GCC would be able to tell that a byte
with the same value as '%' in the middle of a string would actually be
interpreted as a '%' character rather than as part of an extended multibyte
character.  This can easily happen with the ISO 2022-JP encoding.
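
To make that concrete, here's an illustrative sketch (my example; it
assumes the C library parses the format string in an ISO 2022-JP locale):

#include <stdio.h>

int main(void)
{
    /* Between the "\x1b$B" and "\x1b(B" escape sequences, bytes are
       read in pairs as JIS X 0208 characters, so the '%' bytes in
       "%U%l!<%`" are halves of katakana characters, not conversion
       specifications; only the trailing %d is one. */
    printf("\x1b$B%U%l!<%`\x1b(B: %d\n", 42);
    return 0;
}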

Ross Ridge



Re: recent troubles with float vectors bitwise ops

2007-08-24 Thread Ross Ridge
Mark Mitchell writes:
Let's assume that the recent change is what we want, i.e., that the
answer to (1) is No, these operations should not be part of the vector
extensions because they are not valid scalar extensions.

I don't think we should assume that.  If we were to, we'd also have
to change vector casts to work like scalar casts and actually convert
the values.  (Or, like valarray, disallow them completely.)  That would
force a solution like Paolo Bonzini's to use unions instead of casts,
making it even more cumbersome.

If you look at what these bitwise operations are doing, they're taking
a floating-point vector and applying an operation (e.g. negation) to
certain members of the vector according to a (normally) constant mask.
They're really unary floating-point vector operations.  I don't think
it's unreasonable to want to express these operations using floating-point
vector types directly.  Using vector casts that behave differently than
scalar casts has a lot more potential to generate confusion than allowing
bitwise operations on vector floats does.

As I see it, there are two ways you can express these kinds of operations
without using casts that are both cumbersome and misleading.  The easy
way would be to just revert the change, and allow bitwise operations on
vector floats.  This is essentially an old-school programmer-knows-best
solution where the compiler provides operators that represent the sort
of operations generally supported by CPUs.  Even on Altivec these bitwise
operations on vector floats are meaningful and useful.

The other way is to provide a complete set of operations that would
make using the bitwise operators pretty much unnecessary, like it is
with scalar floats.  For example, you can express masked negation by
multiplying with a constant vector of -1.0 and 1.0 elements.  It shouldn't
be too hard for GCC to optimize this into an appropriate bitwise
instruction for the target.  For other operations the solution isn't
as nice.  You could implement a set of builtin functions easily enough,
but it wouldn't be much better than using target specific intrinsics.
Chances are, though, that operations are going to be missed.  For example,
I doubt anyone unfamiliar with 3D programming would've seen the need
for only negating part of a vector.
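
As a minimal sketch of the multiply idiom using GCC's vector extensions
(the v4sf typedef and function name are mine):

    typedef float v4sf __attribute__((vector_size(16)));

    /* Negate elements 0 and 2 of v; the compiler is free to lower the
       multiply to a bitwise sign-flip where that's profitable. */
    v4sf negate_xz(v4sf v)
    {
        const v4sf mask = { -1.0f, 1.0f, -1.0f, 1.0f };
        return v * mask;
    }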

(A more concise way to eliminate the need for the bitwise operations on
vector floats would be to implement either the swizzles used in 3D
shaders or array indexing on vectors.  It would require a lot of work
to implement properly, so I don't see it happening.)

Ross Ridge



Re: [RFC] try to generate FP and/or/xor instructions for SSE

2007-08-23 Thread Ross Ridge
Richard Guenther writes:
As I said - at least for AMD CPUs - it looks like you can freely
interchange the ps|pd or integer variants of the bitwise and/or
operations without a penalty.

An example in AMD's Software Optimization Guide for AMD64 Processors
suggests that you can't freely interchange them.  In the example it
gives for using XOR to negate a double-precision vector, it uses XORPD.
If PXOR, XORPS and XORPD were all interchangeable, it should have used
XORPS since it's a byte shorter than XORPD.

The guide also says:

When it is necessary to zero out an XMM register, use an
instruction whose format matches the format required by the
consumers of the zeroed register.

...

When an XMM register must be set to zero, using the appropriate
instruction helps reduce the chance of any performance penalty
later.

This advice differs from Intel's, which on Pentium 4 processors recommends
always using PXOR to clear XMM registers, as that instruction breaks
dependency chains, while the XORPS and XORPD instructions don't.
Only the newer Intel Core processors support breaking chains with all
three instructions.
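
As a hedged sketch of following the AMD advice with intrinsics (the
instructions named in the comments are typical output, not guaranteed):

    #include <emmintrin.h>  /* SSE2 intrinsics */

    /* Zero destined for float consumers: typically emits XORPS. */
    __m128  zero_for_floats(void)   { return _mm_setzero_ps(); }

    /* Zero destined for integer consumers: typically emits PXOR. */
    __m128i zero_for_integers(void) { return _mm_setzero_si128(); }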

Ross Ridge



Re: recent troubles with float vectors bitwise ops

2007-08-22 Thread Ross Ridge
tbp writes:
Apparently enough for a small vendor like Intel to propose such things
as orps, andps, andnps, and xorps.

Paolo Bonzini writes:
I think you're running too far with your sarcasm. SSE's instructions
do not go so far as to specify integer vs. floating point.  To me, ps
means 32-bit SIMD, independent of integerness

The IA-32 instruction set does distinguish between integer and
floating point bitwise operations.  In addition to the single-precision
floating-point bitwise instructions that tbp mentioned (ORPS, ANDPS,
ANDNPS and XORPS) there are both distinct double-precision floating-point
bitwise instructions (ORPD, ANDPD, ANDNPD and XORPD) and integer bitwise
instructions (POR, PAND, PANDN and PXOR).  While these operations all do
the same thing, they can differ in performance depending on the context.

Intel's IA-32 Software Developer's Manual gives this warning:

In this example: XORPS or PXOR can be used in place of XORPD
and yield the same correct result. However, because of the type
mismatch between the operand data type and the instruction data
type, a latency penalty will be incurred due to implementations
of the instructions at the microarchitecture level.

And now i guess the only sanctioned access to those ops is via
builtins/intrinsics.

No, you can do so with casts.

tbp is correct.  Using casts gets you the integer bitwise instructions,
not the single-precision bitwise instructions that are more optimal for
flipping bits in single-precision vectors.  If you want GCC to generate
better code using single-precision bitwise instructions you're now forced
to use the intrinsics.
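
A minimal sketch of the difference (typedefs and function names are
mine; the instruction choices in the comments reflect typical, not
guaranteed, code generation):

    #include <xmmintrin.h>

    typedef float v4sf __attribute__((vector_size(16)));
    typedef int   v4si __attribute__((vector_size(16)));

    /* Cast approach: GCC tends to select the integer PXOR here. */
    v4sf flip_bits_cast(v4sf a, v4sf mask)
    {
        return (v4sf)((v4si)a ^ (v4si)mask);
    }

    /* Intrinsic approach: pins the single-precision XORPS. */
    __m128 flip_bits_intrin(__m128 a, __m128 mask)
    {
        return _mm_xor_ps(a, mask);
    }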

Ross Ridge



Re: recent troubles with float vectors bitwise ops

2007-08-22 Thread Ross Ridge
Ross Ridge writes:
tbp is correct.  Using casts gets you the integer bitwise instructions,
not the single-precision bitwise instructions that are more optimal for
flipping bits in single-precision vectors.  If you want GCC to generate
better code using single-precision bitwise instructions you're now forced
to use the intrinsics.

GCC makes the problem even worse if only SSE and not SSE 2 instructions
are enabled.  Since the integer bitwise instructions are only available
with SSE 2, using casts instead of intrinsics causes GCC to expand the
operation into a long series of instructions.

If I were tbp, I'd just code all his vector operations using intrinsics.
The other responses in this thread have made it clear that GCC's vector
arithmetic operations are really only designed to be used with the Cell
Broadband Engine and other Power PC processors.

Ross Ridge



Re: recent troubles with float vectors bitwise ops

2007-08-22 Thread Ross Ridge
Ross Ridge [EMAIL PROTECTED] wrote:
 GCC makes the problem even worse if only SSE and not SSE 2 instructions
 are enabled.  Since the integer bitwise instructions are only available
 with SSE 2, using casts instead of intrinsics causes GCC to expand the
 operation into a long series of instructions.

Andrew Pinski writes:
...
Why did Intel split up these instructions in the first place, is it
because they wanted to have a seperate vector units in some cases?
I don't know and I don't care that much. 

Well, if you would rather remain ignorant, I suppose there's little
point in discussing this with you.  However, please don't try to pretend
that the vector extensions are supposed to be generic when you use
justifications like "it's how Altivec works" and "it's compatible with
a proprietary standard called C/C++ Language Extensions for Cell
Broadband Engine Architecture".  If you're going to continue to use
justifications like this and ignore the performance implications of
your changes on IA-32, then you should accept the fact that the vector
extensions are not meant for platforms that you don't know and don't care
that much about.

Ross Ridge



Re: I'm sorry, but this is unacceptable (union members and ctors)

2007-06-19 Thread Ross Ridge
Lawrence Crowl writes:
 On the specific topic of unions, there is a proposal before the
committee to extend unions in this direction.  Let me caution you
that this proposal has not been reviewed by a significant fraction
of the committee, and hence has a small chance of being accepted
and an even smaller chance of surviving unchanged.

This only supports my position.  If an active committee member can't
get their proposal reviewed by a significant fraction of the committee,
then why should an outsider even bother?  You're better off posting a
patch to gcc-patches; at least that will have a chance of being seriously
considered.

Ross Ridge



Re: I'm sorry, but this is unacceptable (union members and ctors)

2007-06-17 Thread Ross Ridge
Ross Ridge wrote:
I completely disagree.  Standards should primarily standardize existing
practice, not invent new features.  New features should be created
by people who actually want and will use the features, not by some
disinterested committee.

Robert Dewar writes:
First of all, I think you mean "uninterested" and not "disinterested";
indeed the ideal is that all committee members *should* be disinterested,
though this is not always the case.

Since it's essentially impossible to be impartial about a feature you
created, both senses of the word apply here.

The history for C here does not apply to C++ in my opinion. Adding new
features to a language like C++ is at this stage highly non-trivial in
terms of getting a correct formal definition.

Most of GCC's long list of extensions to C are also implemented as
extensions to C++, so you've already lost this battle in GNU C++.
Trying to add a new feature without an existing implementation only
makes it harder to get both a correct formal definition and something
that people will actually want to use.

Ross Ridge



Re: RFC: Make dllimport/dllexport imply default visibility

2007-06-17 Thread Ross Ridge
Daniel Jacobowitz writes:
The minimum I'd want to accept this code would be a complete and useful
example in the manual; since Mark and Danny say this happens a lot
on Windows

I don't understand how this issue can come up at all on Windows.  As far
I know, visibility is an ELF-only thing, while DLLs are a PE-COFF-only
thing.  Is there some platform that supports both sets of attributes?

Ross Ridge



Re: I'm sorry, but this is unacceptable (union members and ctors)

2007-06-16 Thread Ross Ridge
Robert Dewar writes:
The only time that it is reasonable to extend is when there are clear
signals from the standards committee that it is likely that a feature
will be added, in which case there may be an argument for adding the
feature prematurely.

I completely disagree.  Standards should primarily standardize existing
practice, not invent new features.  New features should be created
by people who actually want and will use the features, not by some
disinterested committee.

GCC has always been a place for experimenting with new features.  Many of
the new features in C99 had already been implemented in GCC.  Even in the
cases where C99 standardized features differently, I think both GCC and
Standard C benefited from the work done in GCC.

Ross Ridge



Re: MinGW, GCC Vista,

2007-05-08 Thread Ross Ridge
Mark Mitchell writes:
In my opinion, this is a GCC bug: there's no such thing as X_OK on
Windows (it's not in the Microsoft headers, or documented by Microsoft
as part of the API), and so GCC shouldn't be using it.

Strictly speaking, access() (or _access()) isn't a documented part of
any Windows ABI.  It's only documented as part of the C Runtime Library
for Visual C++, a different product.  This is an important distinction:
while MinGW should support Windows APIs as documented by Microsoft,
it's not meant to be compatible with Visual C++.  MinGW does use the same
runtime DLL as used by Visual C++ 6.0, but this is essentially just an
implementation detail, not meant as a compatibility goal.  There are a
few ways MinGW's runtime is incompatible with Visual C++ 6.0.

One of those ways is that the MinGW headers define R_OK, W_OK and X_OK.
That was probably a mistake, but in order for the MinGW runtime to
be compatible with both previous implementations and Windows Vista I
think this change makes sense.

Ross Ridge



Re: Integer overflow in operator new

2007-04-09 Thread Ross Ridge
Florian Weimer writes:
Yeah, but that division is fairly expensive if it can't be performed
at compile time.  OTOH, if __compute_size is inlined in all places,
code size does increase somewhat.

Well, I believe the assumption was that __compute_size would be inlined.
If you want to minimize code size and avoid the division then a library
function something like the following might work:

    void *__allocate_array(size_t num, size_t size, size_t max_num) {
        if (num > max_num)
            size = ~size_t(0);   /* overflow: force the allocation to fail */
        else
            size *= num;
        return operator new[](size);
    }

GCC would calculate the constant ~size_t(0) / size and pass it as the
third argument.  You'd be trading a multiply for a couple of constant
outgoing arguments, so the code growth should be small.  Unfortunately,
you'd be trading what in most cases is a fast shift and maybe an add or
two for a slower multiply.
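
For instance, under this scheme a hypothetical new Foo[n] would be
lowered to something like (names illustrative):

    Foo *p = static_cast<Foo *>(
        __allocate_array(n, sizeof(Foo), ~size_t(0) / sizeof(Foo)));

with the third argument folded to a constant at compile time.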

So long as whatever switch is used to enable this check isn't on by
default and its effect on code size and speed is documented, I don't
think it matters that much what those effects are.  Anything that works
should make the people concerned about security happy.   People more
concerned with size or speed aren't going to enable this feature.

Ross Ridge



Re: Integer overflow in operator new

2007-04-08 Thread Ross Ridge
Joe Buck writes:
 inline size_t __compute_size(size_t num, size_t size) {
     size_t product = num * size;
     return product >= num ? product : ~size_t(0);
 }

Florian Weimer writes:
I don't think this check is correct.  Consider num = 0x33333334 and
size = 6.  It seems that the check is difficult to perform efficiently
unless the architecture provides unsigned multiplication with overflow
detection, or an instruction to implement __builtin_clz.

This should work instead:

    inline size_t __compute_size(size_t num, size_t size) {
        /* num * size would wrap; request an impossible size instead
           so the allocation fails cleanly. */
        if (num > ~size_t(0) / size)
            return ~size_t(0);
        return num * size;
    }

Ross Ridge



Re: Integer overflow in operator new

2007-04-07 Thread Ross Ridge
Joe Buck writes:
If a check were to be implemented, the right thing to do would be to throw
bad_alloc (for the default new) or return 0 (for the nothrow new).

What do you do if the user has defined his own operator new that does
something else?

There are cases where the penalty for this check could have
an impact, like for pool allocators that are otherwise very cheap.
If so, there could be a flag to suppress the check.

Excessive code size growth could also be a problem for some programs.

Ross Ridge



Re: Integer overflow in operator new

2007-04-07 Thread Ross Ridge
Joe Buck writes:
If a check were to be implemented, the right thing to do would be to throw
bad_alloc (for the default new) or return 0 (for the nothrow new).

Ross Ridge writes:
What do you do if the user has defined his own operator new that does
something else?

Gabriel Dos Reis writes:
More precisely?

Well, for example, all the other things that a new_handler can do,
like throwing an exception derived from bad_alloc or calling exit().
In addition, any number of side effects are possible, like printing
error messages or setting flags.

Those programs willing to do anything to avoid imagined or perceived
excessive code size growth may use the suggested switch.

The code size growth would be real, and there are enough applications
out there that would consider any unnecessary growth in code excessive.
The switch would be required both for that reason, and for Standard
conformance.

Ross Ridge



Re: Integer overflow in operator new

2007-04-07 Thread Ross Ridge
[EMAIL PROTECTED] (Ross Ridge) writes:
 Well, for example, all the other things that a new_handler can do,
 like throwing an exception derived from bad_alloc or calling exit().
 In addition, any number of side effects are possible, like printing
 error messages or setting flags.

Gabriel Dos Reis writes:
I believe you're confused about the semantics.  
The issue here is that the *size of object* requested can be
represented.  That is independent of whether the machine has enough
memory or not.  So, new_handler is a red herring

The issue is what GCC should do when the calculation of the size of
memory to allocate with operator new() results in unsigned wrapping.
Currently, GCC's behavior is standard conforming but probably isn't the
expected result.  If GCC does something other than what operator new()
does when there isn't enough memory available then it will be doing
something that is both non-conforming and probably not what was expected.

Ross Ridge



Re: Integer overflow in operator new

2007-04-07 Thread Ross Ridge
Joe Buck writes:
Consider an implementation that, when given

Foo* array_of_foo = new Foo[n_elements];

passes __compute_size(elements, sizeof Foo) instead of n_elements*sizeof Foo
to operator new, where __compute_size is

inline size_t __compute_size(size_t num, size_t size) {
    size_t product = num * size;
    return product >= num ? product : ~size_t(0);
}

Yes, doing something like this instead would largely answer my concerns.

This counts on the fact that any operator new implementation has to fail
when asked to supply every single addressable byte, less one.

I don't know if you can assume ~size_t(0) is equal to the number of
addressable bytes, less one.  A counter example would be 16-bit 80x86
compilers where size_t is 16-bits and an allocation of 65535 bytes can
succeed, but I don't know if GCC supports any targets where something
similar can happen.

I haven't memorized the standard, but I don't believe that this
implementation would violate it.  The behavior differs only when more
memory is requested than can be delivered.

It differs because the actual amount of memory requested is the result
of the unsigned multiplication of n_elements * sizeof Foo, using your
example above.  Since the result of this calculation isn't undefined,
even if it overflows, there's no room for the compiler to calculate
a different value to pass to operator new().

Ross Ridge



Re: RFC: Enable __declspec for Linux/x86

2007-04-03 Thread Ross Ridge
Joe Buck writes:
If the Windows version of GCC has to recognize __declspec to function
as a hosted compiler on Windows, then the work already needs to be done
to implement it.

Well, I'm kinda surprised that the Windows version of GCC recognizes
__declspec.  The implementation is just a simple macro, and could've just
as easily been implemented in a runtime header, as the MinGW runtime does.

 So what's the harm in allowing it on other platforms?

Probably none, but since the macro can be defined on the command line
with -D'__declspec(x)=__attribute__((x))', defining it by default on
other platforms is only a minor convenience.

If it makes it easier for Windows programmers to move to free compilers
and OSes, isn't that something that should be supported?

I suppose that would argue for unconditionally defining the macro
regardless of the platform.

Ross Ridge



Re: i386: Problems with references to import symbols.

2007-03-21 Thread Ross Ridge
Richard Henderson writes:
Dunno.  One could also wait to expand *__imp_foo, for functions,
until expanding the function call.  And then this variable would
receive the address of the import library thunk.

What does VC++ do? 

It seems to always use *__imp_foo except when initializing a statically
allocated variable in C.  In that case it uses _foo, unless compiling
with extensions disabled (/Za) in which case it generates a similar error
as we do.  In C++ it uses dynamic initialization like Dave Korn suggested.

 I'm mostly wondering about what pointer equality guarantees we can make.

It looks like MSC requires that you link with the static CRT libraries
if you want strict standard conformance.

Ross Ridge



Re: Building mainline and 4.2 on Debian/amd64

2007-03-19 Thread Ross Ridge
Joe Buck writes:
This brings up a point: the build procedure doesn't work by default on
Debian-like amd64 distros, because they lack 32-bit support (which is
present on Red Hat/Fedora/SuSE/etc distros).  Ideally this would be
detected when configuring.

The Debian-like AMD64 system I'm using has 32-bit support, but the build
procedure breaks anyways because it assumes 32-bit libraries are in lib
and 64-bit libraries are in lib64.  Instead, this Debian-like AMD64
system has 32-bit libraries in lib32 and 64-bit libraries in lib.

Ross Ridge



Re: symbol names are not created with stdcall syntax: MINGW, (GCC) 4.3.0 20061021

2007-03-12 Thread Ross Ridge
Ross Ridge wrote:
 Any library that needs to be able to be called from VisualBasic 6 or some
 other stdcall-only environment should explicitly declare its exported
 functions with the stdcall calling convention.

Tobias Burnus writes:
 Thus, if I understood you correctly, you recommend that we add, e.g.,
 pragma support to gfortran with a pragma which adds the
 __attribute__((stdcall)) to the tree?

I have no idea what would be the best way to do it in Fortran, but yes,
something that would add the stdcall attribute.

Ross Ridge



Re: symbol names are not created with stdcall syntax: MINGW, (GCC) 4.3.0 20061021

2007-03-10 Thread Ross Ridge
Danny Smith writes:
Unless you are planning to use a gfortran dll in a VisualBasic app, I
can see little reason to change from the default C calling convention

FX Coudert writes:
That precise reason is, as far as I understand, important for some
people. Fortran code is used for internal routines, built into shared
libraries that are later plugged into commercial apps.

Well, perhaps things are different in Fortran, but the big problem with
using -mrtd in C/C++ is that it changes the default calling convention for
all functions, not just those that are meant to be exported.  While most of
MinGW's headers declare the calling convention of functions explicitly,
not all of them do.

How hard do you think it would be to implement a -mrtd-naming option
(or another name) to go with -mrtd and add name decorations

It wouldn't be too hard, but I don't think it would be a good idea to
implement.  It would mislead people into thinking the option might be
useful, and -mrtd fools enough people as it is.  Adding name decorations
won't make it more useful.  From the examples I've seen, VisualBasic 6
has no problem calling DLL functions exported without @n suffixes.

Any library that needs to be able to be called from VisualBasic 6 or some
other stdcall-only environment should explicitly declare its exported
functions with the stdcall calling convention.
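
A minimal sketch of such a declaration (the function name is mine):

    __attribute__((dllexport, stdcall)) int my_export(int arg);

Declaring the convention per function leaves the default convention,
and therefore the rest of the headers, untouched.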

Ross Ridge



Re: I need some advice for x86_64-pc-mingw32 va_list calling convention (in i386.c)

2007-02-26 Thread Ross Ridge
Kai Tietz writes:
But I still have a problem with va-argument passing.  The MS compiler
reserves stack space for the register arguments of any va-callable method.

Passing arguments to functions with variable arguments isn't a special
case here.  According to Microsoft's documentation, you always need to
allocate space for 4 arguments.

The only thing different you need to do with functions taking variable
arguments (and unprototyped functions) is to pass floating point values
both in the integer and floating point registers for that argument.

Ross Ridge



Re: bootstrap failure on HEAD

2006-11-12 Thread Ross Ridge
Dave Korn writes:
Is it just me, or does anyone else get this?  I objdump'd and diff'd the
stage2 and stage3 versions of cfg.o and it seems to have developed a habit of
inserting 'shrd'/'shld' opcodes:

It looks to me like the stage3 version with the shrd/shld is correct
and it's the stage2 version that's missing opcodes.  In both versions
the source and destination of the shift are a 64-bit pair of registers,
but the stage2 version uses 32-bit shifts, while the stage3 version uses
64-bit shifts.  The code in the first chunk looks like it's the result
of the expansion of the RDIV macro with the dividend being a gcov_type
value and the divisor being 65536.  It looks like gcov_type is 64 bits,
so it should be using 64-bit arithmetic.

 although disturbingly enough there's a missing 'lea' too:

It's a NOP.  Probably inserted by the assembler because of an alignment
directive.

Ross Ridge



Re: strict aliasing question

2006-11-11 Thread Ross Ridge
Howard Chu wrote:
 extern void getit( void **arg );

 main() {
     union {
         int *foo;
         void *bar;
     } u;

     getit( &u.bar );
     printf("foo: %x\n", *u.foo);
 }

Rask Ingemann Lambertsen wrote:
 As far as I know, memcpy() is the answer:

You don't need a union or memcpy() to convert the pointer types.  You can
solve the void ** aliasing problem with just a cast:

    void *p;
    getit(&p);
    printf("%d\n", *(int *)p);

This assumes that getit() actually writes to an int object and returns
a void * pointer to that object.  If it doesn't then you have another
aliasing problem to worry about.  If it writes to the object using some
other known type, then you need two casts to make it safe:

    void *p;
    getit(&p);
    printf("%d\n", (int)*(long *)p);

If it writes to the object using an unknown type then you might be able to
use memcpy() to get around the aliasing problem, but this assumes you
know that the two types are compatible at the bit level:

    void *p;
    int n;
    getit(&p);
    memcpy(&n, p, sizeof n);
    printf("%d\n", n);

The best solution would be to fix the interface so that it returns the
pointer types it actually uses.  This would make it typesafe and you
wouldn't need to use any casts.  If you can't fix the interface itself,
the next best thing would be to create your own wrappers which put all
the nasty casts in one place:

    int sasl_getprop_str(sasl_conn_t *conn, int prop, char const **pvalue)
    {
        assert(prop == SASL_AUTHUSER || prop == SASL_APPNAME || ...);
        void *tmp;
        int r = sasl_getprop(conn, prop, &tmp);
        if (r == SASL_OK)
            *pvalue = (char const *) tmp;
        return r;
    }

Unfortunately, there are aliasing problems in the Cyrus SASL source that
can still come around and bite you once LTO arrives, no matter what you
do in your own code.  You might want to see if you can't get them to
change undefined code like this:

    *(unsigned **)pvalue = &conn->oparams.maxoutbuf;

into code like this:

    *pvalue = (void *) &conn->oparams.maxoutbuf;

Ross Ridge



Re: Threading the compiler

2006-11-11 Thread Ross Ridge
Ross Ridge wrote:
Umm... those 80 processors that Intel is talking about are more like the
8 coprocessors in the Cell CPU. 

Michael Eager wrote:
No, the Cell is an asymmetrical (vintage 2000) architecture.

The Cell CPU as a whole is asymmetrical, but I'm only comparing the
design to the 8 identical coprocessors (of which only 7 are enabled in
the CPU used in the PlayStation 3).

Intel & AMD have announced that they are developing large multi-core
symmetric processors.  The timelines I've seen say that the number of
cores on each chip will double every year or two.

This doesn't change the fact that SMP systems don't scale well past
16 processors or so.  To go beyond that you need a different design.
Clustering and NUMA have been ways of solving the problem outside the
chip.  Intel's plan for solving it inside the chip involves giving each
of the 80 cores its own 32 MB of SRAM and only connecting each core to
its immediate neighbours.  This is similar to the Cell SPEs.  Each has
256K of local memory and they're all connected together in a ring.

 Moore's law hasn't stopped.

While Moore's Law may still be holding on, bus and memory speeds aren't
doubling every two years.  You can't design an 80-core CPU like a 4-core
CPU with 20 times as many cores.  Having 80 processors all competing over
the same bus for the same memory won't work.  Neither will make -j80.
You need to do more than just divide up the work between different
processes or threads.  You need to divide up the program and data into
chunks that will fit into each core's local memory and orchestrate
everything so that the data propagates smoothly between cores.

 The number of gates per chip doubles every 18 months.

Actually, it's closer to doubling every 24 months, and Gordon
Moore never said it would double every 18 months.  Originally, in 1965,
he said that the number of components doubled every year; in 1975, after
things slowed down, he revised it to doubling every two years.

Ross Ridge



Re: Threading the compiler

2006-11-10 Thread Ross Ridge
Mike Stump writes:
We're going to have to think seriously about threading the compiler. Intel
predicts 80 cores in the near future (5 years). [...] To use this many
cores for a single compile, we have to find ways to split the work. The
best way, of course is to have make -j80 do that for us, this usually
results in excellent efficiencies and an ability to use as many cores
as there are jobs to run.

Umm... those 80 processors that Intel is talking about are more like the
8 coprocessors in the Cell CPU.  It's not going to give you an 80-way
SMP machine that you can just make -j80 on.  If that's really your
target architecture you're going to have to come up with some really
innovative techniques to take advantage of it in GCC.  I don't think
working on parallelizing GCC for 4- and 8-way SMP systems is going to
give you much of a head start.  Which isn't to say it wouldn't be a
worthy enough project in its own right.

Ross Ridge



Re: Why doesn't libgcc define _chkstk on MinGW?

2006-11-04 Thread Ross Ridge
Ross Ridge wrote:
There are other MSC library functions that MinGW doesn't provide, so
libraries may not link even with a _chkstk alias.

Mark Mitchell wrote:
Got a list?

Probably the most common missing symbols, using their assembler
names are:

__ftol2
@__security_check_cookie@4
___security_cookie

These are newer symbols in the MS CRT library and also cause problems
for Visual C++ 6.0 users.  I've worked around the missing security cookie
symbols by providing my own stub implementation, but apparently newer
versions of the Platform SDK include a library that fully implements these.
I'm not sure how _ftol2 is supposed to be different from _ftol, but
since I use -ffast-math anyways, I've just used the following code
as a workaround:

long _ftol2(double f) { return (long) f; }

Looking at an old copy of MSVCRT.LIB (c. 1998) other missing symbols
that might be a problem include:

T __alldiv [I]
T __allmul [I]
T __alloca_probe [I][*]
T __allrem [I]
T __allshl [I][*]
T __allshr [I]
T __aulldiv [I]
T __aullrem [I]
T __aullshr [I]
A __except_list [I][*]
T __matherr [D]
T __setargv [D]
T ___setargv [X]
A __tls_array [I]
B __tls_index [I]
R __tls_used [I]
T __wsetargv [D]

[D] Documented external interface
[I] Implicitly referenced by the MSC compiler
[X] Undocumented external interface
[*] Missing symbols I've encountered

There are other problems related to linking that can make an MSC-compiled
static library incompatible, including not processing MSC initialization
and termination sections, no support for thread-local variables, and
broken COMDAT section handling.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-18 Thread Ross Ridge
Daniel Berlin writes:
Please go away and stop trolling.

I'm not the one who's being rude and abusive.

If your concern is function pointers or global functions, you can
never eliminate any global function, unless your application doesn't
call dlopen, or otherwise load anything dynamically, including through
shared libs.

I hope that doesn't include global functions like list<int>::sort() and
list<long>::sort(), otherwise your optimization is pretty much useless.
If your optimization does merge such functions then you're still left
with the problem that their member function pointers might be compared
in another compilation unit.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-16 Thread Ross Ridge
Ross Ridge writes:
No, and I can't see how you've come up with such an absurd
misinterpretation of what I said.  As I said clearly and explicitly,
the example I gave was where you'd want to use function merging.

Daniel Berlin writes:
Whatever.  Why would you turn on function merging if you are trying to
specifically get the compiler to produce different code for your
functions than it did before?

Because, as I already said, you want to merge the functions that happen
to be the same.  You don't want to merge the ones that aren't the same.
Sometimes using different compiler options (eg. for CPU architecture)
generates different code, sometimes it doesn't.  If you could always
predict the exact code the compiler was going to generate you might
as well write your code in assembly.

As an FYI, you already have this situation with linkonce functions.

No, linkonce functions get merged because they have the same name.

I think this is best done by the linker, which
can much more reliably compare the contents of functions to see if they
are the same.

No it can't. It has no idea what a function consists of other than a
bunch of bytes, in pretty much all cases.  ... Stupid byte
comparisons of functions generally won't save you anything truly
interesting.

Microsoft's implementation has proven that stupid byte comparisons can
generate significant savings.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-16 Thread Ross Ridge
Gabriel Dos Reis writes:
Not very long ago I spoke with the VC++ manager about this, and he
said that their implementation currently is not conforming -- but
they are working on it.  The issue has to do with f<int> and f<long>
being required to have different addresses -- which is violated by their
implementation.

Yes, this issue has already been mentioned in this thread and is a problem
regardless of how you compare functions to find out if they are the same.
The compiler also needs to be able to detect when its safe to merge
functions that are identical.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-16 Thread Ross Ridge
Ross Ridge writes:
Microsoft's implementation has proven that stupid byte comparisons can
generate significant savings.

Daniel Berlin writes:
No they haven't.

So Microsoft and everyone who says they've got significant savings using
it is lying?

But have fun implementing it in your linker, and still making it safe
if that's what you really want.
I'm not going to do that, and I don't believe it is a good idea.

I'm not asking you to do anything.  I'm just telling you that I don't
think your idea is any good. 

Ross Ridge



Re: Merging identical functions in GCC

2006-09-16 Thread Ross Ridge
Daniel Berlin writes:
Do you really want me to sit here and knock down every single one of
your arguments?

Why would you think I would've wanted your "No, it isn't" responses
instead?

Your functions you are trying to optimize for multiple cpu
types and compiled with different flags may be output as linkonce
functions.  The linker is simply going to pick one, regardless of what
CPU architecture or assembly it generated...

No, in the example I gave, the functions have different names.

The fact is that Microsoft's implementation rarely generates
significant savings over that given by linkonce functions, and when
it does, it will in no way compare to anything that does *more* than
stupid byte comparisons will give you.

No, linkonce function discarding is always done by the Microsoft
toolchain and can't be disabled.  The reported savings are the result of
comparing the results of enabling and disabling identical COMDAT folding.
I don't see how your intelligent hashing can do significantly better
except by merging functions that aren't really identical. 

That's nice.  It's the only way to do it sanely and correctly in all
cases, without having to teach the linker how to look at code, or to
control the linker (which we don't on some platforms), and output a
side channel explaining what it is allowed to eliminate, at which
point, you might as well do it in the compiler!

How does hashing the RTL and using that as the COMDAT label solve this
problem?  You're telling the linker you know it's safe to merge when you
don't know if the function's address is compared in another compilation
unit or not.

You can believe what you like about the idea.  Until you are willing
to implement something *you* believe will help, or at the least
explain how you foresee it being done safely (which Microsoft
doesn't!), it's very hard to take you seriously

As I've already said, it can be made safe by communicating to the linker
which functions have had their address taken.  Yes, this requires special
support from the linker, but then so has linkonce on some platforms.
If that special support isn't available you're still left with an unsafe
but very useful optimization for applications that don't compare function
pointers.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-15 Thread Ross Ridge
Ian Lance Taylor wrote:
I think Danny has a 75% implementation based on hashing the RTL for a
section and using that to select the COMDAT section signature.

I don't think this is a good idea.  With different compiler options the
same RTL can generate different assembly instructions.  Consider the case
of compiling the same function multiple times with different names and
different CPU architectures selected.  You'd actually want the linker
to merge the functions that ended up having the same assembly, but not
the ones with the same RTL but different assembly.

Also, I don't think it's safe if you merge only functions in COMDAT
sections.

Consider:

    #include <assert.h>

    template <class T> T foo(T a) { return a; }
    template <class T> T bar(T a) { return a; }

    int
    main() {
        assert((int (*)(int)) foo<int> != (int (*)(int)) bar<int>);
    }

Both foo<int> and bar<int> get put in their own COMDAT sections and their
RTL and assembly are the same, but it's not safe to merge them.

Simply merging identical COMDAT sections would have to be optional and
disabled by default, as Michael Popov said at the start of this thread.
The only way I can see to do it safely would be to emit some sort of
instruction not to merge a function when the compiler sees that its
address is taken.

Ross Ridge



Re: Merging identical functions in GCC

2006-09-15 Thread Ross Ridge

Ross Ridge writes:
I don't think this is a good idea.  With different compiler options the
same RTL can generate different assembly instructions.  Consider the case
of compiling the same function multiple times with different names and
different CPU architectures selected.  You'd actually want the linker
to merge the functions that ended up having the same assembly, but not
the ones with the same RTL but different assembly.

Daniel Berlin writes:
So basically you are saying if you don't know what you are doing, or
know you don't want to use it, you shouldn't be using it.

No, and I can't see how you've come up with such an absurd
misinterpretation of what I said.  As I said clearly and explicitly,
the example I gave was where you'd want to use function merging.

(The current hash actually takes into account compiler options as a
starting value for the hash, btw!)

Well, then that brings up the other problem I have with this: figuring
out exactly which options and which parts of the RTL should be hashed
seems to be too error prone.  I think this is best done by the linker, which
can much more reliably compare the contents of functions to see if they
are the same.

Ross Ridge



Re: does gcc support multiple sizes, or not?

2006-08-17 Thread Ross Ridge
Mark Mitchell wrote:
I think you really have to accept that the change you want to make goes
to a relatively fundamental invariant of C++.

I don't see how you can call this a relatively fundamental invariant
of C++, given how various C++ implementations have supported multiple
pointer sizes for much of the history of C++.  Perhaps you could argue
that Standard C++ made a fundamental change to the language, but I don't
think so.  The original STL made specific allowances for different memory
models and pointer types, and this design, with its otherwise unnecessary
pointer and size_type types, was incorporated into the standard.
I think the intent of the (T *)(U *)(T *)x == (T *)x invariant was
only to limit the standard pointer types, not to make non-standard
pointer types of different size fundamentally not C++.  (Unlike, say,
the fundamental changes the standard made to how templates work...)

Ross Ridge



Re: RFC: __cxa_atexit for mingw32

2006-06-28 Thread Ross Ridge
Mark Mitchell writes:
As a MinGW user, I would prefer not to see __cxa_atexit added to MinGW.
I really want MinGW to provide the ability to link to MSVCRT: nothing
more, nothing less.

Well, even Microsoft's compiler doesn't just link to MSVCRT.DLL (or its
successors); a certain part of the C runtime is implemented as static objects
in MSVCRT.LIB.  MinGW has to provide equivalent functionality in its
static runtime library, or at least what GCC doesn't already provide in
its runtime library.

 ... I think it would be better to adapt G++ to use whatever method
Microsoft uses to handle static destructions.

I've looked into handling Microsoft's static constructors correctly when
linking MSC-compiled objects with MinGW, and I don't think it's an either-or
situation.  MinGW can handle both its own style of construction and
Microsoft's at the same time.  I didn't look into how Microsoft handles
destructors, though, because the particular objects I was concerned
about didn't seem to use them.

Ultimately, I would like to see G++ support the Microsoft C++ ABI --
unless we can convince Microsoft to support the cross-platform C++ ABI. :-)

Hmm... I'm not sure which would be easier. 

btw. regarding Microsoft's patents, Google turned up this link:

http://www.codesourcery.com/archives/cxx-abi-dev/msg00097.html

That message is from 1999, so I wouldn't be surprised if Microsoft has
filed a bunch of new C++ ABI patents since then.

Ross Ridge



Re: why are we not using const?

2006-06-27 Thread Ross Ridge
Andrew Pinski wrote:
Stupid example where a const argument can change:
tree a;
int f(const tree b)
{
  TREE_CODE(a) = TREE_CODE (b) + 1;
  return TREE_CODE (b);
}

You're not changing the constant argument b, just what b might point
to.  I don't think there are any optimizing opportunities for arguments
declared as const, as opposed to arguments declared as pointing to const.
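
A quick sketch of that distinction in plain C (names mine):

    void f(int *const p,    /* p itself is const; *p may be modified  */
           const int *q);   /* q may be reassigned; *q may not change */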

Ross Ridge



Re: Coroutines

2006-06-19 Thread Ross Ridge
Ross Ridge wrote:
Hmm?  I don't see how the Lua-style coroutines you're looking at are any
more lightweight than what Maurizio Vitale is looking for.  They're actually
more heavyweight because you need to implement some method of returning
values to the coroutine being yielded to.

Dustin Laurence wrote:
I guess that depends on whether the userspace thread package in question
provides for a return value as pthreads does.

Maurizio Vitale clearly wasn't looking for pthreads.

 In any case, coroutines don't need a scheduler, even a cooperative one.

He also made it clear he wanted to schedule his threads himself, just like
you want to do.  In fact, what he seems to be trying to implement are
true symmetric coroutines.

Ross Ridge



Re: Coroutines

2006-06-18 Thread Ross Ridge
Maurizio Vitale wrote:
 I'm looking at the very same problem, hoping to get very lightweight  
 user-level threads for use in discrete event simulation.

Dustin Laurence wrote:
Yeah, though even that is more heavyweight than coroutines, so your job
is harder than mine.

Hmm?  I don't see how the Lua-style coroutines you're looking at are any
more lightweight than what Maurizio Vitale is looking for.  They're actually
more heavyweight because you need to implement some method of returning
values to the coroutine being yielded to.

Ross Ridge



Re: TLS on windows

2006-06-08 Thread Ross Ridge
Ross Ridge wrote:
 Actually, the last one I haven't done yet.  I've just been using a linker
 script to do that, but it should be in a library so the TLS directory
 entry isn't created if the executable doesn't use TLS.

Richard Henderson wrote:
 You can also create this in the linker, without a library.
 Not too difficult, since you've got to do that to set the
 bit in the PE header anyway.

Fortunately, the linker already supports setting the TLS directory entry
in the PE header if a symbol named __tls_used exists.  Section relative
relocations are also already supported (for DWARF, I think), I just
needed to add the syntax to gas.

Ross Ridge



Re: [MinGW] Set NATIVE_SYSTEM_HEADER_DIR relative to configured prefix

2006-06-05 Thread Ross Ridge
Ranjit Mathew wrote:
 Danny, I'm using the same configure flags that you have used for GCC
3.4.5 MinGW release (*except* for --prefix=/mingw, which is something
like --prefix=/j/mingw/mgw for me), but the GCC I get is not relocatable
at all, while I can put the MinGW GCC 3.4.5 release anywhere on the
filesystem and it still works. :-(

The GCC I get from my native MinGW build of the trunk is relocatable:

e:\util\mygcc.new\bin\gcc -v -E -o nul -x c x.c
Using built-in specs.
Target: mingw32
Configured with: ../gcc/configure --prefix=/src/gcc/runtime --target=mingw32 
--host=mingw32 --enable-languages=c,c++ --enable-threads=win32 
--with-win32-nlsapi=unicode --enable-bootstrap --disable-werror 
--with-ld=/src/gcc/runtime/bin/ld --with-as=/src/gcc/runtime/bin/as
Thread model: win32
gcc version 4.2.0 20060513 (experimental)
 e:/util/mygcc.new/bin/../libexec/gcc/mingw32/4.2.0/cc1.exe -E -quiet -v 
-iprefix e:\util\mygcc.new\bin\../lib/gcc/mingw32/4.2.0/ x.c -o nul.exe 
-mtune=i386
ignoring nonexistent directory 
e:/util/mygcc.new/bin/../lib/gcc/mingw32/4.2.0/../../../../mingw32/include
ignoring nonexistent directory /src/gcc/runtime/include
ignoring nonexistent directory /src/gcc/runtime/include
ignoring nonexistent directory /src/gcc/runtime/lib/gcc/mingw32/4.2.0/include
ignoring nonexistent directory /src/gcc/runtime/mingw32/include
ignoring nonexistent directory /mingw/include
#include "..." search starts here:
#include <...> search starts here:
 e:/util/mygcc.new/bin/../lib/gcc/mingw32/4.2.0/../../../../include
 e:/util/mygcc.new/bin/../lib/gcc/mingw32/4.2.0/include
End of search list.

It picks up the system include directory without a problem.  What
exactly is the error you're getting that indicates that your compiled
version of GCC isn't relocatable?

Ross Ridge



Re: [MinGW] Set NATIVE_SYSTEM_HEADER_DIR relative to configured prefix

2006-06-05 Thread Ross Ridge
Ross Ridge wrote:
The GCC I get from my native MinGW build of the trunk is relocatable:

Hmm... I should have sent that to gcc-patches, sorry.

Ross Ridge



Re: TLS on windows

2006-06-04 Thread Ross Ridge
FX Coudert wrote:
 Now, for an idea of how much work it represents... perhaps someone
here can tell us?

It's not too hard but it requires changing GCC and binutils, plus a
bit of library support.  In my implementation (more or less finished,
but I haven't had time to test it yet), I did the following:

- Used the existing __thread support in the front-end.  Silently
  ignore the ELF TLS models, because Windows only has one model.
- Added target specific (cygming) support for __attribute__((thread))
  aka __declspec(thread) for MSC compatibility.
- Created a legitimize_win32_tls_address() to replace
  legitimize_tls_address() in i386.c.  It outputs RTL like:
      (set (reg:SI tp) (mem:SI (unspec [(const_int 44)] WIN32_TIB)))
      (set (reg:SI index) (mem:SI (symbol_ref:SI "__tls_index__")))
      (set (reg:SI base) (mem:SI (plus:SI (reg:SI tp)
                                          (mult:SI (reg:SI index)
                                                   (const_int 4)))))
      (plus:SI (reg:SI base)
               (const:SI (unspec:SI [(symbol_ref:SI "foo")] SECREL)))
- Handled the WIN32_TIB unspec by outputting %fs:44 and the
  SECREL unspec by outputting foo`SECREL.  I couldn't use
  foo@SECREL because @ is valid in identifiers with PECOFF.
- Support .tls sections in PECOFF by creating an
  i386_pe_select_section() based on the generic ELF version.
- Added an -mfiber-safe-tls target specific option that makes
  the references to the WIN32 TIB non-constant.
- Modified gas to handle foo`SECREL, based on the ELF support
  for @ relocations
- Fixed some problems with TLS handling in the PECOFF linker
  script
- Created an object file that defines the __tls_used structure
  (and thus the TLS directory entry) and __tls_index__.

Actually, the last one I haven't done yet.  I've just been using a linker
script to do that, but it should be in a library so the TLS directory
entry isn't created if the executable doesn't use TLS.
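
At the source level the result is a hedged sketch like this (variable
names mine; the second spelling relies on the MSC-compatibility
attribute mentioned above):

    __thread int counter1;             /* GNU C spelling */
    __declspec(thread) int counter2;   /* MSC-compatible spelling */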

Ross Ridge



Re: mingw32 subtle build failure

2006-05-31 Thread Ross Ridge
FX Coudert wrote:
 -B/mingw/i386-pc-mingw32/bin/

This looks wrong, it should be /mingw/mingw32/bin.  Putting a copy of
as and ld in /mingw/i386-pc-mingw32/bin might work around your problem.

Ross Ridge



Re: Segment registers support for i386

2006-05-31 Thread Ross Ridge
Remy Saissy wrote:
I've looked for a target specific callback to modify but I've found
nothing, even in the gcc internals info pages. Do you mean I would
have to modify some code outside of the i386 directory ? Or maybe to
add such a callback if it doesn't exist ;)

You'd have to modify code in the main GCC directory, probably a lot
of code.  Since it's target dependent, you'd need to implement it using
a hook or hooks.

 In which file is the tree to RTL conversion code located?

There are several files that do this job.  See the internals
documentation.

Does it mean that an RTL expression which uses reg: forces gcc to use a
particular pseudo register ?

Pseudo registers aren't real registers.  They either get changed to real
hard registers, or to memory references to stack slots.  See the internals
documentation for more details.

Ross Ridge



Re: Segment registers support for i386

2006-05-29 Thread Ross Ridge
Remy Saissy wrote:
if I understand well, to make gcc generate rtx according to an
__attribute__((far("fs"))) on a pointer I only have to add or modify
rtx in the i386.md file and add an UNSPEC among the constants?

No, the work you need to do on the backend, adding an UNSPEC constant to
i386.md and writing code to handle the UNSPEC in i386.c, is just the
easy part.

 What I understand is that there are two kinds of management for attributes:

Attributes are handled in various different ways depending on what the
attribute does.  To handle your case correctly, you'd have to change how
the tree to RTL conversion generates RTL address expressions whenever
a pointer with the far attribute is dereferenced.  This is probably
going to be a lot of work.

 Therefore, I can consider the following relationship, with each part
 of the RTL expression corresponding to a part of the declaration:
  (mem:SI (plus:SI (unspec:SI [(reg:HI fs)] SEGREF) (reg:SI var)))
  int * __attribute__((far("fs"))) p;

No, that's not what the RTL expression represents.  Declarations aren't
represented in RTL.  The example RTL expression I gave is just an
expression, not a full RTL instruction.  It's something that could be
used as the memory operand of an instruction.  The RTL expression I gave
would correspond to a C expression (not a statement) like this:

    *(int * __attribute__((far("fs")))) var

 does (reg:HI fs) care about the type of the parameter fs ?

See the GCC Internals documentation.  In my example, since I don't know
what actual hard register number you assigned to the FS segment
register, I just put "fs" in the place where the actual register number
would appear.  Similarly, the "var" in (reg:SI var) represents
the number of the pseudo-register GCC would allocate for an automatic
variable named "var".

 how does gcc recognize such an expression ? 

Since this expression is a memory operand, it's recognized by the
GO_IF_LEGITIMATE_ADDRESS() macro.  In the i386 port, that's implemented
by legitimate_address_p() in i386.c.

Ross Ridge



Re: Compiling files not encoded with system settings

2006-05-24 Thread Ross Ridge
Nicolas De Rico wrote:
 The file hi-utf16.c, created with Notepad and saved in unicode,
contains a BOM which is, in essence, a small header at the beginning of
the file that indicates the encoding.

It's not a header that indicates the encoding.  It's a header that
indicates the byte order of the 16-bit values that follow when the
encoding is already known to be UTF-16.  When the encoding is known
to be UTF-16LE or UTF-16BE there shouldn't be any BOM present at the
start of a C file, since a BOM in the correct byte order is actually
the Unicode zero-width non-breaking space character, which isn't valid
as the first character in a C file.  Similarly, there shouldn't be a
BOM at the start of a UTF-8 C file, especially since UTF-8 encoded
files don't have a byte order.

The presence of what looks to be a UTF-16 BOM can be used as part
of a heuristic to guess the encoding of a file, but I don't think it's a
good idea for GCC to be guessing the encoding of files.

Of course, stdio.h is stored in UTF-8 on the system so trying to convert
it from UTF-16 will fail right away.

It would probably be more accurate to describe stdio.h as an ASCII file.

Ross Ridge


