Richard L. Hamilton wrote:
> I haven't read enough on the ELF format to be able to answer this myself:
> why can't one process an ELF file in such a way as to re-create it with
> a particular string table enlarged sufficiently to hold whatever one wants
> (not needing to reserve space for patching string table entries in the 
> binary)?
> 
> For that matter, if string table contents are only pointed at by structures
> that are part of the ELF format (and not by arbitrary text or data section
> contents), then why isn't it possible (if tedious) to determine whether there
> are any unused bytes in a string table?  That is, if one can find the size of
> the string table, and everything pointing into it should be findable by
> navigating the various data structures, and all strings are null-terminated,
> then it should be possible to determine how many times every byte in the
> string table falls within one or more references, and what those references
> are.
> 
> Don't get me wrong; being able to fix run paths at all is much needed.  But
> I was hoping for either a truly smart ELF editor that could reconstitute an
> ELF file with all necessary adjustments without needing to reserve space
> for such purposes (and with the potential flexibility to make other
> interesting adjustments), or alternatively, something with ld.so.1 and crle such
> that one could set specific overriding run paths for specific execnames
> (which would at least centralize all such overrides and avoid the need to
> hack up binaries, making their checksums, md5sums, or signatures invalid).


Hi Richard,

    The basic answer is that the information needed to reverse a
link doesn't exist in the file. The linker takes the inputs,
combines sections, creates references to things by their
relative distance from each other, processes relocations, and
so forth, and then creates a new output. This new file contains the
information needed by other linker passes to use the results, but the
information about how the object was put together is gone, and can't
be reliably reversed.

As Rod said when I asked him exactly the same question a year
ago: "How would it do that?"

To oversimplify, there is no trail of bread crumbs to follow back.

This is partly for efficiency/space reasons, but also to contain
the complexity of the linking format. After all, we can usually relink
an object in the normal way to achieve these goals.  The need
to take apart, modify, and reassemble an object without starting with the
sources is a fringe case, and maybe not worth a big infrastructure to
support.

So, we can't increase the size of the string table because that
would break all of the things in the file that refer to other
things by relative offset, and we lack the information needed
to find and fix those offsets.
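
To make that concrete, here is a toy sketch in Python. It is not real ELF:
the layout, the 4-byte displacement field, and the helper names (build_image,
follow, grow_strtab) are all invented for illustration. It only shows how a
reference stored as a relative distance goes stale when something between
the two ends grows, if nothing records that the reference exists.

```python
import struct

STRTAB = b"\0/usr/lib\0"   # toy string table (hypothetical contents)
TARGET = b"\x2a"           # the byte our toy reference is supposed to reach

def build_image():
    # Toy layout: [4-byte displacement][string table][target byte].
    # The displacement is the distance from the end of the field to the target.
    disp = len(STRTAB)
    return struct.pack("<i", disp) + STRTAB + TARGET

def follow(image):
    # Resolve the relative reference stored at the start of the image.
    (disp,) = struct.unpack_from("<i", image, 0)
    return image[4 + disp]

def grow_strtab(image, extra):
    # Naively enlarge the string table without fixing the displacement,
    # as an editor would if it had no record of who refers to what.
    strtab_end = 4 + len(STRTAB)
    return image[:strtab_end] + b"\0" * extra + image[strtab_end:]

image = build_image()
grown = grow_strtab(image, 8)
```

In this toy, follow(image) reaches the target byte, while follow(grown)
lands in the new padding instead. A real object has many such references,
in many encodings, and no list of where they all are.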

Your other point is also true: I said that you can't tell who is
referencing the strings, but clearly you can in fact take the file
apart byte by byte and figure it out. Tedious yes, but not impossible.
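
As a rough sketch of the bookkeeping you describe (assuming you have already
collected every offset that points into the table, which is the hard part),
something like the following Python would do. The function names and the
sample table are mine, made up for illustration, and not taken from any
existing tool.

```python
def string_coverage(strtab, offsets):
    """For each byte of strtab, count how many referenced strings cover it.

    A referenced string runs from its offset through its terminating NUL.
    """
    counts = [0] * len(strtab)
    for off in offsets:
        end = strtab.index(b"\0", off)   # ValueError if the string is unterminated
        for i in range(off, end + 1):
            counts[i] += 1
    return counts

def unused_bytes(strtab, offsets):
    """Return the indexes of bytes that no known reference covers."""
    counts = string_coverage(strtab, offsets)
    return [i for i, c in enumerate(counts) if c == 0]

# Hypothetical table: a leading NUL, two strings, then four spare bytes.
strtab = b"\0libc.so.1\0/opt/lib\0\0\0\0\0"
# Offset 6 points into the tail of "libc.so.1", so those bytes are shared.
offsets = [1, 11, 6]
free = unused_bytes(strtab, offsets)
```

Note that byte 0 shows up as unreferenced here. ELF string tables always
begin with a NUL byte, so a real tool would have to treat it as reserved
rather than free. The result is also only as good as the list of offsets
you feed it.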

I'm sure I don't need to convince you that it is much easier to explicitly
record that information (DT_SUNW_STRPAD) than it is to figure it out from
scratch, and probably more reliable. However, there is also another
consideration: time travel of objects from the future.

More often than you might think, a file built on a newer version of
the system is copied to an older system. Suppose the new system has
a new ELF feature that references a string (a new dynamic entry seems
likely) that the old system doesn't know about. A brute force search
won't work in this case, because the old system won't understand the
new feature, and will miss the string reference. Once you take time
travel into account, finding all the string references by searching
the file may not be possible. So having DT_SUNW_STRPAD turns out to be
not only easier, but also a bit more robust.
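
Here is a small Python sketch of the time travel problem. The dynamic
section is modeled as a plain list of (tag, value) pairs rather than parsed
from a file. DT_NEEDED, DT_SONAME, DT_RPATH and DT_RUNPATH carry their
standard values, but DT_FUTURE_STRING and its value are hypothetical,
invented purely to stand in for a tag that an older system has never
heard of.

```python
DT_NEEDED, DT_SONAME, DT_RPATH, DT_RUNPATH = 1, 14, 15, 29
DT_FUTURE_STRING = 0x6000F000   # hypothetical tag, not a real ELF definition

OLD_STRING_TAGS = {DT_NEEDED, DT_SONAME, DT_RPATH, DT_RUNPATH}
NEW_STRING_TAGS = OLD_STRING_TAGS | {DT_FUTURE_STRING}

def apparently_free(strtab, dynamic, known_string_tags):
    """Bytes that look unreferenced to a scanner knowing only some tags.

    Byte 0 is skipped because it is the reserved leading NUL.
    """
    used = set()
    for tag, value in dynamic:
        if tag not in known_string_tags:
            continue                      # unknown tag: the scanner can't interpret it
        end = strtab.index(b"\0", value)
        used.update(range(value, end + 1))
    return [i for i in range(1, len(strtab)) if i not in used]

strtab = b"\0libc.so.1\0/opt/lib\0newfeature\0"
dynamic = [(DT_NEEDED, 1), (DT_RUNPATH, 11), (DT_FUTURE_STRING, 20)]

seen_by_new = apparently_free(strtab, dynamic, NEW_STRING_TAGS)
seen_by_old = apparently_free(strtab, dynamic, OLD_STRING_TAGS)
```

The newer scanner finds nothing free. The older one concludes that the
bytes holding "newfeature" are up for grabs, and an editor trusting that
answer would overwrite a live string. An explicit count of spare bytes,
recorded by the linker that built the object, doesn't have that problem.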

Now a time traveling object may or may not work when it goes to the
past for other reasons, but often it does work, and the linkers try
hard not to be the cause of such failures.

I'll finish with some quick answers to your closing points:


 > Don't get me wrong; being able to fix run paths at all is much needed.  But
 > I was hoping for either a truly smart ELF editor that could reconstitute an
 > ELF file with all necessary adjustments without needing to reserve space
 > for such purposes (and with the potential flexibility to make other
 > interesting adjustments)

Sure, we'd all like that (but see above). Since ELF doesn't support it,
the fixed extra space idea is a "better than nothing" answer that handles
most cases we've encountered over the years. I don't mean to sell it as more
than that. Simple and useful, but not perfect...

 > or alternatively, something with ld.so.1 and crle such
 > that one could set specific overriding run paths for specific execnames
 > (which would at least centralize all such overrides and avoid the need to
 > hack up binaries

The runpath editing doesn't preclude a crle extension like this, but
there are downsides to that approach too:
        - Every program on the system pays some overhead for having
          ld.so.1 examine a config file. Do you really want all
          the programs that don't need this "fixup" to pay this
          price?
        - Centralized databases can be fragile and difficult to
          manage too (think Windows registry).
Although my bias is to fix this sort of thing at the edges, I think
there's room for more than one answer.

I've also seen proposals to use extended file attributes
to do similar things, and that is another interesting idea.


 > , making their checksums, md5sums, or signatures invalid).

Agreed. This might not be a huge issue though. If an object is signed
(a sophisticated concept), might we not expect its creator to get
the runpath right also?  (I know, I know, but still...  :-)  )

Or, the signing might be done after the runpath change. For example, we
have discussed using this feature to fix up runpaths in some of the
freeware we ship --- doing it in their makefiles can be painful, and
a post processing step to fix them up might be easier and more reliable.
In a case like that, any signing would be applied to the file after it was
edited, so the signatures would be valid.

In many cases, the objects are not signed, or the end user would gladly
invalidate them in exchange for a working runpath. So I guess it
depends on the situation.

Thanks for the great questions...

- Ali
