Platform dependent code placement (was: Re: repo layout again)

Andrey Chernyshev Wed, 15 Feb 2006 13:29:35 -0800

Hi All,

Sorry for my late attempt to resurrect this thread, but I'm not sure
if we've already come to a well-defined picture here:

On 1/4/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >> Some more platform tree names:
> >>
> >>     solaris32.sparc solaris64.sparc
> >>     linux32.sparc linux64.sparc
> >>     darwin32.ppc (Is this correct for the new MAC boxes?)
> >
> >Wouldn't the wordsize be better associated with the processor family?
> >
> >So  solaris.sparc32, windows.x86, linux.ppc32, and so on.
>
> Better yet, keep the three variables independent because the
> _same_ chip can operate in both 32-bit and 64-bit modes.
> It is the operating system that works in one mode or the other.
> Furthermore, a 64-bit OS also allows 32-bit applications to run
> on it in a compatibility mode.  A good example of this is the
> Sun JDK that runs on Solaris.  There is a 32-bit version, which
> is the default, and there is an additional 64-bit module that
> may be added on for running in 64-bit mode.  The same SPARC
> processor is used for both.
>

I think another example showing the benefit of having independent OS /
CPU attributes is the scenario where we have, for instance, a JIT
compiler which produces nearly identical code for different OSes on
IA32 (or whatever CPU) and, for example, a file I/O module which
doesn't care much about the specific CPU, but rather cares about
the specific OS.  In other words, there will likely be code which
can easily be shared between different CPUs, and code which
can be shared between different OSes.

On the other hand, having separate source trees like linux32.sparc,
solaris64.sparc, win.IA32 for each specific platform combination may
lead to huge code duplication. We need to be able to share code
across certain platform combinations, but not across all of them.

To address that issue, I can suggest a pretty straightforward scheme
for platform-dependent code placement which looks as follows:

1. There is a fixed set of attributes which denotes a specific target
configuration. As a starter set, we may have OS (for operating system)
and, say, ARCH (for architecture) attributes. This set can be extended
later, but, as was suggested, let's not cross that bridge until we
come to it.

2. Files in the source tree are selected for compilation based on the
OS or ARCH attribute values which may (or may not) appear in a file or
directory name.
Some examples are:

src\main\native\solaris\foo.cpp
    - the file is applicable to any system running Solaris;

src\main\native\win\foo_ia32.cpp
    - the file is applicable only to Windows / IA32;

src\main\native\foo_ia32_em64t.cpp
    - the file can be compiled for any OS on either the IA32 or EM64T
architecture, but nothing else.

The formal file selection rule may look like:

(1) A file is applicable for the given OS value if its pathname matches the
regexp [\W_]${OS}[\W_], or if its pathname doesn't contain any OS value;

(2) A file is applicable for the given ARCH value if its pathname matches the
regexp [\W_]${ARCH}[\W_], or if its pathname doesn't contain any ARCH value;

(3) A file is selected for compilation if it satisfies both criteria (1)
and (2).
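
To make the rule concrete, here is a minimal C++ sketch of rules (1)-(3).
It is only an illustration, not a proposed implementation: the value sets
(win, linux, solaris / ia32, em64t, sparc) and the function names are
hypothetical, and matching is case-sensitive. The character class
[^A-Za-z0-9] is used as an equivalent spelling of [\W_].

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical value sets; the proposal does not fix the actual names.
static const std::vector<std::string> kOsValues = {"win", "linux", "solaris"};
static const std::vector<std::string> kArchValues = {"ia32", "em64t", "sparc"};

// True if `value` occurs in `path` with a non-alphanumeric character on
// both sides, i.e. the pathname matches [\W_]value[\W_].
// Assumes `value` contains only letters and digits (no regex metacharacters).
static bool containsValue(const std::string& path, const std::string& value) {
    const std::regex re("[^A-Za-z0-9]" + value + "[^A-Za-z0-9]");
    return std::regex_search(path, re);
}

// Rules (1) and (2): the path names the current value, or names no known
// value of that attribute at all.
static bool applicable(const std::string& path, const std::string& current,
                       const std::vector<std::string>& known) {
    if (containsValue(path, current)) return true;
    for (const std::string& v : known) {
        if (containsValue(path, v)) return false;  // tagged for another value
    }
    return true;  // untagged, so shared by all values
}

// Rule (3): the file is compiled only if both attributes accept it.
bool isSelected(const std::string& path, const std::string& os,
                const std::string& arch) {
    return applicable(path, os, kOsValues) &&
           applicable(path, arch, kArchValues);
}
```

With these assumptions, isSelected("src\\main\\native\\win\\foo_ia32.cpp",
"win", "ia32") accepts the file, while the same path is rejected for
linux/ia32 or win/em64t. Note that, as the regexp is written, a value at
the very beginning or end of a pathname has no delimiter on one side and
would not be recognized.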

One can see that this naming convention gives developers enough
freedom to lay out their code in the most convenient way (actually,
experience shows that the meaning of "convenient" may differ
significantly depending on the component type :). On the other hand, it
gives a well-defined (and hopefully intuitive enough) rule showing
whether a particular file is picked up by the compiler or not,
depending on the configuration.

In addition to the above, code could also be selected for
compilation by means of preprocessor directives in C/C++ files (this is
convenient when most of a file is platform-independent, with the
exception of just a few lines of code). The build system could pass
the OS and ARCH attributes as appropriate defines to the C/C++
code.
For example, for the Windows/IA32 config, the following defines could be set:

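     #define OS WIN
     #define WIN 1
     #define ARCH IA32
     #define IA32 1

(The value macros need a nonzero value rather than an empty definition;
otherwise a comparison such as "#if OS == WIN" would not compile.)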

Then the platform-dependent code sections may look like:

#ifdef WIN
….
#endif

which is essentially the same as:

#if OS == WIN
….
#endif

It is important that OS/ARCH (or whatever additional) attribute names
and values are used consistently in the file names and define
directives.
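
As an illustration, a source file consuming such defines might look like
the sketch below. It assumes the build system gives each value macro a
nonzero value (for example -DOS=WIN -DWIN=1 -DARCH=IA32 -DIA32=1), since
an "#if OS == WIN" comparison needs WIN to expand to a number. The
fallback defines and the helper names osName() / archName() are
hypothetical and exist only so that the sketch compiles on its own.

```cpp
#include <string>

// Assumption: a real build passes these on the compiler command line.
// The fallbacks below only make this sketch compile standalone.
#if !defined(OS)
#define WIN 1
#define OS WIN
#endif
#if !defined(ARCH)
#define IA32 1
#define ARCH IA32
#endif

// Hypothetical helper: reports the OS this file was compiled for,
// using the "#if OS == value" form.
std::string osName() {
#if OS == WIN
    return "win";
#elif OS == LINUX
    return "linux";
#else
    return "unknown";
#endif
}

// Hypothetical helper: reports the architecture, using the "#ifdef" form.
std::string archName() {
#ifdef IA32
    return "ia32";
#elif defined(EM64T)
    return "em64t";
#else
    return "unknown";
#endif
}
```

One caveat with the "==" form: the preprocessor treats any undefined
identifier as 0, so if neither side of a comparison is defined, the
condition is 0 == 0 and evaluates to true. The "#ifdef" form does not
have this problem.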

Finally, I'd suggest that the platform-dependent code can be organized
in 3 different ways:

(1) Explicitly, by defining the appropriate file list. For example, an
Ant XML file may choose one fileset or another, depending on
the current OS and ARCH property values. This approach is most
convenient, for example, whenever third-party code is compiled or
the file names cannot be changed for some reason.

(2) Via the file path naming convention. This is the preferred
approach and works well whenever distinct files for different
platforms can be identified.

(3) By means of preprocessor directives. This could be convenient
if only a few lines of code need to vary across platforms. However,
preprocessor directives make the code less readable, hence this
should be used with care.

In terms of the build process, it means that the code has to pass all 3
stages of filtering before it is selected for compilation.

The point is that components in Harmony could be very different,
especially if we take into account that they may belong to both the Class
Libraries and the VM world. Hence, the most efficient (in terms of code
sharing and readability) code placement would require maximum
flexibility, while preserving some well-defined rules. The scheme
based on file dir/name matching seems to be flexible enough.

How does the above proposal sound?

Thank you,
Andrey Chernyshev
Intel Middleware Products Division

> >
> >Maybe in some components we would want to include a window manager
> >family too, though let's cross that bridge...
> >
> >I had a quick hunt round for a recognized standard or convention for OS
> >and CPU family names, but it seems there are enough subtle differences
> >around that we should just define them for ourselves.
> >
>
> My VM's config script maintains CPU type, OS name, and word size as three
> independent values.  These are combined in various ways in the source code
> and support scripts depending on the particular need.  The distribution script
> names the 'tar' files for the binaries with all three as a part of the file name
> as, "...-CPU-OS-WORD.tar" as the tail end of the file name.  (NB:  I am going
> to simplify the distribution scripts shortly into a single script that creates the
> various pieces, binaries, source, and documentation.  This will be out soon.)
>
> Does this help?
>
> Dan Lydick
>
> >Regards,
> >Tim
> >
> >
> >--
> >
> >Tim Ellison ([EMAIL PROTECTED])
> >IBM Java technology centre, UK.
>
>
>
>
> Dan Lydick
>
