A small correction:
> Currently the IA-32 and Intel-64 CG uses calling conventions
> that pass all parameters on stack.
Stack-only argument passing applies to the IA-32 calling convention only.
The Intel 64 CG generates a calling convention that follows the System V ABI
recommendations [1].
This is a variant of fastcall with 6 GP registers and 8 XMM registers
used for argument passing.
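To make the register assignment concrete, here is a rough sketch of the
System V rule for scalar arguments. It is an illustration only, not Jitrino
code: the names ArgClass and assignArgLocations are made up for this example,
and it ignores structs, x87 long double, and varargs, which the ABI handles
with additional rules.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified argument classes: scalar integers/pointers
// versus float/double values.
enum class ArgClass { Integer, Sse };

// Assigns a location to each scalar argument, left to right:
// the first 6 integer arguments go to GP registers, the first 8
// floating-point arguments go to XMM registers, the rest go to the stack.
std::vector<std::string> assignArgLocations(const std::vector<ArgClass>& args) {
    static const char* const gpRegs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    constexpr std::size_t kGpCount = 6;
    constexpr std::size_t kXmmCount = 8;

    std::vector<std::string> locations;
    std::size_t nextGp = 0;
    std::size_t nextXmm = 0;
    std::size_t stackOffset = 0;  // offset within the stack argument area

    for (ArgClass cls : args) {
        if (cls == ArgClass::Integer && nextGp < kGpCount) {
            locations.push_back(gpRegs[nextGp++]);
        } else if (cls == ArgClass::Sse && nextXmm < kXmmCount) {
            locations.push_back("xmm" + std::to_string(nextXmm++));
        } else {
            // Registers of this class are exhausted: use an 8-byte stack slot.
            locations.push_back("stack+" + std::to_string(stackOffset));
            stackOffset += 8;
        }
    }
    return locations;
}
```

For a signature like f(int, double, int) this gives rdi, xmm0, rsi: the two
register files are counted independently, so a floating-point argument does
not consume a GP register slot.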
> Also, all used calling conventions require returning
> FP values in x87 register stack and
Again, only on IA-32. Intel 64 code returns floating-point values in XMM0.
> do not support callee-saved SSE registers while all
> FP arithmetic is done using SSE/SSE2.
That's correct.
No convention currently used supports callee-saved XMMs...
[1] http://refspecs.freestandards.org/elf/x86_64-SysV-psABI.pdf
--
Thanks,
Alex
Rana Dasgupta wrote:
Hi Gregory,
It is a good idea to put up a live list, thanks. Here are some
suggestions on the contents for development items in the VM/JIT. A few may
be almost done. We can fine-tune this and add other work items as well.
JIT Items
======
- GC related: WB support in Jitrino.opt
Implement support of write barriers in the Jitrino.opt compiler. Write
barriers are required to implement a generational GC. Currently WB are
supported in Jitrino.JET only.
- JIT: HLO improvements in Jitrino
A set of performance-oriented improvements:
- Reduce overhead from Back Branch Polling - remove BBP from finite
loops or reduce overhead if loop finiteness is undetermined.
- Implement interface call devirtualization - based either on
heuristics or on the edge profile or on the value profile. The latter
requires implementation of the value profile.
- Array Bounds Check Elimination - fix the current ABCD algorithm and
implement a new range check elimination optimization.
- Loop unrolling - improve loop unrolling and the code produced after this
optimization.
- JIT: New optimizations in Jitrino high-level optimizer.
The Escape Analysis (EA) algorithm prototype is in Jitrino code.
The following new optimizations could use EA information:
- EA-based scalar replacement
- EA-based on-stack allocation
- Improvement of calling conventions.
Currently the IA-32 and Intel-64 CG uses calling conventions that pass all
parameters on stack. Also, all used calling conventions require returning
FP values in x87 register stack and do not support callee-saved SSE
registers while all FP arithmetic is done using SSE/SSE2.
Although aggressive inlining reduces the call overhead, performance can
be improved in the following directions:
- Passing arguments in GP and SSE registers
- SSE-friendly calling conventions.
- Pluggable calling conventions.
Calling convention improvements require changes in all JITs and VM core.
- Branch optimization in IA32/Intel 64 CG.
Analysis of the code generated by Jitrino on IA32 shows that many
unnecessary branches could be eliminated.
- Register allocation improvements and tuning.
Currently there are 2 global register allocators in Jitrino:
bin-packing and color graph. Further improvement of register
allocation could be done in the following directions:
- Profile-guided live-range splitting in the register allocators.
- Tuning and enhancing the color graph RA.
- Implementation of new register allocation schemes.
- IA-64 enabling.
Jitrino.opt contains an IA-64 code generator but the rest of the system
does not support this platform. Also, the Jitrino.opt code generator needs
more platform-specific optimizations and tuning.
- X87 based floating point math.
Currently all FP operations in Jitrino are implemented using SSE/SSE2,
except for the calling conventions, which use x87, and a few other minor
exceptions in Jitrino.opt that are to be fixed.
If a processor does not support SSE/SSE2, the system behavior
is undefined.
- Jitrino front-end re-factoring (BC translator, HLO info, etc.)
Refactor the Java bytecode translator in Jitrino.opt to make the code
clearer and simplify its data structures.
Improve the HLO framework (SSA on-demand, cleanup on-demand, etc.)
- DPGO: Bytecode-based edge profiling
Currently edge profile information in Jitrino.opt is mapped to the
Internal Representation (IR), which makes the profile sensitive to IR
transformations. Mapping the profile (or the IR) to the bytecode will remove
this dependency. The possibility of unifying the IR-bytecode mapping used for
profiling and for JVMTI support should also be considered.
Core VM
=======
- bytecode verifier:
Develop subroutine Verification
- JNI:
JNI Weak References development
Globalization: these are community tasks (Geir) but not important for the
product
- Develop VM locale support
- Unicode support for Java classes/method/field name
- Thread Manager:
Develop a synchronization protocol for the JVM
Implement synchronization and other basic helpers for IPF
Remove "non system locks"
- Stack : Complete overflow handling support for Java and native code
- Tool Interface:
Profiling implementation, in particular Heap iteration, in JIT mode
- Garbage Collection:
The work in the garbage collection area covers new gc_v5 functionality, plus
bug fixing and removal of performance bottlenecks (SPECjbb2005) in GCV4.1.
The work on GC_v5 includes:
Generational GC with WB support
LOS support
Parallelization Support
Debugging and verification framework
Weak Reference and Finalizer Support
( Xiao Feng?)
- MMTk integration:
Support for magic classes in Jitrino
VM/JIT support for MMTk collectors including RSE and thread suspension
( Weldon, could you please add details?)
Thanks
On 10/16/06, Mikhail Loenko <[EMAIL PROTECTED]> wrote:
>
> Once it's more or less ready let's point to that page from TODO on our
> website
>
> Thanks,
> Mikhail
>
> 2006/10/17, Gregory Shimansky <[EMAIL PROTECTED]>:
> > Hello
> >
> > You know that drlvm was donated by Intel and there was some period of
> > time while drlvm was developed internally. We had an internal bugzilla
> > server to track the issues. In an effort to move all development to the
> > open, this internal bugzilla will be closed.
> >
> > The database is quite big and contains a lot of valuable information,
> > including still open bugs. Many of them are not exactly bug reports,
> > but requests for enhancements or limitation problems. These small
> > issues didn't make it to README because they are mostly minor and low
> > priority, but it is better to use the information we already have than
> > to rediscover these problems.
> >
> > So as not to create a lot of garbage in JIRA, like problem requests
> > without patches, I think it is better to create something like a TODO
> > list for drlvm. Not exactly bugs, but more like a known limitations
> > list.
> >
> > To give you an example, vm/vmcore/src/init/properties.cpp contains a
> > #define MAX_PROP_LINE 5120, which is the maximum length of a property
> > specified in the vm.properties file. I am not even sure if this file is
> > still used, probably not, but a buffer length limitation is still a bad
> > thing.
> >
> > I think a good place for such a list would be on the wiki. I am going to
> > create a page for it so that everyone who has open bugs inside of Intel
> > could add a line or two describing a problem recorded in bugzilla. I
> > have 3 like these filed myself, including the one I gave as an example.
> >
> > I don't know the number of such bugs overall, maybe it is not so big.
> > But before creating JIRAs for them I think it is better to collect a
> > list on the wiki. What do you think?
> >
> > --
> > Gregory Shimansky, Intel Middleware Products Division
> >
>