On 08.12.2010 17:50, Martin Dias wrote:
Hi all
Over the last few months Tristan and I have been working on the Fuel
project, a binary object serialization tool. The idea is that objects are
loaded much more often than they are stored, so it is worth spending
extra time while storing in order to get faster loading and a better user
experience. We present an implementation of a pickle format that is based
on clustering similar objects.
There is a summary of the project below, but more complete information
is available here: http://rmod.lille.inria.fr/web/pier/software/Fuel
The implementation still needs a lot of work to be really useful, and
optimizations remain to be done, but we would be glad to get feedback from
the community.
= Pickle format =
The pickle format and the main idea of the serialization algorithm are
explained in these slides:
http://www.slideshare.net/tinchodias/fuel-serialization-in-an-example
= Current features =
- Class shape changing (when a variable has been added, or removed, or
its index changed)
- Serialize most of the basic objects.
- Serialize (almost) any CompiledMethod
- Detection of global or class variables
- Support for cyclic object graphs
- Tests
= Next steps =
- Improve version checking.
- Optimize performance.
- Serialize more kinds of objects:
-- Class with its complete description.
-- Method contexts
-- Active block closures
-- Continuation
- Some improvements for the user:
-- pre and post actions to be executed.
-- easily say 'this object is a singleton'.
- Partial loading of a stored graph.
- Fast statistics/brief info extraction of a stored graph.
- ConfigurationOfFuel.
- Be able to deploy materialization behavior only (independent from
the serialization behavior)
= Download =
In a Pharo 1.1 or 1.1.1 image, evaluate:
Gofer new
    squeaksource: 'Fuel';
    version: 'Fuel-MartinDias.74';
    version: 'FuelBenchmarks-MartinDias.4';
    load.
= Benchmarks =
You can run the benchmarks by evaluating this line (results appear in the Transcript):
FLBenchmarks newBasic run.
Thank you!
Martin Dias
One thing I do not see mentioned, and that I feel could use some attention,
is thread safety (i.e. other threads altering the graph while you are
serializing it).
The classic answer would of course be "always run Fuel at highest
priority", but if we ever want to move to true multi-core, that's not
enough.
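For reference, the "highest priority" approach could look roughly like the
sketch below. It uses only the standard process scheduling calls; the comment
inside the block is a placeholder, since the actual Fuel serialization call is
not shown in this thread, and the names done and finished are made up for the
example.

```smalltalk
"Sketch only: the placeholder comment stands in for the real Fuel call."
| done finished |
done := Semaphore new.
finished := false.
[
	"... analyze and serialize the graph here ..."
	finished := true.
	done signal ] forkAt: Processor highestPriority.
done wait.	"continue only after the high-priority process has finished"
```

This keeps other Smalltalk processes from being scheduled during the run, but
it relies on the scheduler running one process at a time.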
What would be neat is protecting all mutation of objects in the graph
with a Mutex/Monitor whose critical section covers the analysis and
serialization, i.e. blocking all other processes that want to mutate
objects in the graph until serialization is complete.
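As a rough illustration of the locking idea, here is a minimal sketch in
which every writer and the serializer agree to go through one shared Mutex.
That convention is an assumption of the example (nothing enforces it), and
graphLock, a, b and snapshot are hypothetical workspace variables; two
one-slot Arrays stand in for the objects of a real graph, and copying the
slots stands in for the actual serialization.

```smalltalk
"Sketch only: a cooperative lock shared by writers and the serializer."
| graphLock a b snapshot |
graphLock := Mutex new.
a := Array new: 1.	"slot 1 of a plays the role of 'A child'"
b := Array new: 1.	"slot 1 of b plays the role of 'B parent'"
a at: 1 put: b.
b at: 1 put: a.

"Writer: both halves of the update happen inside one critical section."
[ graphLock critical: [
	b at: 1 put: nil.
	a at: 1 put: nil ] ] fork.

"Serializer: analysis and serialization sit in the same critical section,
so they see the graph either before or after the update, never in between."
graphLock critical: [
	snapshot := Array with: (a at: 1) with: (b at: 1) ].
```

The obvious weakness is that any code mutating the graph without taking the
lock bypasses the protection entirely.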
Aside from the behaviour when a marked object is encountered, the
process is the same as for immutability, as discussed here:
http://forum.world.st/immutability-and-become-Was-Re-squeak-dev-immutability-td1597511.html
You could do this image-side as part of the analyze phase, as Eliot's
post suggests, but it's not entirely safe when:
    A child -> B
    B parent -> A

1. Process1 protects A with the Mutex.
2. Process2 calls a method on B which does:
   a) B parent: C
   b) (B's tmpRef to A) child: somethingElse   *waits for the Mutex*
3. Process1 makes B and C immutable, and serializes B with C as parent.
4. Process2 changes A child.
It could of course be argued that it is a programming/logic error to update
B's parent ref before the child ref in A (you'd actually have to do extra
work to get that order). Still, it's a hard one to debug if it does happen.
(Note to the thread: Following the same logic, I would also say it
should be considered an error to keep a mutable cache as part of an
immutable object :) )
In fact, you *could* implement it with immutability, injecting handler
contexts with behaviour as described above into existing/new processes
for any resulting immutability errors.
Cheers,
Henry