On 08.12.2010 17:50, Martin Dias wrote:
Hi all

Over the last few months, Tristan and I have been working on the Fuel
project, a binary object serialization tool. The idea is that objects
are loaded much more often than they are stored, so it is worth
spending time while storing in order to get faster loading and a
better user experience. We present an implementation of a pickle
format that is based on clustering similar objects.

There is a summary of the project below, but more complete information
is available here: http://rmod.lille.inria.fr/web/pier/software/Fuel

The implementation still needs a lot of work to be really useful, and
optimizations remain to be done, but we'll be glad to get feedback from
the community.


= Pickle format =

The pickle format and the main idea of the serialization algorithm are
explained in these slides:

http://www.slideshare.net/tinchodias/fuel-serialization-in-an-example


= Current features =

- Class shape changing (when a variable has been added, or removed, or
its index changed)
- Serialize most of the basic objects.
- Serialize (almost) any CompiledMethod
- Detection of global or class variables
- Support for cyclic object graphs
- Tests


= Next steps =

- Improve version checking.
- Optimize performance.
- Serialize more kinds of objects:
-- Class with its complete description.
-- Method contexts
-- Active block closures
-- Continuation
- Some improvements for the user:
-- pre and post actions to be executed.
-- easily say 'this object is singleton'.
- Partial loading of a stored graph.
- Fast statistics/brief info extraction of a stored graph.
- ConfigurationOfFuel.
- Be able to deploy materialization behavior only (independent from
the serialization behavior)


= Download =

In a Pharo 1.1 or 1.1.1 evaluate:

Gofer new
        squeaksource: 'Fuel';
        version: 'Fuel-MartinDias.74';
        version: 'FuelBenchmarks-MartinDias.4';
        load.


= Benchmarks =

You can run benchmarks executing this line (results in Transcript):

FLBenchmarks newBasic run.


Thank you!
Martin Dias

One thing I do not see mentioned, and which I feel could use some attention, is thread safety (i.e. other threads altering the graph while you are serializing it). The classic answer would of course be "always run Fuel at the highest priority", but if we ever want to move to true multi-core, that's not enough.

What would be neat is protecting all mutation of objects in the graph with a Mutex/Monitor whose critical section covers the analysis and serialization, i.e. blocking all other processes that want to mutate objects in the graph until serialization is complete.
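
To make that concrete, here is a minimal workspace sketch. It assumes a single Mutex shared by the serializer and every mutator; graphLock is a hypothetical name, and "graph copy" is only a stand-in for the analysis and serialization steps, not Fuel API.

```smalltalk
"Sketch only: graphLock and the stand-in 'serialization' are hypothetical, not Fuel API."
| graphLock graph snapshot writer |
graphLock := Mutex new.
graph := OrderedCollection withAll: #(1 2 3).

"Every mutator of the graph has to go through the same lock."
writer := [ graphLock critical: [ graph add: 4 ] ].

"The serializing process holds the lock across analysis and serialization."
graphLock critical: [
    writer fork.                "forked while the lock is held, so it blocks in critical:"
    Processor yield.            "give the writer a chance to run into the lock"
    snapshot := graph copy ].   "stand-in for analyzing and serializing the graph"

"snapshot holds 1 2 3; the writer can add 4 only after the critical section ends."
```

The obvious cost is that this only works if every mutator cooperates; any code that writes to the graph without taking graphLock is not blocked at all.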

Aside from the behaviour when a marked object is encountered, the process is the same as for immutability, as discussed here:
http://forum.world.st/immutability-and-become-Was-Re-squeak-dev-immutability-td1597511.html

You could do this image-side as part of the analyze phase, as Eliot's post suggests, but it's not entirely safe when:
A child -> B
B parent -> A

1. Process1 protects A with Mutex
2. Process2 calls a method on B which does:
   a) B parent: C
   b) (B's tmpRef to A) child: somethingElse *Wait for Mutex*
3. Process1 makes B, C immutable, serializes B with C as parent.
4. Process2 changes A child.

It could of course be argued that it is a programming/logic error to update B's parent ref before the child ref in A (you'd actually have to do extra work to get that order); still, it's a hard one to debug if it does happen. (Note to the thread: following the same logic, I would also say it should be considered an error to keep a mutable cache as part of an immutable object :) )
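
One way to sidestep the interleaving above, sketched under the assumption of one graph-wide lock instead of per-object protection: do the whole re-parenting as a single critical section, so a serializer holding the same lock sees B either before or after the change, never halfway. The dictionaries standing in for A, B and C and the name graphLock are made up for illustration.

```smalltalk
"Sketch only: dictionaries stand in for A, B and C; graphLock is a hypothetical shared Mutex."
| graphLock a b c |
graphLock := Mutex new.
a := Dictionary new.
b := Dictionary new.
c := Dictionary new.
a at: #child put: b.
b at: #parent put: a.

"Re-parent B from A to C as one unit. A serializer that takes graphLock
 for its analysis and serialization cannot run between these writes."
graphLock critical: [
    | oldParent |
    oldParent := b at: #parent.
    b at: #parent put: c.
    c at: #child put: b.
    oldParent at: #child put: nil ].
```

This trades concurrency for simplicity: mutators of unrelated parts of the graph wait on each other too.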

In fact, you *could* implement it with immutability, injecting handler contexts with behaviour as described above into existing/new processes for any resulting immutability errors.

Cheers,
Henry




