Progress on Feature

Martin Desruisseaux Tue, 20 May 2014 15:38:23 -0700

Hello all

A note on the implementation of the org.apache.sis.feature package: the
current proposal on the JDK8 branch splits the Feature class into two
implementations, DenseFeature and SparseFeature. Their code is not as
straightforward as it could be. The reason is that SIS applications
may have a lot of Feature instances (millions are possible). While we
will try to implement data sources in a way that avoids long-lived
Feature instances, there is always a risk that some data sources will
try to load millions of features in memory.


Apache SIS feature implementations use lazy instantiation: Attribute and
Association instances are created only if explicitly requested.
Otherwise we store only the attribute *values*, without wrapping them
in 'DefaultAttribute' objects as long as the latter are not requested.
The intent is to reduce the number of Java objects created in the
common case where we do not need them.
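
To illustrate the lazy instantiation described above, here is a minimal
sketch. It is not the actual SIS code: 'AttributeSketch',
'LazyFeatureSketch' and all method names are hypothetical, and the
'created' counter exists only to make the effect visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an attribute wrapper; not the SIS DefaultAttribute class. */
final class AttributeSketch {
    static int created = 0;   // counts wrapper instantiations, for illustration only
    final String name;
    Object value;

    AttributeSketch(String name, Object value) {
        this.name = name;
        this.value = value;
        created++;
    }
}

/** Hypothetical feature that stores raw values and wraps them only on request. */
final class LazyFeatureSketch {
    // Each entry holds either a raw value, or the AttributeSketch wrapping that value.
    // (A real implementation would need to handle values that are themselves attributes.)
    private final Map<String, Object> properties = new HashMap<>();

    /** Stores the value; reuses the wrapper if one has already been created. */
    void setPropertyValue(String name, Object value) {
        Object current = properties.get(name);
        if (current instanceof AttributeSketch) {
            ((AttributeSketch) current).value = value;
        } else {
            properties.put(name, value);
        }
    }

    /** Returns the value without creating any wrapper object. */
    Object getPropertyValue(String name) {
        Object current = properties.get(name);
        return (current instanceof AttributeSketch) ? ((AttributeSketch) current).value : current;
    }

    /** Creates the wrapper on first request, then keeps it in place of the raw value. */
    AttributeSketch getProperty(String name) {
        Object current = properties.get(name);
        if (current instanceof AttributeSketch) {
            return (AttributeSketch) current;
        }
        AttributeSketch wrapper = new AttributeSketch(name, current);
        properties.put(name, wrapper);
        return wrapper;
    }
}
```

In this sketch, callers that only use setPropertyValue(...) and
getPropertyValue(...) never cause an AttributeSketch to be allocated;
the wrapper appears only once getProperty(...) is called.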

'SparseFeature' stores the properties in a java.util.HashMap, while
'DenseFeature' stores the properties in a plain Java array. Otherwise
those two classes should behave in the same way. Using a HashMap (in
'SparseFeature') is more memory efficient when we have a large set of
possible properties, and we know that only a few of them will be
assigned a value. Conversely, using a plain Java array (in
'DenseFeature') is more memory efficient when we know that almost all
properties will have a value.
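
The two storage strategies can be sketched as follows. This is only an
illustration under assumptions of mine, not the code on the branch: the
class names, the shared name-to-index table and the choice of map keys
are all hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical shared type: assigns a fixed index to each property name. */
final class FeatureTypeSketch {
    final Map<String, Integer> indices = new LinkedHashMap<>();

    FeatureTypeSketch(String... names) {
        for (String name : names) {
            indices.put(name, indices.size());
        }
    }

    int indexOf(String name) {
        Integer index = indices.get(name);
        if (index == null) {
            throw new IllegalArgumentException("No property named " + name);
        }
        return index;
    }
}

/** Array storage: one slot per declared property, whether it has a value or not. */
final class DenseSketch {
    private final FeatureTypeSketch type;
    private final Object[] values;

    DenseSketch(FeatureTypeSketch type) {
        this.type = type;
        this.values = new Object[type.indices.size()];
    }

    void set(String name, Object value) { values[type.indexOf(name)] = value; }
    Object get(String name)             { return values[type.indexOf(name)]; }
}

/** Map storage: an entry exists only for properties that were assigned a value. */
final class SparseSketch {
    private final FeatureTypeSketch type;
    private final Map<String, Object> values = new HashMap<>();

    SparseSketch(FeatureTypeSketch type) {
        this.type = type;
    }

    void set(String name, Object value) {
        type.indexOf(name);            // validates the name against the type
        values.put(name, value);
    }

    Object get(String name) {
        type.indexOf(name);            // validates the name against the type
        return values.get(name);
    }

    int storedCount() { return values.size(); }
}
```

Each HashMap entry costs more memory than an array slot, so the map
pays off only when most of the declared properties stay unassigned;
when nearly all of them have a value, the array is the smaller
representation.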

The above is expected to be transparent to the user - Apache SIS should
be able to choose automatically between DenseFeature and SparseFeature
implementations. I mention these implementation details only in the
hope of clarifying their intent.

This special care about memory consumption applies only to Feature
implementations. Most other classes in the org.apache.sis.feature
package do not need such attention, because they are not expected to be
as numerous.

    Martin
