While working on ATLAS-683, I had to understand the purpose of the class named
'FieldMapping' in the typesystem project. Owing to the lack of javadoc, I had
to trace its usages to figure out why it exists. In case anyone wants to know,
here is my understanding. (I will try to explain with simple code. Note that
the words Attribute and Field are used interchangeably.)

Let's restrict ourselves to types that can only hold attributes of primitive
types (no references). In that case, the ClassType would look like...

class ClassType {

        List<Attribute> attributes;
}

class Attribute {

        String name;
        String dataType; // primitive
}

An instance of ClassType can be described like...

class Instance {

        // 'key' is the attribute name and 'value' is its primitive value,
        // such as Boolean, Integer, Long, String etc.
        Map<String, Object> values;

        Object getValue(String attrName) {
                return values.get(attrName);
        }
}

Instead of storing all values in a single Map, one can partition the values
in buckets based on the 'type' of the value as follows...

class Instance {

        Boolean[] booleanValues;
        String[] stringValues;
        Integer[] intValues;
        ...
}

If one persists the values in the partitioned buckets above, how does one
query the value of an attribute? One would need to know the 'type' of the
attribute as well as the 'position' of the attribute within its bucket.

class Instance {

        Boolean[] booleanValues;
        String[] stringValues;
        ..

        Object getBooleanValue(String attrName) {

                // get it from somewhere; where? see below
                int valuePosition = ...
                return booleanValues[valuePosition];
        }

}

In the above code, the position of the boolean attribute can be captured in
a class like
FieldMapping as follows...

class FieldMapping {

        // 'key' is the attribute name and 'value' is the position of
        // the attribute.
        Map<String, Integer> fieldPos;
}
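
To make the lookup concrete, here is a minimal sketch of how an instance
could resolve a boolean attribute through such a mapping. The names
SimpleFieldMapping, BooleanInstance and BooleanLookupDemo, as well as the
constructor and the setter, are made up for this mail; this is not the
actual typesystem code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for FieldMapping: it only tracks positions.
class SimpleFieldMapping {
    // attribute name -> index into the bucket that holds values of its type
    final Map<String, Integer> fieldPos = new HashMap<>();
}

// Hypothetical instance that only knows about boolean attributes.
class BooleanInstance {
    private final SimpleFieldMapping mapping;
    private final Boolean[] booleanValues;

    BooleanInstance(SimpleFieldMapping mapping, int numBools) {
        this.mapping = mapping;
        this.booleanValues = new Boolean[numBools];
    }

    void setBooleanValue(String attrName, Boolean value) {
        booleanValues[position(attrName)] = value;
    }

    Boolean getBooleanValue(String attrName) {
        return booleanValues[position(attrName)];
    }

    // Resolve the attribute name to its slot; fail loudly for unknown names.
    private int position(String attrName) {
        Integer pos = mapping.fieldPos.get(attrName);
        if (pos == null) {
            throw new IllegalArgumentException("Unknown attribute: " + attrName);
        }
        return pos;
    }
}

class BooleanLookupDemo {
    public static void main(String[] args) {
        SimpleFieldMapping mapping = new SimpleFieldMapping();
        mapping.fieldPos.put("isActive", 0);
        mapping.fieldPos.put("isDeleted", 1);

        BooleanInstance instance = new BooleanInstance(mapping, 2);
        instance.setBooleanValue("isDeleted", Boolean.TRUE);

        System.out.println(instance.getBooleanValue("isDeleted"));
        System.out.println(instance.getBooleanValue("isActive")); // never set
    }
}
```

The lookup is one map access plus one array access, and an attribute that
was never set simply reads back as null.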

In the Instance class above, the array 'booleanValues' needs to be allocated
with a fixed size equal to the number of boolean-valued attributes in the
ClassType. So we extend FieldMapping a little further to precompute the total
number of values of each type.

class FieldMapping {

        Map<String, Integer> fieldPos;

        int numBools;   // boolean attributes count
        int numStrings; // string attributes count
        ...
}
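
As a rough sketch of how the positions and counts could be precomputed from
the attribute list, and then used to size the buckets, consider the code
below. All names here (Attr, TypeFieldMapping, BucketedInstance, the build()
method and the "boolean"/"string"/"int" type strings) are assumptions made
for illustration and do not reflect the real typesystem API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical attribute description; dataType is a plain string in this sketch.
class Attr {
    final String name;
    final String dataType; // "boolean", "string" or "int"

    Attr(String name, String dataType) {
        this.name = name;
        this.dataType = dataType;
    }
}

// Hypothetical mapping computed once per type and shared by all its instances.
class TypeFieldMapping {
    final Map<String, Integer> fieldPos = new HashMap<>();
    int numBools;
    int numStrings;
    int numInts;

    // Walk the attributes once; each one takes the next free slot in the
    // bucket for its type, so the running count doubles as the position.
    static TypeFieldMapping build(List<Attr> attributes) {
        TypeFieldMapping m = new TypeFieldMapping();
        for (Attr a : attributes) {
            switch (a.dataType) {
                case "boolean": m.fieldPos.put(a.name, m.numBools++); break;
                case "string":  m.fieldPos.put(a.name, m.numStrings++); break;
                case "int":     m.fieldPos.put(a.name, m.numInts++); break;
                default:
                    throw new IllegalArgumentException("Unsupported type: " + a.dataType);
            }
        }
        return m;
    }
}

// Every instance is pre-sized from the counts, whether or not values are set.
class BucketedInstance {
    final Boolean[] booleanValues;
    final String[] stringValues;
    final Integer[] intValues;

    BucketedInstance(TypeFieldMapping m) {
        booleanValues = new Boolean[m.numBools];
        stringValues = new String[m.numStrings];
        intValues = new Integer[m.numInts];
    }
}

class FieldMappingDemo {
    public static void main(String[] args) {
        TypeFieldMapping m = TypeFieldMapping.build(Arrays.asList(
                new Attr("name", "string"),
                new Attr("owner", "string"),
                new Attr("isActive", "boolean"),
                new Attr("count", "int")));

        BucketedInstance instance = new BucketedInstance(m);
        instance.stringValues[m.fieldPos.get("owner")] = "venkat";

        System.out.println(m.fieldPos.get("owner"));
        System.out.println(instance.stringValues.length);
    }
}
```

Note that positions are only unique within a bucket: in this sketch "name",
"isActive" and "count" all sit at position 0 of their respective arrays,
which is why the attribute's type is needed alongside its position.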

Finally, when one thinks about the intent behind storing values in buckets,
it is most likely for performance reasons. Performance would probably matter
when there are many attributes in a given ClassType and the value lookup on
the instance needs to be fast.

The downside is that each Instance holds fixed-size buckets based on the Type
information, which is "constant". So even if only a single value is set in an
instance, the Instance is pre-sized large enough to hold all the values. I
think the current design works this way. In the case of a 'query' where only
a subset of columns/fields is requested (which is probably the most frequent
use case), one would only need to populate the Instance with those values,
yet the Instance would still be sized for all of them.

Thanks
Venkat
