On 31/10/11 19:17, Samuel Lampa wrote:
> On 10/31/2011 07:55 PM, Markus Krötzsch wrote:
>> On 31/10/11 18:13, Samuel Lampa wrote:
>>> On 10/31/2011 06:52 PM, Samuel Lampa wrote:
>>>> === Q2: Status of SMWData/SMWDataItem as API? ===
>>>>
>>>> Also I wondered what status the SMWData/SMWDataItem classes are supposed
>>>> to have, as a general API? ... Are they the supposed API, or is SMW
>>>> going towards preferring to talk SPARQL with all extensions ... or even
>>>> SMWExpElements?
>>>>
>>>> I ask this since it does not seem clear that I will really *need* to use
>>>> the SMWData/SMWDataItem combo as a representation, if I do the wiki page
>>>> updates either with the Wiki Object Model extension or an own writer
>>>> class.
>>>>
>>>> I would still prefer to use it, if it is pushed as a preferred API for
>>>> these kinds of things, but I wondered whether that is so for the
>>>> foreseeable future?
>>>
>>>
>>> The thing that makes me wonder, is since we're basically talking about
>>> two slightly different (though very much overlapping) representations:
>>> RDF (as represented by SMWExpElement rel. classes), and Semantic
>>> MediaWiki facts (as repr. by SMWData/SMWDataItem).
>>>
>>> My problem, in the context of RDFIO, is that it seems I actually need
>>> both of these to capture the information from both worlds ... since:
>>>
>>> a. I need to store the URI:s, which only SMWExpElement classes do
>>> b. I need to store the wiki page titles that I choose to use (as part of
>>> RDFIO:s algorithm), which only the SMWData/SMWDataItem combo does.
>>>
>>> ... thus it seems there are at least two options:
>>>
>>> 1. RDFIO creates an own more general data container, which wraps both
>>> the SMWData/SMWDataItem one, and the RDF one (possibly both the
>>> SMWExpElement one, and ARC2:s data structures), with in-built converters
>>> between all of these,
>>>
>>> 2. SMWData/SMWDataItem classes are updated to contain the "Original
>>> URI", and then this format will be the only needed one, in addition to
>>> possibly the ARC2 format, just for making use of its parsers.
>>>
>>>
>>> Number one is the one I've been pondering so far ... I just wanted to
>>> point out this now and ask whether there would be any interest in
>>> storing also the original URI directly in the SMWData/SMWDataItem
>>> classes ... (which would not need to be required, for data that has no
>>> counterpart in the outside world, though ... or maybe can just be
>>> prefilled with the URIResolver URI:s ... this maybe on-the-fly, in a
>>> getter method)?
>>>
>>> ... it seems that would make the SMWData/SMWDI combo more general, and
>>> of course would make RDFIO add a lot less overhead :")
>>>
>>> (I know we discussed this on SMWCon already, but these things weren't
>>> really that clear to me then, about the partial but not complete
>>> overlap between RDF and SMW data representations ... so wanted to point
>>> it out ... )
>>
>> I suggest to go for (1) if you need the full information in one object.
>> You should think of SMW data items as small and simple "values", similar
>> to an integer or a char in a programming language. They should be used
>> like constants of datatypes. They should only be used for storing data,
>> not for converting data or for augmenting it. They are pure data and
>> know nothing about HTML, wikitext or RDF. [Exception: the SMWDIContainer
>> type is a placeholder for compound data; it is not really considered as
>> an atomic value in SMW but just used for transporting compound data in
>> the API]
>>
>> With this view in mind, making an object that holds a URI and a dataitem
>> does not seem a bad idea (like making an object that holds an integer
>> and a string).
>>
>> Alternatively, you could of course represent URIs in an SMW data item as
>> well and relate them to a wiki page with a property, stored together in an
>> SMWSemanticData.
>
>
> Ok, many thanks for the feedback!
>
> The suggestions sound reasonable - keeping in line with the modelling
> approach already taken.
>
> The only little caution I'd like to make is that the decision to keep
> data objects atomic makes them follow the Anemic Model antipattern [1] a
> bit. But that is of course a question about model design approach
> overall, and not this specific case only - that is, whether one wants to
> follow Domain Driven Design patterns [2] or not.

Reading [1], I think there is a misunderstanding in the way you seem to 
apply this text to SMW (probably due to my ill-chosen examples of 
property and wiki page out of all dataitems). The text states that 
domain specific behaviour of domain objects should be implemented in the 
classes that represent the objects. This is what we do. Our domain 
objects are strings, numbers, geographic coordinates. This is the very 
data that we want to manage in SMW, it just happens to be rather atomic, 
simple and (application) domain independent. Note that we do not 
artificially try to abstract or simplify the objects to get this 
representation -- these simple concepts are really the kinds of things 
that SMW users deal with.

Yet we include all related code in the objects whenever such code is 
needed. For example, you can have a look at SMWDITime to see a lot of 
calendar/date specific code. We could also have similar methods for 
strings (e.g., substring computation) and for numbers (e.g., for 
rounding) but this was not necessary so far. Our data items do not 
include parsing/rendering functions that are specific to syntactic 
formats like HTML, wikitext, JSON, RDF, SQL, ... which I think is good 
(and established) design (you don't mix all parsing/serialisation code 
into one class).
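
As a rough illustration of this split (a Python sketch only, not SMW's actual PHP code; TimeItem, to_html and to_rdf_literal are hypothetical names invented for the example), domain behaviour such as calendar logic stays on the value object, while format-specific rendering lives in separate functions:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeItem:
    """Hypothetical stand-in for a data item; NOT SMW's SMWDITime."""
    year: int
    month: int
    day: int

    def is_leap_year(self) -> bool:
        # Domain-specific (calendar) behaviour belongs to the data item.
        return self.year % 4 == 0 and (self.year % 100 != 0 or self.year % 400 == 0)


def to_html(item: TimeItem) -> str:
    # HTML rendering is format-specific, so it lives outside the data item.
    return f"<time>{item.year:04d}-{item.month:02d}-{item.day:02d}</time>"


def to_rdf_literal(item: TimeItem) -> str:
    # RDF serialisation is likewise kept separate from the data item.
    return f'"{item.year:04d}-{item.month:02d}-{item.day:02d}"^^xsd:date'
```

The data item here knows about leap years but nothing about HTML or RDF; either output function could be removed or replaced without touching the class.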

The big fallacy of [1] is to suggest that "object code" must always be 
much larger than "application/service code". If taken too seriously, this 
could lead to a design that tries to merge all functionality into a few 
objects, thus contradicting the fundamental programming paradigm of 
separation of concerns. For example, SMW used to have HTML rendering and 
RDF serialisation methods for data in a single class, in spite of the 
fact that these functions are not at all related but merely work on the 
same input data.

This earlier design of SMW has also undermined another important idea of 
OO design: the definition of clear interfaces with limited visibility. 
The code for parsing, rendering, representation and serialisation used 
to have full access to all internal fields of the objects. Before the 
introduction of data items, it was quite unclear for some objects where 
the data was actually stored (there were multiple redundant/overlapping 
internal representations, sometimes optional, to reflect the internal 
state of the object; all code would directly read/write to any of the 
members).

A third main reason for keeping single objects small is that SMW is 
meant to be extensible. If each new storage backend or display format 
relied on adding code to domain object classes, it would be very 
hard to extend the system.
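
The extensibility point can be sketched as follows (again an illustrative Python sketch under assumptions, not how SMW registers its formats; NumberItem, SERIALIZERS, register and serialize are hypothetical names). New output formats are added by registering a function, and the data class itself is never edited:

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class NumberItem:
    """Hypothetical small data item holding a single number."""
    value: float


# Registry mapping a format name to a function that serialises a NumberItem.
SERIALIZERS: Dict[str, Callable[[NumberItem], str]] = {}


def register(fmt: str):
    """Decorator that registers a serialiser under the given format name."""
    def decorator(func: Callable[[NumberItem], str]):
        SERIALIZERS[fmt] = func
        return func
    return decorator


@register("html")
def number_to_html(item: NumberItem) -> str:
    return f"<span>{item.value}</span>"


@register("json")
def number_to_json(item: NumberItem) -> str:
    # A later extension could add this without changing NumberItem.
    return json.dumps({"value": item.value})


def serialize(item: NumberItem, fmt: str) -> str:
    # Look up the serialiser for the requested format and apply it.
    return SERIALIZERS[fmt](item)
```

Under this arrangement an extension that wants, say, a new export format only adds another registered function.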

Overall, I still think that SMW follows most of the guidelines of 
Domain-Driven Design but for a domain (data management) that is very 
different from what the author of [1] had in mind. Another special 
observation about SMW is that most of our "business logic" is related to 
parsing and serialisation -- tasks that should normally be separated 
from the data that they work on. But maybe one has to take a step back 
and ask what the "domain layer" and "application layer" in SMW really 
are to compare it to the DDD idea. :-)

Best regards,

Markus

>
> ... so for the moment I'm happy to follow the existing model design
> approach :)
>
> // Samuel
>
>
> [1] http://martinfowler.com/bliki/AnemicDomainModel.html
> [2] http://en.wikipedia.org/wiki/Domain-driven_design
>
>
>


_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
