Re: Integrating Disparate Information Systems

2010-11-09 Thread Alan Ruttenberg
On Tue, Nov 9, 2010 at 11:39 AM, Kingsley Idehen wrote:
> On 11/9/10 10:23 AM, John F. Sowa wrote:
>
> John,
>
> Great response.  I am cc'ing in the LOD mailing list, as your comments are
> pertinent to systems integration and the need to separate Logic from
> Syntax, etc.
>
> Others: I encourage you to read on, and digest.

I have read it, and while it mentions a number of historical points
that might be of interest to younger folk, I also find that it clouds
a number of issues. Comments inline.

>
>> On 11/9/2010 1:24 AM, Alex Shkotin wrote:
>>>
>>> What do we need for our information systems to communicate properly?
>>> Integration? Alignment? Unification? Information system education?
>>
>> The first point I'd emphasize is that IT systems have been successfully
>> communicating for over a century.  Originally by punched cards, then
>> by paper tape, magnetic tape, direct connection, and telephone.

For a very limited set of pairs of systems. The movement now is to
make it much more likely that any pair of systems can communicate
meaningfully. That is new. So I don't see the point being made by
this statement.

>> When Arpanet was started in 1969, there had been a long history
>> of experience in data communication.  And the latest conventions
>> for the WWW are still based on extensions to those protocols.
>>
>> Those physical formats and layouts are very important for the
>> technology.  And they will remain buried in systems for ages
>> upon ages.
>>
>> But you never, ever want those formats to have the slightest
>> influence on the semantics.

Where do you see the influence of format on semantics being an issue
here? Any language is going to need an encoding. Here we are trying to
arrange things so that, at a minimum, there is at least one syntax
that any communicator can handle - a common denominator. How many of
the historical (and current) systems failed to communicate because of
stupid differences in syntax - bit ordering, choice of delimiters, and
other arbitrary choices? We need, effectively, at least one arbitrary
choice that we all agree to work with.
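
For example, here is the same statement written in two RDF
serializations (the example.org names are just illustrative). The
differences between them are exactly the kind of arbitrary surface
choices that a shared standard settles once and for all:

    # N-Triples
    <http://example.org/Cat> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example.org/Animal> .

    # Turtle
    @prefix ex: <http://example.org/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    ex:Cat rdfs:subClassOf ex:Animal .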

But OWL, at least, has a straightforward translation to and from the
portion of logic it is capable of representing. That portion is not
constrained by the syntax, but by issues you discuss below.
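
For instance, the usual first-order reading of an OWL subclass axiom
(a minimal illustration, reusing the hypothetical example.org names
from above):

    OWL (functional syntax):  SubClassOf( ex:Cat ex:Animal )
    FOL reading:              ∀x ( Cat(x) → Animal(x) )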

>> The decision to force OWL into the
>> same straitjacket as RDF was hopelessly misguided.

I see only minor inconveniences.


>> In fact, even
>> the decision to force decidability down the throats of every
>> ontologist was another profoundly misguided technology-driven
>> decision.  (Note the subtle semantic distinction between profound
>> and merely hopeless.)

There was no global decision to do anything of the sort. There was an
effort to create some standard. When a standard is created, people who
make decisions get the people that work for them to work within that
standard, in the interest of interoperability. So there were thousands
of such decisions.

There are other standards. In my view it is interesting to analyze why
they are not as successful. Suggesting that this is due to some
conspiracy or the choice of a few doesn't give me confidence that a
deep analysis has been undertaken.

>>> What kind of language and dictionary do we need to write a question? SPARQL?
>>> What kind of language and dictionary do we need to write an answer? XML, CSV?
>>
>> Use whatever notation is appropriate for your application.

Here we agree.

>> But you must design the overall system in such a way that the choice for one
>> application is *invisible* to anybody who is designing or using some
>> other application.

The overall system? I really don't understand what you are referring to.
There is a standard syntax. Anyone can now write a tool that takes
their favorite syntax and translates it into some other syntax for
which a translator to RDF/XML has already been written. We are in a
culture of open source. Over time there will be enough translators
that, for all intents and purposes, there will be no reason why what
you suggest is not feasible. But are you suggesting this could or
should have happened from the outset? Standardization that serves all
needs?
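
Such a translator can be a few lines of code. A minimal sketch using
the Python rdflib library (the data is illustrative, and rdflib is
assumed to be installed):

    from rdflib import Graph

    turtle_data = """
    @prefix ex: <http://example.org/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    ex:Cat rdfs:subClassOf ex:Animal .
    """

    g = Graph()
    g.parse(data=turtle_data, format="turtle")  # read one concrete syntax
    print(g.serialize(format="xml"))            # emit the same graph as RDF/XML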


>> Of course, there may be some cases where real-time constraints make it
>> necessary to avoid a conversion routine between two systems.  But that
>> is a very low-level optimization that should never affect the semantics.
>> For example, when was the last time that you thought about the packet
>> transmissions for your applications?  Some system programmers worry
>> about those things a lot.  But they're invisible at the semantic level.

As is the case for our current stack.

>>> Where is your SPARQL end point at least?
>>
>> When you are thinking about semantics, any thought about the
>> difference between SPARQL, SQL, or some bit-level access to data
>> is totally irrelevant.

Yes. Unfortunately we need a way to get to the semantics, and that way
is via syntax. So having one syntax to learn is much better than
having many.
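
SPARQL is that one query syntax. A minimal query against the little
graph used above (again with illustrative example.org names; any
SPARQL endpoint would accept the same text):

    PREFIX ex:   <http://example.org/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?sub
    WHERE { ?sub rdfs:subClassOf ex:Animal }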

Integrating Disparate Information Systems

2010-11-09 Thread Kingsley Idehen

On 11/9/10 10:23 AM, John F. Sowa wrote:

John,

Great response.  I am cc'ing in the LOD mailing list, as your comments are
pertinent to systems integration and the need to separate Logic from
Syntax, etc.


Others: I encourage you to read on, and digest.


On 11/9/2010 1:24 AM, Alex Shkotin wrote:

What do we need for our information systems to communicate properly?
Integration? Alignment? Unification? Information system education?

The first point I'd emphasize is that IT systems have been successfully
communicating for over a century.  Originally by punched cards, then
by paper tape, magnetic tape, direct connection, and telephone.

When Arpanet was started in 1969, there had been a long history
of experience in data communication.  And the latest conventions
for the WWW are still based on extensions to those protocols.

Those physical formats and layouts are very important for the
technology.  And they will remain buried in systems for ages
upon ages.

But you never, ever want those formats to have the slightest
influence on the semantics.  The decision to force OWL into the
same straitjacket as RDF was hopelessly misguided. In fact, even
the decision to force decidability down the throats of every
ontologist was another profoundly misguided technology-driven
decision.  (Note the subtle semantic distinction between profound
and merely hopeless.)


What kind of language and dictionary do we need to write a question? SPARQL?
What kind of language and dictionary do we need to write an answer? XML, CSV?

Use whatever notation is appropriate for your application.  But you
must design the overall system in such a way that the choice for one
application is *invisible* to anybody who is designing or using some
other application.

Of course, there may be some cases where real-time constraints make it
necessary to avoid a conversion routine between two systems.  But that
is a very low-level optimization that should never affect the semantics.
For example, when was the last time that you thought about the packet
transmissions for your applications?  Some system programmers worry
about those things a lot.  But they're invisible at the semantic level.


Where is your SPARQL end point at least?

When you are thinking about semantics, any thought about the
difference between SPARQL, SQL, or some bit-level access to data
is totally irrelevant.  Please remember that commercial DB systems
provide all those ways of accessing the data if some programmer
who works down at the bit level needs them.  But anybody who is
working on semantics should never think about them (except in
those very rare cases when they go down to the subbasement to
talk with system programmers about real-time constraints.)


JS: "but every application will have... different vocabularies, and different
dialects." Inside. But with a stranger we usually change language to common.

Not necessarily.  Sometimes you learn their language, they learn
your language, or you bring a translator with you.

But it's essential to distinguish three kinds of languages:
natural languages, computer languages, and logic.

For NLs, translation is never exact because they all have hidden
ontology buried down in their lowest levels.  For computer languages,
the level of exactness depends on the amount of buried ontology.

Some computer systems (such as the TCP/IP protocols) do translation
from strings to packets very fast because they don't impose any
constraints on the ontology.  Therefore, programmers above the
lowest system levels never think about those translations.
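
A small Python sketch makes the point concrete (the host and port are
hypothetical; this is an illustration, not anyone's actual protocol
code):

    import socket

    # The payload happens to be a Turtle triple, but the transport
    # neither knows nor cares: it moves bytes, not ontology.
    payload = "ex:Cat rdfs:subClassOf ex:Animal .".encode("utf-8")
    with socket.create_connection(("localhost", 8080)) as s:
        s.sendall(payload)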

For other systems, such as poorly designed software, the ontology
changes in subtle ways with every release and patch to any system.
(I won't name any names, but we've seen such things all too often.)

But first order logic was *discovered* independently by Frege and
Peirce 130 years ago, and *exact* translation between their notations
and all the modern notations for FOL is guaranteed.
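
For example, the same sentence in conventional symbolic notation and
in CLIF, the Common Logic Interchange Format; translation between the
two is purely mechanical:

    ∀x ( Cat(x) → Animal(x) )
    (forall (x) (if (Cat x) (Animal x)))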

Note the word 'discover'.  Frege and Peirce did not *invent* FOL.
My comment is that FOL was standardized by an authority that is
even higher than ISO -- namely, God.  (Please note the Bible,
John 1:1:  "In the beginning was the logos, and the logos was
with God, and God was the logos.")

Nobody has to learn FOL, because it's buried inside their native
language, whatever it may be.  But some notations for FOL are less
readable than others.  That's why I recommend controlled NLs for
many purposes.
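
For instance, a controlled-English sentence and one first-order
reading of it (the mapping is illustrative, in the style of controlled
languages such as Attempto Controlled English):

    Every person that owns a dog likes an animal.
    ∀x ∀y ( (Person(x) ∧ Dog(y) ∧ owns(x,y)) → ∃z (Animal(z) ∧ likes(x,z)) )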

But learning to write FOL is nontrivial, even in a controlled NL.
The reason for the difficulty is that people are used to the
flexibility of their native languages with all that built-in
ontology.  To write pure FOL requires a very strict discipline
to distinguish the logic from the implicit ontology.

Bottom line:  The distinction between logic and ontology is so
important that you should never confuse people with extraneous
issues about bit strings, angle brackets, or even decidability.

John
