Re: Integrating Disparate Information Systems
On Tue, Nov 9, 2010 at 11:39 AM, Kingsley Idehen wrote:
> On 11/9/10 10:23 AM, John F. Sowa wrote:
>
> John,
>
> Great response. I am cc'ing in the LOD mailing list, as your comments are
> poignant re. systems integration and the need to separate Logic from
> Syntax, etc.
>
> Others: I encourage you to read on, and digest.

I have read it, and while it mentions a number of historical points that
might be of interest to younger folk, I also find that it clouds a number
of issues. Comments inline.

>> On 11/9/2010 1:24 AM, Alex Shkotin wrote:
>>>
>>> What do we need for our information systems to communicate properly?
>>> Integration? Alignment? Unification? Information system education?
>>
>> The first point I'd emphasize is that IT systems have been successfully
>> communicating for over a century: originally by punched cards, then by
>> paper tape, magnetic tape, direct connection, and telephone.

For a very limited set of pairs of systems. The movement now is to make it
much more likely that a pair of systems can communicate meaningfully. That
is new. So I don't see the point that is being made by this statement.

>> When Arpanet was started in 1969, there had been a long history of
>> experience in data communication. And the latest conventions for the
>> WWW are still based on extensions to those protocols.
>>
>> Those physical formats and layouts are very important for the
>> technology. And they will remain buried in systems for ages upon ages.
>>
>> But you never, ever want those formats to have the slightest influence
>> on the semantics.

Where do you see the influence of format on semantics being an issue here?
Any language is going to need an encoding. Here we are trying to arrange
things so that, at a minimum, there is at least one syntax that any
communicator can handle - a common denominator.
How many of the historical (and current) systems failed to communicate
because of stupid differences in syntax - bit ordering, choice of
delimiters, other arbitrary choices? We need, effectively, at least one
arbitrary choice that we all agree to work with. But OWL, at least, has a
straightforward translation to and from the portion of logic it is capable
of representing. That portion is not constrained by the syntax, but by
issues you discuss below.

>> The decision to force OWL into the same straitjacket as RDF was
>> hopelessly misguided.

I see only minor inconveniences.

>> In fact, even the decision to force decidability down the throats of
>> every ontologist was another profoundly misguided technology-driven
>> decision. (Note the subtle semantic distinction between profound and
>> merely hopeless.)

There was no global decision to do anything of the sort. There was an
effort to create some standard. When a standard is created, people who
make decisions get the people who work for them to work within that
standard, in the interest of interoperability. So there were thousands of
such decisions. There are other standards; in my view it is interesting to
analyze why they are not as successful. Suggesting that this is due to
some conspiracy, or the choice of a few, doesn't give me confidence that a
deep analysis has been undertaken.

>>> What kind of language and dictionary do we need to write the question?
>>> SPARQL? What kind of language and dictionary do we need to write the
>>> answer? XML, CSV?
>>
>> Use whatever notation is appropriate for your application.

Here we agree.

>> But you must design the overall system in such a way that the choice
>> for one application is *invisible* to anybody who is designing or using
>> some other application.

The overall system? I really don't understand what you are referring to.
There is a standard syntax.
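The byte-ordering point is easy to demonstrate concretely. A minimal Python sketch (my illustration, not from the original thread): the same four bytes decode to different integers depending on which endianness convention the reader assumes - exactly the kind of "stupid difference in syntax" that made historical systems fail to communicate.

```python
import struct

# Encode the integer 1 using the big-endian convention.
raw = struct.pack(">I", 1)            # b'\x00\x00\x00\x01'

# A reader that agrees on the convention recovers the value.
big = struct.unpack(">I", raw)[0]     # 1

# A reader that assumes the other byte order gets garbage.
little = struct.unpack("<I", raw)[0]  # 16777216 (0x01000000)

print(big, little)  # 1 16777216
```

Nothing about the *meaning* of the number changed; only the arbitrary encoding convention did. Agreeing on one such convention is what removes the problem.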
Anyone is now able to write a tool that takes their favorite syntax and
translates it into some other syntax for which a translator to RDF/XML has
been written. We are in a culture of open source. Over time there will be
enough translators that, for all intents and purposes, there will be no
reason why what you suggest is not feasible. But are you suggesting this
could or should have happened from the outset? Standardization that serves
all needs?

>> Of course, there may be some cases where real-time constraints make it
>> necessary to avoid a conversion routine between two systems. But that
>> is a very low-level optimization that should never affect the
>> semantics. For example, when was the last time that you thought about
>> the packet transmissions for your applications? Some system programmers
>> worry about those things a lot. But they're invisible at the semantic
>> level.

As is the case for our current stack.

>>> Where is your SPARQL endpoint, at least?
>>
>> When you are thinking about semantics, any thought about the difference
>> between SPARQL, SQL, or some bit-level access to data is totally
>> irrelevant.

Yes. Unfortunately we need a way to get to the semantics, and that way is
via syntax. So having one syntax to learn is much better than having many
to learn.
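The "one syntax vs. many syntaxes, same semantics" point can be sketched concretely. The following toy Python example (my illustration, not from the thread; the parsers are deliberately naive and handle only IRIs, not literals or blank nodes) reads the same two triples from an N-Triples-like syntax and from a CSV layout, and checks that the resulting triple sets - the semantics-bearing content - are identical.

```python
import csv
import io

def parse_ntriples(text):
    """Toy parser for lines of the form '<s> <p> <o> .'
    (IRIs only; no literals, blank nodes, or escapes)."""
    triples = set()
    for line in text.strip().splitlines():
        s, p, o = (tok.strip("<>") for tok in line.rstrip(" .").split())
        triples.add((s, p, o))
    return triples

def parse_triple_csv(text):
    """Toy parser for 'subject,predicate,object' CSV rows."""
    return {tuple(row) for row in csv.reader(io.StringIO(text.strip()))}

nt = """
<http://ex.org/alice> <http://ex.org/knows> <http://ex.org/bob> .
<http://ex.org/bob> <http://ex.org/knows> <http://ex.org/carol> .
"""

csv_text = """
http://ex.org/alice,http://ex.org/knows,http://ex.org/bob
http://ex.org/bob,http://ex.org/knows,http://ex.org/carol
"""

# Two syntaxes, one semantics: both encodings denote the same triple set.
assert parse_ntriples(nt) == parse_triple_csv(csv_text)
```

Once both syntaxes map into a common data model, the choice between them is invisible to anything downstream - which is all that a "standard syntax" has to buy you.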
>> Please remember that commercial DB systems provide all those ways of
>> accessing the data if some programmer who works down at the bit level
>> needs them. But anybody who is working on semantics should never think
>> about them (except in those very rare cases when they go down to the
>> subbasement to talk with the system programmers about real-time
>> constraints).
>>
>>> JS: "but every application will have... different vocabularies, and
>>> different dialects."
>>>
>>> Inside. But with a stranger we usually change language to a common one.
>>
>> Not necessarily. Sometimes you learn their language, they learn your
>> language, or you bring a translator with you. But it's essential to
>> distinguish three kinds of languages: natural languages, computer
>> languages, and logic.
>>
>> For NLs, translation is never exact, because they all have hidden
>> ontology buried down in their lowest levels. For computer languages,
>> the level of exactness depends on the amount of buried ontology. Some
>> computer systems (such as the TCP/IP protocols) do translation from
>> strings to packets very fast because they don't impose any constraints
>> on the ontology. Therefore, programmers above the lowest system levels
>> never think about those translations. For other systems, such as poorly
>> designed software, the ontology changes in subtle ways with every
>> release and patch to any system. (I won't name any names, but we've
>> seen such things all too often.)
>>
>> But first-order logic was *discovered* independently by Frege and
>> Peirce 130 years ago, and *exact* translation between their notations
>> and all the modern notations for FOL is guaranteed. Note the word
>> 'discover': Frege and Peirce did not *invent* FOL. My comment is that
>> FOL was standardized by an authority that is even higher than ISO --
>> namely, God. (Please note the Bible, John 1:1: "In the beginning was
>> the logos, and the logos was with God, and God was the logos.")
>>
>> Nobody has to learn FOL, because it's buried inside their native
>> language, whatever it may be. But some notations for FOL are less
>> readable than others. That's why I recommend controlled NLs for many
>> purposes. But learning to write FOL is nontrivial, even in a controlled
>> NL. The reason for the difficulty is that people are used to the
>> flexibility of their native languages, with all that built-in ontology.
>> To write pure FOL requires a very strict discipline to distinguish the
>> logic from the implicit ontology.
>>
>> Bottom line: the distinction between logic and ontology is so important
>> that you should never confuse people with extraneous issues about bit
>> strings, angle brackets, or even decidability.
>>
>> John
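As an illustration of the claim that FOL translates exactly between notations (a standard textbook-style example, not taken from the thread), here is the same sentence in a controlled NL, in predicate-calculus notation, and in CLIF, the ISO Common Logic interchange syntax:

```text
Controlled English:      Every cat is on some mat.

Predicate calculus:      ∀x (Cat(x) → ∃y (Mat(y) ∧ On(x, y)))

CLIF:                    (forall (x) (if (Cat x)
                           (exists (y) (and (Mat y) (On x y)))))
```

All three express the same proposition, and a translator between the formal notations can be exact - unlike translation between natural languages, where the hidden ontology prevents exactness.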