On Montag, 9. Juni 2008, Reini Urban wrote:
> 2008/6/8 Rolf Lampa <[EMAIL PROTECTED]>:

<snip>

>
> Articifical Intelligence had the simple concept of unification which
> is basically
> the concept of #ask.
> You match a pattern against a list of facts/rules, and recursively bind
> variables to the result.
> In the end you get a list of matched patterns with the bound variables.
> You might know that from perl regex pattern matching, though the perl
> syntax is considerably limited compared to the original lisp or prolog
> syntax. But even the perl matcher is still better than this proposal here.
> Not comparing to typical prolog or lisp matchers.
>
> Where is the logic? 

We are discussing syntax here, but, since you are asking: the logic on which 
OWL is based is description logic, and variable-free syntax is common there. 
Moreover, this is so for good reasons, that I shall try to explain.

> Where are the bound variables? I see only an AND combination and one public
> variable and two unnamed internal chained variables. 

There are no variables in the syntax. I also do not agree that the presence of 
variables equals simplicity or usability, since many users are not at home 
with variables either. But there are better reasons not to have them. Read 
on.

>
> {{#ask: [[works at::<q>[[located
> in::<q>[[population::>1000000]]</q>]]</q>]]}} to find someone working at an
> organisation located in a place with more than 1000000 inhabitants.
>
> Would be written as (reverse lisp syntax):
> (match (and (works_at _who _org) (located_in _org _place)(population
> _place _num)(> _num 10000000))
>          *database*)
> => (((_who someone)(_org name)(_place city)(_num 12000000))
>       (...)))
> but usually in more readable horn clauses.
> Wiki syntax should be simple and an advantage for the untrained user,
> not some horrifying xml mess.
>
> What you are trying here is using xml syntax as () replacement
> (<q>...</q>), but without
> using the advantages of nested trees.

Indeed, using <q> for () is not nice, and we would like to change it as well. 
The . notation is a special case where one could entirely get rid of () and 
<q>, hence this discussion. The reason why we do not use () is that it is not 
parseable in a semantically unique way: () may be part of query values. Much 
special syntax in wikis is rather complex because almost all symbols can 
occur directly without any escaping. We have to adjust to that, whether we 
like it or not.


> You omit the vars, which are temporarily bound and should most likely
> be returned to the user.

I have to disagree. We had that in very early SMW releases, and it leads to 
result formats that are not useful in practice, simply because you then 
retrieve all possible combinations of all bindings (as we know it from 
Prolog). Thus you get every result multiple times, the overall result becomes 
less readable, and the result size and query execution time increases a lot. 
Storing values for each of our "implicit" variables also prevents various 
optimisation methods in query execution. Summing up, this feature decreases 
performance and usability, even though it would sometimes be useful.

> You omit the possibility to do logic calculation in the query, the
> pattern matcher.

Logic calculation? Not being sure what you mean here, I just note that we have 
a declarative semantics. Things are, from a user perspective, 
not "calculated". There is no "control flow" or "logic as program" involved: 
A query *describes* a set of results, and SMW returns this set.

> Well, you can seperate the var definition which will be presented as
> result from the logic.
> This could be a seperate argument to the function.
>   (#ask rules-with-vars-to-be-bound format-to-represent-bound-vars)

We have no rules either, but of course one could have a conjunctive query (a 
conjunction of query atoms with variables and constant names), and compute 
results for that.

>
> [[..]] denotes in link syntax the fact and rule representation. okay.
> better than nothing (whitespace and operator precedence rules)

Again, this is mostly wiki-syntax legacy. We need to stay with syntactic 
constructs that MediaWiki users are already familiar with.

> <q>...</q> denote recursive subqueries. well, also better than nothing.

Indeed.

>
> But with this simple tail-recursive-only chain you are very limited in
> your query language.

Yes, exactly! This is absolutely on purpose -- a feature, not a bug. The 
reason is simple: we want a system that still works with large amounts of 
data. As you may know, the kind of queries you describe are NP-complete, even 
if we do not admit any schema information in the language. This means, that 
the best known algorithm for answering these queries requires *exponentially* 
long time in the size of the query. With a basic algorithm, you would have 
the size of the wiki database *to the power of* the number of variables in 
your query. I know of no practical implementation that could do that in the 
use cases that we already cater for.

In comparison, SMW's querying complexity without schema information is 
LogSpace, i.e. strictly *below* linear time. If you add property and class 
hierarchies, then it becomes linear in the size of these hierarchies (which 
is already a problem for really large hierarchies, but not so much in the 
typical SMW installation).

If we add some simple OWL DL features, conjunctive querying becomes PSpace 
complete (when there are just non-recursive role chains), ExpTime complete 
(for logics with disjunction in subclass axioms), or even 2ExpTime complete 
(for OWL Lite and similar logics). For OWL DL, it is not currently known 
whether conjunctive queries are decidable at all.

What you ask for is an increase of expressive power that is just beyond what 
we can practically implement for a large DB-based data store. This is the 
reason why we restrict to "tree-shaped" queries that can be efficiently 
executed in databases. Of course, you can still say that it would be better 
to have explicit variables, but then you would have a query language that can 
express queries that are not tree shaped. How would you explain to a user 
which queries are admissible and which are not? The current syntax makes the 
tree-structure implicit -- you just cannot create anything else -- without 
even requiring to explain the problems I just sketched to our users. I think, 
in spite of some ugly syntactic details, this is rather elegant.


> Also, what you cannot do with this syntax, is OR and NOT, and also not
> the explicitly bound vars.

OR is supported (in some places it is already, in others it lacked a syntax 
and implementation which was added for SMW 1.2, to be released).

NOT of course would only make sense in a "closed world" fashion, since the 
wiki does not entail any negative information. I think such a NOT would be a 
suitable addition. Of course it is not available for OWL DL, but it would 
only be allowed in queries and therefore there would be no problem with 
exporting it. Maybe we can add that in future releases.

>
> Better allow variables and find a syntax against the [[ ]] and xml mess.

As I said, the XML-like tags are about to go. The [[]] certainly is going to 
stay, since it is a basic syntax feature of MediaWiki.

> I would consult an AI expert who can explain the basics of a matcher /
> unifier and the typical query languages.

I understand. You are of course welcome to ask. The problem with logic 
programming based approaches of course is that most systems fail when dealing 
with large databases. The reason is that typical implementations assume 
in-memory availability of all data, which is not feasible for a large 
multi-user database system.

>
> phpwiki uses simple php syntax for the allowed mathematical
> expressions in its query language,

This is yet another topic: should there be mathematical built-ins in the query 
language? This is discussed for OWL 2 (but still very tentative), and maybe 
the outcomes of this could inspire SMW. But mathematical reasoning is not so 
simple, since you need high-precision math or something like interval 
arithmetic to deal with it.

> with typical () nesting and operator precedence and temporary
> variables, which are bound
> to the found results.
> The lisp or Visual Basic alike syntax (for _var ...) binds the
> pagename _var to the subquery.
> (with is also nice).
> Optionally you can define a formatting template to present the list of
> bound vars as list,
> for the default html output.
>
> A simple query with one implicit var, the pagename, and the attribute
> population:
>
> <?plugin SemanticSearchAdvanced
>        s="is_a::city and (population < 1.000.000 or population >
> 10.000.000)" ?>
>
> <?plugin SemanticSearchAdvanced
>       s="(is_a::city or is_a::country) and population < 10.000.000" ?>

For comparison, the SMW version of that:

{{#ask: [[Category:City||Country]] [[Population::<10,000,000]] }}

>
> Now your example with subqueries and temp. variables:
>
> <?plugin SemanticSearchAdvanced
>   s="works_at::_organization
>         and (for _organization located_in::_city
>                and (for _city population>1000000))" ?>
>
> => table of
> _organization _city population

I do not see that this is simpler than the SMW one-liner with the . syntax. 
And again, imagine how to educate a user to ask only tree-shaped queries when 
using the above format ...

>
> and now
>
> <?plugin SemanticSearchAdvanced
>   s="works_at::_organization
>         and (for _organization
>                      (located_in::_city and (for _city is_a::City and
> population>1000000))
>                  or (located_in::_country and (for _country
> is_a::Country and population>5000000)))" ?>
> => table of
> _organization _city _country population


Mmh, I understand that we all have syntax preferences, but this example does 
not appear to be so appealing to me. But it is good to see that phpwiki 
addresses your needs already. No wiki can ever meet all possible 
requirements, so maybe phpwiki is just more well suited for your needs?

-- Markus

-- 
Markus Krötzsch
Semantic MediaWiki    http://semantic-mediawiki.org
http://korrekt.org    [EMAIL PROTECTED]

Attachment: signature.asc
Description: This is a digitally signed message part.

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel

Reply via email to