Re: [SMW-devel] [PATCH] Support LIKE in queries

2008-01-02 Thread Thomas Bleher
* Markus Krötzsch [EMAIL PROTECTED] [2008-01-02 08:37]:
 On Sonntag, 30. Dezember 2007, Thomas Bleher wrote:
  * Markus Krötzsch [EMAIL PROTECTED] [2007-12-30 22:10]:
   OK, my conclusion now was to support the following syntax:
  
   [[property% *subs?r*]]
  
   where ? and * represent _ and % in SQL.
 
  I think this is fine generally, but now you cannot query for a literal * or
  ? anymore, AFAIK.
 
 I would not consider this to be a major issue, given that those characters
 are not too common in typical application strings, and given the fact that
 using ? still queries for some symbol in that place -- it seems to be
 very unlikely that two strings differ only in one position where the query
 string has a ?. So in most cases it will have the same hits anyway (yes,
 there are some cases that could be problematic [1] ;).

Agreed.

 Anyway, I will let this issue rest until a user actually complains
 about this limitation.

Here I have to respectfully disagree.
It seems unwise to wait until someone complains, when there is already a
patch resolving the issue. Why spend more time later on when the issue
can just be fixed right now?

OK, the regexes were not very readable, but they don't really make the
code more complicated.

FWIW, the regexes were so ugly only because backslashes have to be escaped
twice for PHP's preg_replace (so a single \ becomes \\\\).

If we used ! as the escape character instead of \, the regexes would look
like this (untested):

// escape % and _, which are special in SQL LIKE
$value = str_replace(array('%', '_'), array('!%', '!_'), $value);
// if there's an even number of !, change * to %
$value = preg_replace('/(?<!!)((?:!!)*)\*/', '$1%', $value);
// ditto for ? and _
$value = preg_replace('/(?<!!)((?:!!)*)\?/', '$1_', $value);
// if there's an odd number, * was escaped and should stay as is;
// but the last ! is removed
$value = preg_replace('/(?<!!)((?:!!)*)!\*/', '$1*', $value);
// ditto for ?
$value = preg_replace('/(?<!!)((?:!!)*)!\?/', '$1?', $value);

(?: ) is a subexpression for grouping, not capturing,
(?<! ) is zero-width negative look-behind (i.e. we make sure that the
character before our match is not !).
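
For comparison, the same translation can be sketched as one pass over the string instead of a chain of regexes. This is only an illustration under assumptions: wildcardToLike() is a hypothetical helper (not part of SMW), ! is the escape character both in the user pattern and in the generated SQL, and the query would then have to say LIKE ... ESCAPE '!'.

```php
<?php
// Hypothetical helper: translate a pattern using * and ? as wildcards and
// ! as escape character into an SQL LIKE pattern for "LIKE ... ESCAPE '!'".
function wildcardToLike($pattern) {
    $out = '';
    $len = strlen($pattern);
    for ($i = 0; $i < $len; $i++) {
        $c = $pattern[$i];
        if ($c === '!' && $i + 1 < $len) {
            // Escaped character: the next character is meant literally.
            $next = $pattern[++$i];
            // Only !, % and _ are special in the LIKE pattern itself.
            if ($next === '!' || $next === '%' || $next === '_') {
                $out .= '!' . $next;
            } else {
                $out .= $next;
            }
        } elseif ($c === '*') {
            $out .= '%';        // multi-character wildcard
        } elseif ($c === '?') {
            $out .= '_';        // single-character wildcard
        } elseif ($c === '%' || $c === '_' || $c === '!') {
            $out .= '!' . $c;   // literal for the user, special in SQL
        } else {
            $out .= $c;
        }
    }
    return $out;
}

echo wildcardToLike('*subs?r*'), "\n"; // %subs_r%
echo wildcardToLike('a!*b'), "\n";     // a*b
```

Scanning byte by byte is safe for UTF-8 input here, because all special characters are ASCII and never occur inside a multi-byte sequence.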

Regards,
Thomas
 
 [1] http://de.wikipedia.org/wiki/Die_drei_%3F%3F%3F
 
 
  Not a huge deal, but before, a_b searched for a, followed by any char,
  followed by b, while a\_b searched for exactly a_b.
 
  Properly escaping everything gets messy rather quickly, as \ can also be
  escaped to query for a literal \, so you need translations like:
 
  ?    = _
  \?   = ?
  \\?  = \\_
  \\\? = \\?
 
  The following regular expressions work fine for me, but unfortunately they
  are quite ugly:
 
  // escape % and _
  $value = str_replace(array('%', '_'), array('\%', '\_'), $value);
  // if there's an even number of \, change * to %
  $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\*/', '$1%', $value);
  // ditto for ? and _
  $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\?/', '$1_', $value);
  // if there's an odd number, * was escaped and should stay as is;
  // but the last \ is removed
  $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\\\\\*/', '$1*', $value);
  // ditto for ?
  $value = preg_replace('/(?<!\\\\)((?:\\\\\\\\)*)\\\\\?/', '$1?', $value);
 
  I think these should be added to SMW, so all characters can be queried.
 
  Regards,
  Thomas
 
 
 
 -- 
 Markus Krötzsch
 Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
 phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
 [EMAIL PROTECTED]    www  http://korrekt.org



signature.asc
Description: Digital signature
-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/___
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel


Re: [SMW-devel] [PATCH] Support LIKE in queries

2008-01-01 Thread Markus Krötzsch
On Sonntag, 30. Dezember 2007, Thomas Bleher wrote:
 * Markus Krötzsch [EMAIL PROTECTED] [2007-12-30 22:10]:
  OK, my conclusion now was to support the following syntax:
 
  [[property% *subs?r*]]
 
  where ? and * represent _ and % in SQL.

 I think this is fine generally, but now you cannot query for a literal * or
 ? anymore, AFAIK.

I would not consider this to be a major issue, given that those characters are
not too common in typical application strings, and given the fact that
using ? still queries for some symbol in that place -- it seems to be
very unlikely that two strings differ only in one position where the query
string has a ?. So in most cases it will have the same hits anyway (yes,
there are some cases that could be problematic [1] ;).

Anyway, I will let this issue rest until a user actually complains
about this limitation.

Regards,

Markus

[1] http://de.wikipedia.org/wiki/Die_drei_%3F%3F%3F





-- 
Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
[EMAIL PROTECTED]    www  http://korrekt.org




Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-31 Thread Markus Krötzsch
On Sonntag, 30. Dezember 2007, Yaron Koren wrote:
 Dan - I doubt that there will ever be both a regex and a wildcard
 option in SMW's query language - that seems like overkill, and somewhat bad
 design. A single such option is enough, and if it happens, behind the
 scenes, to use both SQL's and PHP's pattern-matching capabilities at
 different times, that should be hidden from the user. So I doubt that
 there'll be a need for two different symbols (Markus, or anyone else,
 correct me if I'm wrong).

 So, let me argue in favor of the ~ symbol - hopefully it's not too late
 before the Sunday evening deadline. :) 

There was a drastic change in the parser of MediaWiki 1.12 that has caused
some delay. So the deadline has moved to today ;-)

 The Halo extension is a helpful one, 
 but it's a spinoff of SMW, and thus there's no reason why it should hamper
 design decisions in SMW. That goes for all extensions that use Semantic
 MediaWiki - I know, for my own part, that the extensions I've created have
 to do all sorts of work to be compatible with the different versions of
 SMW. That's as it should be - the spinoffs work around the main
 application. From what I understand, Halo is currently not compatible with
 the most recent versions of SMW anyway, so it needs to be modified anyway -
 there's no need to try to ensure backwards compatibility.

 And, as you point out, that functionality in Halo might not be getting used
 at all - though even if it were, that shouldn't affect how SMW is designed.

OK, I am convinced. Done.

Markus


 -Yaron


Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-30 Thread Yaron Koren
Dan - I doubt that there will ever be both a regex and a wildcard option
in SMW's query language - that seems like overkill, and somewhat bad design.
A single such option is enough, and if it happens, behind the scenes, to use
both SQL's and PHP's pattern-matching capabilities at different times, that
should be hidden from the user. So I doubt that there'll be a need for two
different symbols (Markus, or anyone else, correct me if I'm wrong).

So, let me argue in favor of the ~ symbol - hopefully it's not too late
before the Sunday evening deadline. :) The Halo extension is a helpful one,
but it's a spinoff of SMW, and thus there's no reason why it should hamper
design decisions in SMW. That goes for all extensions that use Semantic
MediaWiki - I know, for my own part, that the extensions I've created have
to do all sorts of work to be compatible with the different versions of SMW.
That's as it should be - the spinoffs work around the main application. From
what I understand, Halo is currently not compatible with the most recent
versions of SMW anyway, so it needs to be modified anyway - there's no need
to try to ensure backwards compatibility.

And, as you point out, that functionality in Halo might not be getting used
at all - though even if it were, that shouldn't affect how SMW is designed.

-Yaron



Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-29 Thread Markus Krötzsch
On Samstag, 29. Dezember 2007, DanTMan wrote:
 A lot of people are accustomed to the ? (single-character match) and *
 (multi-character match) format. It would be easy to escape the '_'s and
 '%'s in a match and then do a replace of ? to _ and * to %. (A little
 preg and \ could still easily escape those.)

Yes, I agree to that. I think, if nobody objects, this fixes the pattern 
syntax. So it remains to find a good symbol for the comparator.

 I don't know about ~ though; in the languages I've used I recall ~
 having something to do with regex. I'd rather save that character in
 case we want to be able to use the REGEXP matching inside of SQL.

 From what I remember, I think most people with only a little insight
 into technical stuff would adjust easiest to using this set:
 =  Equals
 >  Greater than
 >= Greater than or equal to
 <= Less than or equal to
 !  Not
 *  Multi-character match
 ?  Single-character match
 ~  regex

As a note: = is not available in the parser function #ask, since it has a
special meaning as parameter assignment, as e.g. in format=table. The query
is distinguished from the other parameters and print requests in #ask since
it has no = symbol and does not start with ?.


 But I did have a thought about the @... It's not used anywhere afaik.
 I did make a suggestion on using a pattern to separate the comparators
 from the match value. It was using [[Property::comparator::match]], but
 as I now remember SMW lets you use :: to specify multiple properties.
 However it may be a good idea if the separator was one which wouldn't
 cause conflicting issues with other things.

Maybe I should remark that the comparator we choose will never block any symbol
from being used in values. You can always escape the initial comparator by
inserting an initial space (which is ignored in all values). For instance, to
look for pages with property value "<strange value", one could write

[[some property:: <strange value]]

whereas [[some property::<strange value]] would be equivalent to

[[some property::< strange value]]

which matches all values (alphabetically) smaller than "strange value". So we
can pick any comparator symbol without conflicts.
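
A minimal sketch of that rule, assuming the comparator is recognised only as the very first character and that surrounding spaces are trimmed afterwards. splitComparator() and the '=' label for plain equality are hypothetical names for illustration, not SMW's actual code; < stands for "smaller than".

```php
<?php
// Hypothetical helper: split a raw query value into (comparator, value).
// A leading space hides the first symbol from comparator detection.
function splitComparator($raw, $comparators = array('<', '>', '!', '%')) {
    $first = ($raw === '') ? '' : $raw[0];
    if ($first !== '' && in_array($first, $comparators, true)) {
        // First character is a comparator; the rest is the value.
        return array($first, trim((string)substr($raw, 1)));
    }
    // No comparator: plain equality, initial and trailing spaces ignored.
    return array('=', trim($raw));
}

print_r(splitComparator('<strange value'));  // comparator <, value "strange value"
print_r(splitComparator(' <strange value')); // comparator =, value "<strange value"
```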

 @ is not commonly used and
 does provide a little bit of a way for people to understand its use. Or
 if you want a little farther from what can actually be used in a title
 (to avoid clashing with things) the # is always invalid.
 Say, [[prop::comp@match]] or [[prop::comp#match]]. So for a not: [[Has
 value::!@Value]] or [[Has value::!#Value]].

Basically, spaces already play the role of your proposed @ or #.

 I'm probably droning on now... But what about finding a good separator
 and allowing textual names, i.e.: EQ[=], NOT/NEQ/[!] (!= could be thought
 of), LT[<], GT[>], REGEX(P)[~], LIKE[%_], wildcard[*?], etc...

Not sure whether that would be better internationally. < seems to be more
universally understood than LT.

Another remark: ::! stands for inequality (NEQ), not for negation (NOT). It 
looks for pages that have some property value unequal to the one that was 
given, and it does not matter whether or not they also have some value that 
is equal. So a page that is annotated with [[property::1]] and 
[[property::2]] would match a query atom [[property::!1]].
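
The difference between the two readings can be sketched as follows. Both functions are hypothetical illustrations that treat a page's property values as a plain PHP array; they do not reflect how SMW evaluates queries internally.

```php
<?php
// NEQ as described above: some value differs from the given one.
function matchesNeq(array $values, $given) {
    foreach ($values as $v) {
        if ($v !== $given) {
            return true;
        }
    }
    return false;
}

// NOT (which ::! does not mean): no value equals the given one.
function matchesNot(array $values, $given) {
    return !in_array($given, $values, true);
}

// Page annotated with [[property::1]] and [[property::2]], query [[property::!1]]
var_dump(matchesNeq(array('1', '2'), '1')); // bool(true)
var_dump(matchesNot(array('1', '2'), '1')); // bool(false)
```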

 There also is the possibility of, instead of a separator, using brackets
 to encompass a comparator. I can hardly think of many places which would
 use (NOT) at the start of a title ([[Has value::(NOT) Title]]) or, we
 also have the {} and [] type brackets. [] is used by external links, but
 {} is only used in multiples as a template or variable bit but never has
 use singularly, templates and values will have already been parsed out
 so only the singles remain, and as a bonus, { and } are illegal in
 titles. So [[Has value::{NOT} Title]] is guaranteed to never clash with
 a legal title or match you can make. If you're worried about templates
 and parsing issues, those can't occur when you're using something like
 {{{1}}} as the title ([[Has value::{NOT} {{{1}}}]]) so there's no clash.
 The only potential clash is if someone wants to use {{{comparator|EQ}}}
 to specify the comparator. In that case, we could easily make { EQ }
 valid (trim spaces), so { {{{comparator|EQ}}} } would work.

Yes, that would work too. But I am happy with our spaces (the fact that 
initial and trailing spaces are ignored in all property values is the key to 
make that work, and I think there is no harm in assuming that).

There is, in principle, no problem with having multi-char sequences for 
comparators, but I would prefer something that does not require 
internationalisation. So, given that we use * and ? instead of % and _, there 
are the following options:

1-  [[property::%*substring*]]
2-  [[property::#*substring*]]
3-  [[property::~*substring*]] (clashes with Halo)
4-  [[property::@*substring*]]
5-  maybe more ...

My order of preference would be 3, 1, 4, 2, and I opt for 1 due to the Halo 
issue. Further 

Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-29 Thread DanTMan
^_^ ok, I thought we escaped with a \, which isn't something that normal
users would find easy to use. But a starting space escape is ok.

I still would pick ~ as the best thing for use of REGEX and prefer a
different operator for wild cards.
I guess the % is probably best for the wild card operator. Which brings
me to the thought of:

EQ:    [[property::value]]
NEQ:   [[property::!value]]
GT:    [[property::>value]]
LT:    [[property::<value]]
WILD:  [[property::%value]] (Using ? and *)

Also, I propose a few more additions since they will probably have some
good use too.

GTEQ:  [[property::>=value]]
LTEQ:  [[property::<=value]]
NWILD: [[property::!%value]] (Negated wild card)
REGEX: [[property::~value]] or perhaps [[property::~/value/i]] (/ could
of course be replaced with !, [], etc... any delimiter valid in preg).
NGT:   [[property::#>value]] (Natural order greater than)
NLT:   [[property::#<value]] (Natural order less than)
NGTEQ: [[property::#>=value]] (Natural order greater than or equal to)
NLTEQ: [[property::#<=value]] (Natural order less than or equal to)

Of course, the REGEX one is provided that we can fix the issue of
colliding with Halo.
But on the note of that negated wild card: I added that one for one primary
reason. Unlike any of the other things, you cannot negate a wild card
with any other format. (> can be negated with <=, eq with !, and regex
can negate things inside of it. But you can't negate a wild card.) Also,
remember to escape things so that we can use \* and \? to use those
literally; I could draft all the replaces needed, but I've got to go do
something first.
As for the natural order ones, if you don't know what those are for,
it's things like values of 1.2.3 and 1.12.3. Using a normal > it
thinks that 1.2.3 is greater than 1.12.3 because the third character
is a 2 and the third character in the other is a 1. But a natural
order properly distinguishes the second number as 12. PHP has functions
for these built in, and they would be nice to use.
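
To make the 1.2.3 versus 1.12.3 case concrete, here is a small sketch using PHP's built-in strcmp() and strnatcmp(). The two wrapper functions are hypothetical names, used only to show the contrast.

```php
<?php
// Plain byte-wise comparison: '2' sorts after '1' at the third character.
function plainGreater($a, $b) {
    return strcmp($a, $b) > 0;
}

// Natural-order comparison: runs of digits are compared as numbers (2 < 12).
function naturalGreater($a, $b) {
    return strnatcmp($a, $b) > 0;
}

var_dump(plainGreater('1.2.3', '1.12.3'));   // bool(true)
var_dump(naturalGreater('1.2.3', '1.12.3')); // bool(false)
```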

~Daniel Friesen(Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)


Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-28 Thread Markus Krötzsch
On Freitag, 28. Dezember 2007, Yaron Koren wrote:
 How about ~%substring% instead? The ~ is the symbol for pattern matching
 in Perl and some UNIX languages, and it might be a clearer indicator of
 function than %.


I would immediately use that, but IIRC the Halo extension has a similar syntax
for a custom editing-distance database function (requires a modified MySQL
version, and probably also has significant performance issues).

So the question is whether we want to overwrite that (assuming that this 
particular Halo function is not used widely), or is there another idea for 
doing it? Other imaginable operators on my keyboard would be #, , ?, @ -- 
none really as nice as ~ ...

Markus  


 On Dec 27, 2007 2:16 PM, Markus Krötzsch [EMAIL PROTECTED] wrote:
  Thanks. I have applied the patch, and added a way of configuring this
  feature: the parameter $smwgQComparators gives a (|-separated) list of
  supported comparators, and can be used to enable or disable any of <, >,
  !, and %. By default its value is '<|>|!|%'.
 
  In this way one can also disable ! or even <, > if these are considered
  to be problematic.
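
As an illustration of such a setting, a wiki that wants to keep the other comparators but switch off the % (LIKE) comparator might use something like the following. The variable name comes from the message above; the assumption that the list also names the < and > comparators, the placement in LocalSettings.php, and the comparatorEnabled() helper are illustrative guesses rather than SMW's actual code.

```php
<?php
// In LocalSettings.php (assumed location), after Semantic MediaWiki is loaded:
$smwgQComparators = '<|>|!';   // % left out, so pattern queries are disabled

// Hypothetical check, only to show how a |-separated list could be read.
function comparatorEnabled($list, $comparator) {
    return in_array($comparator, explode('|', $list), true);
}

var_dump(comparatorEnabled($smwgQComparators, '%')); // bool(false)
var_dump(comparatorEnabled($smwgQComparators, '!')); // bool(true)
```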
 
  I wonder whether one should use another character instead of % as a
  wildcard inside the pattern string, so that no double-% confusion can
  arise. Would * be an alternative or would it be too confusing w.r.t. the
  old ask print requests? What about +? Corresponding examples
  (preprocessing would in each case ensure full compatibility with SQL):
 
  - %%substring%
  - %*substring*
  - %+substring+
 
  Cheers
 
  Markus
 
  On Donnerstag, 20. Dezember 2007, Asheesh Laroia wrote:
   On Thu, 20 Dec 2007, Thomas Bleher wrote:
Yesterday I needed LIKE queries for properties, so I added it to SMW
(patch attached). It was surprisingly simple.
  
   This would be LIKE TOTALLY AWESOME to get in to Semantic MediaWiki.
  
   It would be great if later SMW could have Valgol support
   http://www.indwes.edu/Faculty/bcupp/things/computer/VALGOL.html.
  
   -- Asheesh.
  
   P.S. In all total like seriousness, queries with LIKE support are a
   good idea
  
   --
   The star of riches is shining upon you.
 
 
  --
  Markus Krötzsch
  Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
  phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
  [EMAIL PROTECTED]    www  http://korrekt.org
 



-- 
Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
[EMAIL PROTECTED]    www  http://korrekt.org




Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-28 Thread DanTMan
A lot of people are accustomed to the ? (single-character match) and * 
(multi-character match) format. It would be easy to escape the '_'s and 
'%'s in a match and then do a replace of ? to _ and * to %. (A little 
preg and \ could still easily escape those.)
I don't know about ~ though; in the languages I've used I recall ~
having something to do with regex. I'd rather save that character in
case we want to be able to use the REGEXP matching inside of SQL.
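
The escape-then-replace idea from the first paragraph can be sketched in one step with strtr(), which substitutes all characters in a single pass, so a % produced from * is never escaped again. simpleWildcardToLike() is a hypothetical name; the sketch assumes the database uses \ as its LIKE escape character (MySQL's default) and does not yet let the user escape * or ? themselves.

```php
<?php
// Hypothetical helper: ? and * become SQL's _ and %, while characters that
// are special in LIKE are escaped with a backslash.
function simpleWildcardToLike($pattern) {
    return strtr($pattern, array(
        '\\' => '\\\\',  // literal backslash
        '%'  => '\\%',   // literal percent sign
        '_'  => '\\_',   // literal underscore
        '*'  => '%',     // multi-character match
        '?'  => '_',     // single-character match
    ));
}

echo simpleWildcardToLike('*subs?r*'), "\n"; // %subs_r%
echo simpleWildcardToLike('50%_a*'), "\n";   // 50\%\_a%
```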


From what I remember, I think most people with only a little insight 
into technical stuff, would adjust easiest to using this set:

=  Equals
>  Greater than
>= Greater than or equal to
<= Less than or equal to
!  Not
*  Multi-character match
?  Single-character match
~  regex

But I did have a thought about the @... It's not used anywhere afaik.
I did make a suggestion on using a pattern to separate the comparators 
from the match value. It was using [[Property::comparitor::match]], but 
as I now remember SMW lets you use :: to specify multiple properties. 
However it may be a good idea if the separator was one which wouldn't 
cause conflicting issues with other things. @ is not commonly used and 
does provide a little bit of a way for people to understand it's use. Or 
if you want a little farther from what can actually be used in a title 
(To avoid clashing with things) the # is always invalid.
Say, [[prop::[EMAIL PROTECTED] or [[prop::comp#match]]. So for a not [[Has 
value::[EMAIL PROTECTED] or [[Has value::!#Value]].
I'm probably droning on now... But what about finding a good separator 
and allowing textual names, i.e. EQ[=], NOT/NEQ[!] (!= could also be 
considered), LT[<], GT[>], REGEX(P)[~], LIKE[%_], wildcard[*?], etc.?
There is also the possibility of using brackets to enclose a comparator 
instead of a separator. I can hardly think of many places which would 
use (NOT) at the start of a title ([[Has value::(NOT) Title]]). We 
also have the {} and [] type brackets. [] is used by external links, but 
{} is only used in multiples, as a template or variable, and never 
singly. Templates and variables will have already been parsed out, 
so only the singles remain, and as a bonus, { and } are illegal in 
titles. So [[Has value::{NOT} Title]] is guaranteed to never clash with 
a legal title or match you can make. If you're worried about templates 
and parsing issues, those can't occur when you're using something like 
{{{1}}} as the title ([[Has value::{NOT} {{{1}}}]]), so there's no clash. 
The only potential clash is if someone wants to use {{{comparator|EQ}}} 
to specify the comparator. In that case, we could easily make { EQ } 
valid (trim spaces), so { {{{comparator|EQ}}} } would work.
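
The brace proposal above could be parsed roughly as follows. This is only a sketch: the function name splitComparator, the default of EQ, and the restriction to alphabetic comparator names are all assumptions made for illustration, not anything SMW implements.

```php
<?php
// Hypothetical parser for the proposed {COMPARATOR} prefix.
// Returns array(comparator, value).
function splitComparator($value, $default = 'EQ') {
    // Optional {...} prefix; spaces inside the braces are ignored so
    // that "{ EQ }" is accepted as well as "{EQ}".
    if (preg_match('/^\{\s*([A-Za-z]+)\s*\}\s*(.*)$/s', $value, $m)) {
        return array(strtoupper($m[1]), trim($m[2]));
    }
    // No prefix: fall back to the default comparator.
    return array($default, trim($value));
}
```

With this, '{NOT} Title' yields NOT and Title, and the expanded template form '{ EQ } Foo' yields EQ and Foo.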


But... now I'm droning a bit much...

~Daniel Friesen(Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)

Markus Krötzsch wrote:

On Freitag, 28. Dezember 2007, Yaron Koren wrote:
  

How about ~%substring% instead? The ~ is the symbol for pattern matching
in Perl and some UNIX languages, and it might be a clearer indicator of
function than %.




I would immediately use that, but IIRC the Halo extension has a similar syntax 
for a custom editing-distance database function (requires a modified MySQL 
version, and probably also has significant performance issues).


So the question is whether we want to overwrite that (assuming that this 
particular Halo function is not used widely), or is there another idea for 
doing it? Other imaginable operators on my keyboard would be #, , ?, @ -- 
none really as nice as ~ ...


Markus  

  

On Dec 27, 2007 2:16 PM, Markus Krötzsch [EMAIL PROTECTED] wrote:


Thanks. I have applied the patch, and added a way of configuring this
feature: the parameter $smwgQComparators gives a (|-separated) list of
supported comparators, and can be used to enable or disable any of <, >,
!, and %. By default its value is '<|>|!|%'.

In this way one can also disable ! or even <, > if these are considered
to be problematic.

I wonder whether one should use another character instead of % as a
wildcard inside the pattern string, so that no double-% confusion can
arise. Would * be an alternative or would it be too confusing w.r.t. the
old ask print requests? What about +? Corresponding examples
(preprocessing would in each case ensure full compatibility with SQL):

- %%substring%
- %*substring*
- %+substring+

Cheers

Markus

On Donnerstag, 20. Dezember 2007, Asheesh Laroia wrote:
  

On Thu, 20 Dec 2007, Thomas Bleher wrote:


Yesterday I needed LIKE queries for properties, so I added it to SMW
(patch attached). It was surprisingly simple.
  

This would be LIKE TOTALLY AWESOME to get in to Semantic MediaWiki.

It would be great if later SMW could have Valgol support
http://www.indwes.edu/Faculty/bcupp/things/computer/VALGOL.html.

-- Asheesh.

P.S. In all total like seriousness, queries with LIKE support are a
good 

Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-27 Thread Markus Krötzsch
Thanks. I have applied the patch, and added a way of configuring this feature: 
the parameter $smwgQComparators gives a (|-separated) list of supported 
comparators, and can be used to enable or disable any of <, >, !, and %. By 
default its value is '<|>|!|%'.

In this way one can also disable ! or even <, > if these are considered to be 
problematic.

I wonder whether one should use another character instead of % as a wildcard 
inside the pattern string, so that no double-% confusion can arise. Would * 
be an alternative or would it be too confusing w.r.t. the old ask print 
requests? What about +? Corresponding examples (preprocessing would in each 
case ensure full compatibility with SQL):
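
As a sketch of how this setting might be used: the first line is what a wiki's LocalSettings.php could contain to switch off the % comparator, assuming the supported comparators are <, >, ! and %. The helper comparatorEnabled is hypothetical and only illustrates how a |-separated list can be checked; it is not SMW's actual code.

```php
<?php
// Assumed LocalSettings.php line: keep <, > and !, but disable %.
$smwgQComparators = '<|>|!';

// Hypothetical helper: is $comparator part of the |-separated list?
function comparatorEnabled($comparator, $configured) {
    return in_array($comparator, explode('|', $configured), true);
}
```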

- %%substring%
- %*substring*
- %+substring+

Cheers

Markus

On Donnerstag, 20. Dezember 2007, Asheesh Laroia wrote:
 On Thu, 20 Dec 2007, Thomas Bleher wrote:
  Yesterday I needed LIKE queries for properties, so I added it to SMW
  (patch attached). It was surprisingly simple.

 This would be LIKE TOTALLY AWESOME to get in to Semantic MediaWiki.

 It would be great if later SMW could have Valgol support
 http://www.indwes.edu/Faculty/bcupp/things/computer/VALGOL.html.

 -- Asheesh.

 P.S. In all total like seriousness, queries with LIKE support are a good
 idea

 --
 The star of riches is shining upon you.




-- 
Markus Krötzsch
Institut AIFB, Universität Karlsruhe (TH), 76128 Karlsruhe
phone +49 (0)721 608 7362    fax +49 (0)721 608 5998
[EMAIL PROTECTED]    www  http://korrekt.org




Re: [SMW-devel] [PATCH] Support LIKE in queries

2007-12-20 Thread Asheesh Laroia
On Thu, 20 Dec 2007, Thomas Bleher wrote:

 Yesterday I needed LIKE queries for properties, so I added it to SMW
 (patch attached). It was surprisingly simple.

This would be LIKE TOTALLY AWESOME to get in to Semantic MediaWiki.

It would be great if later SMW could have Valgol support 
http://www.indwes.edu/Faculty/bcupp/things/computer/VALGOL.html.

-- Asheesh.

P.S. In all total like seriousness, queries with LIKE support are a good 
idea

--
The star of riches is shining upon you.
