On 22/11/12 11:22, Yury Katkov wrote:
> The premises are clear:
> (1) the current implementation of the max parser function is slow and
> (2) there is a workaround for making max queries quicker.
>
> The conclusion is not clear: "let's drop the max ASAP".
>
> It's not that hard to replace the current implementation of the max format
> with a faster one while preserving backward compatibility.

It would be hard to do this (cleanly) in an automated way, because it
requires changes to many other query parameters that take effect before
format (format is the last thing to play a role in query evaluation).
Also, as Jeroen illustrated, there are cases where the user really wants
to have a "local" maximum, and we would not be able to recognise these.
So let us just keep it as it is but improve the docs (maybe this is what
you meant).
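
To make the difference between the two query patterns concrete, here is a
rough Python sketch. It is not SMW code: the page list, the population
figures, the function names and the limit of 3 are all made up for
illustration, and it assumes that sorting by a property only considers
pages that have a value for it.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for wiki pages: (title, population or None).
Page = Tuple[str, Optional[int]]


def max_by_scanning(pages: List[Page], limit: int) -> Optional[int]:
    """Mimic format=max on an unsorted query: take the first `limit`
    pages in alphabetical order, then scan them for the largest value."""
    first_pages = sorted(pages, key=lambda p: p[0])[:limit]
    values = [pop for _, pop in first_pages if pop is not None]
    return max(values) if values else None


def max_by_sorting(pages: List[Page]) -> Optional[int]:
    """Mimic sort + order=DESC + limit=1: order the pages that have a
    value (assumption) by that value and return only the top one."""
    with_value = [p for p in pages if p[1] is not None]
    top = sorted(with_value, key=lambda p: p[1], reverse=True)[:1]
    return top[0][1] if top else None


# Illustrative data only; the numbers are not real statistics.
pages = [
    ("Amsterdam", 800_000),
    ("Berlin", 3_500_000),
    ("Cairo", None),
    ("Tokyo", 13_000_000),
]

print(max_by_scanning(pages, limit=3))  # 3500000 (Tokyo is cut off by the limit)
print(max_by_sorting(pages))            # 13000000
```

With a limit smaller than the number of pages, the scanning variant can
miss the true maximum, while the sorted variant fetches a single result.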

Markus


> -----
> Yury Katkov, WikiVote
>
>
>
> On Thu, Nov 22, 2012 at 1:23 PM, Markus Krötzsch
> <mar...@semantic-mediawiki.org> wrote:
>> Hi,
>>
>> I would like to ask about this:
>>
>> http://semantic-mediawiki.org/wiki/Help:Max_format
>>
>> I am afraid to say that this idea seems to be fundamentally broken. The
>> above page seriously suggests finding the largest population number in
>> the wiki by querying for a list of *all cities with and without
>> population* and invoking PHP code that scans through this list to find the
>> maximum (this is what format=max does, AFAIK). The query to do this is:
>>
>> {{#ask: [[Category:City]]
>> | ?Population
>> | format=max
>> }}
>>
>> This is an extremely slow method of producing wrong results (the results
>> will be wrong as soon as there are enough pages in the wiki that the
>> one with the maximum value falls beyond the default query limit when
>> ordering results alphabetically).
>>
>> What one would do instead is to ask for the one result that has the
>> largest value right away, like this:
>>
>> {{#ask: [[Category:City]]
>> | ?Population
>> | sort=population
>> | order=DESC
>> | limit=1
>> | format=max
>> }}
>>
>> The max format in this case is obsolete, since one could also just do
>>
>> {{#ask: [[Category:City]]
>> | ?Population=
>> | mainlabel=-
>> | sort=population
>> | order=DESC
>> | limit=1
>> }}
>>
>> This has the big advantage that one can also use further output
>> formatting on the resulting number, e.g., to get it in a plain format
>> without any beautification.
>>
>>
>> I just noted these problems since there seem to be cases where PHP runs
>> out of time/memory due to users following the above query anti-pattern
>> [1]. My conclusion would be: let's drop max/min as soon as possible and
>> change the documentation to give the efficient query pattern I gave above.
>>
>> Markus
>>
>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=42347
>>
>> ------------------------------------------------------------------------------
>> Monitor your physical, virtual and cloud infrastructure from a single
>> web console. Get in-depth insight into apps, servers, databases, vmware,
>> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
>> Pricing starts from $795 for 25 servers or applications!
>> http://p.sf.net/sfu/zoho_dev2dev_nov
>> _______________________________________________
>> Semediawiki-devel mailing list
>> Semediawiki-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
>

