Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-23 Thread Nikola Smolenski

On 23/05/12 13:53, Daniel Kinzler wrote:

On 23.05.2012 13:14, Nikola Smolenski wrote:

{{#data:color|item=Blah}} - this uses item linked to "Blah" in the local 
language.
{{#data:color|item=en:Blah}} - this uses item linked to "Blah" in English 
language.
{{#data:color|id=q123}} - this uses item with ID q123.
{{#data:Blah->color}} - we can do this since ->  can't appear in a page name -
this is my favorite of course :)


3) no explicit reference/use of the item at all. Just use parser function to
access properties, and specify the item if need be (otherwise, the page's "own"
item is used).


That is what I am suggesting, yes.


The parser function should be able to override itself by template parameters - I
believe it is possible to do this.


That makes the hair on my neck stand up :)


From the user point of view or the implementation point of view? :)


Both. And as Gabriel pointed out, something like this would be really bad for
things like the visual editor, snippet caching, etc.


How is it different from overwriting the value of a data item in the initial
suggestion?



If there will be no pre-loading, disregard this.


Don't know what you mean by "pre"... the item will be loaded into memory once,
whenever the page is rendered. It would typically be loaded from a local cache
(e.g. in the database); an HTTP request to the repo while rendering would be
annoyingly slow.


This is what I mean. An alternative would be to load every assertion 
from the DB at the time it is requested.


There is also a hybrid possibility: load what we expect to be used (this
would probably be the entire item in the default language), then load every
remaining assertion when it is requested.
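
The hybrid option can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `HybridItemCache`, the `fetch` callback and the `(property, language)` key are hypothetical and not part of any Wikidata or MediaWiki API. The point is that preloaded default-language values are served from memory, while anything else costs one lookup the first time it is requested.

```python
from typing import Callable, Dict, Tuple

# Hypothetical lookup: (property, language) -> value, standing in for the DB.
Fetcher = Callable[[str, str], str]


class HybridItemCache:
    """Preload one language eagerly; load other assertions lazily, once each."""

    def __init__(self, preloaded: Dict[str, str], default_lang: str, fetch: Fetcher):
        # Everything we expect to be used is in memory before rendering starts.
        self._values: Dict[Tuple[str, str], str] = {
            (prop, default_lang): value for prop, value in preloaded.items()
        }
        self._fetch = fetch
        self.lazy_loads = 0  # assertions that the preload did not cover

    def get(self, prop: str, lang: str) -> str:
        key = (prop, lang)
        if key not in self._values:
            # Not preloaded: fetch on first request and remember the result.
            self._values[key] = self._fetch(prop, lang)
            self.lazy_loads += 1
        return self._values[key]


# Made-up data for the example.
store = {("name", "en"): "Berlin", ("name", "sr"): "Берлин"}
cache = HybridItemCache({"name": "Berlin"}, "en", lambda p, l: store[(p, l)])
```

With this model, `cache.get("name", "en")` never touches the store, and `cache.get("name", "sr")` touches it exactly once no matter how often the infobox asks for it.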


A quick calculation: infobox at http://en.wikipedia.org/wiki/Berlin has 
some 2K of parameters, and the article exists in 200 languages, so that 
would amount to 400K of data. Of course, not all data will be 
translatable or translated, but still 100K does not seem unreasonable.
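
The estimate above, written out as a small Python sketch. The function name and the `translated_fraction` parameter are assumptions made for this illustration; the 25% figure is simply the fraction that turns the 400K upper bound into the 100K guess and is not a measured value.

```python
def estimate_item_bytes(bytes_per_language: int, languages: int,
                        translated_fraction: float = 1.0) -> int:
    """Rough size of one item if every language carries the same payload."""
    return int(bytes_per_language * languages * translated_fraction)


# "some 2K of parameters" times "200 languages": the upper bound.
upper_bound = estimate_item_bytes(2 * 1024, 200)

# Assumed: only a quarter of the data is actually translated.
guess = estimate_item_bytes(2 * 1024, 200, translated_fraction=0.25)

print(upper_bound // 1024, "K upper bound;", guess // 1024, "K guess")
```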



Except that it's not wrapped in any HTML. Perhaps there should be an option to
{{#data-value}} to turn that off completely, using form=plain or some such.


Now shorten #data-value to #data, use form=plain as the default and that's it :)


Yes, we can drop the "simple" parameter-like syntax and use parser functions for
everything.

The remaining question is... how do I specify the item I want to get the
property from (if it's not the default)? Will we be assigning local names to


Well, this is what we talked about above, at the start of this
e-mail, right?



items, using #load-data or some such? Or will we just use the item id directly -
which would probably be a parameter, so we'd end up with something like this:

   {{#property:population|item={{{item-id|*}}}}}

...with * representing the default (the page's "own") item. Not very pretty,


The default would be used if item is not specified. I'm not sure I
understand why you are introducing this.



I would try not to introduce new syntax if it is not necessary. How about this:

[[wikidata:Berlin]] - links to en.wikidata.org/wiki/Berlin
[[wikidataid:q1234]] - links to en.wikidata.org/id/q1234
{{canonicalurl:wikidata:Berlin|action=edit}} - links to edit page

All of this syntax already exists, is widely used and could be introduced
without additional coding :)


You'd have to do  [[wikidata:id/{{#property:id}}|the data item]].


Again, I don't see why.


But this doesn't work for edit links. Especially not if the edit link is
supposed to invoke the on-site ajax editing interface. How do you generate a
link/button for doing that?


Yes, for that you have to have a new parser function, similar to 
canonicalurl above.




___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-23 Thread Daniel Kinzler
On 23.05.2012 13:14, Nikola Smolenski wrote:
> I see multiple possibilities:
> 
> {{#data:color|item=Blah}} - this uses item linked to "Blah" in the local 
> language.
> {{#data:color|item=en:Blah}} - this uses item linked to "Blah" in English 
> language.
> {{#data:color|id=q123}} - this uses item with ID q123.
> {{#data:Blah->color}} - we can do this since -> can't appear in a page name -
> this is my favorite of course :)

I agree that it would be nice to allow addressing by wiki page as well as by
item id. The exact syntax should be the same that we will also use for
interwiki links to the repository. It's still to be decided.

> Every template or article could read any item without the need to pass it.

This is really the major point... do we

1) want to *pass* a data item to a template for formatting?

2) or do we want the template to control the loading of the data item?

I prefer 1), but many people appear to favor 2), and I am beginning to get the
impression that 2) is easier and cleaner to implement.

We could even do

3) no explicit reference/use of the item at all. Just use parser function to
access properties, and specify the item if need be (otherwise, the page's "own"
item is used).

> By the way, in some cases a single assertion might have multiple sources;
> also, a single source might support multiple assertions. This needs to be
> taken into account.

Yes, indeed.

>>> The parser function should be able to override itself by template
>>> parameters - I believe it is possible to do this.
>>
>> That makes the hair on my neck stand up :)
> 
> From the user point of view or the implementation point of view? :)

Both. And as Gabriel pointed out, something like this would be really bad for
things like the visual editor, snippet caching, etc.

> The problem is that we are going to pre-load all the data of an item before
> the article renders, right? So now we need to pre-load all the data in all
> the languages.

Well, at the moment, we would be loading the entire item, which includes all the
languages. If we don't do this, item-level caching is going to be a nightmare.

> If there will be no pre-loading, disregard this.

Don't know what you mean by "pre"... the item will be loaded into memory once,
whenever the page is rendered. It would typically be loaded from a local cache
(e.g. in the database); an HTTP request to the repo while rendering would be
annoyingly slow.

>> Except that it's not wrapped in any HTML. Perhaps there should be an option
>> to {{#data-value}} to turn that off completely, using form=plain or some such.
>
> Now shorten #data-value to #data, use form=plain as the default and that's
> it :)

Yes, we can drop the "simple" parameter-like syntax and use parser functions for
everything.

The remaining question is... how do I specify the item I want to get the
property from (if it's not the default)? Will we be assigning local names to
items, using #load-data or some such? Or will we just use the item id directly -
which would probably be a parameter, so we'd end up with something like this:

  {{#property:population|item={{{item-id|*}}}}}

...with * representing the default (the page's "own") item. Not very pretty,
especially not if you have to do it for 20 or 50 properties. So perhaps this is
nicer:

  {{#load-data|thingy|item={{{item-id|*}}}}}
  ...
  {{#property:population|item=thingy}}
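
A rough Python model of the "load once, then refer to it by a local name" idea, to show what the two parser functions would have to keep track of. Everything here is a sketch under assumptions: the `Frame` class, its method names and the item ids and values are invented for illustration and do not describe an existing implementation.

```python
from typing import Dict, Optional

DEFAULT_ITEM = "*"  # the placeholder used above for the page's "own" item


class Frame:
    """One rendering scope that maps local names to item ids (hypothetical)."""

    def __init__(self, own_item_id: str, items: Dict[str, Dict[str, str]]):
        self._own = own_item_id
        self._items = items              # item id -> {property: value}
        self._aliases: Dict[str, str] = {}

    def load_data(self, name: str, item: str = DEFAULT_ITEM) -> None:
        """Model of {{#load-data|name|item=...}}: bind a local name once."""
        self._aliases[name] = self._own if item == DEFAULT_ITEM else item

    def property(self, prop: str, item: Optional[str] = None) -> str:
        """Model of {{#property:prop|item=name}}; no item means the own item."""
        item_id = self._own if item is None else self._aliases[item]
        return self._items[item_id][prop]


# Made-up items and values.
items = {"q1": {"population": "100"}, "q2": {"population": "200"}}
frame = Frame("q1", items)
frame.load_data("thingy")                 # "*": the page's own item
frame.load_data("other", item="q2")       # an explicitly chosen item
```

After the two `load_data` calls, `frame.property("population", item="thingy")` and `frame.property("population")` read the same item, so the item id only has to be spelled out once per template rather than once per property.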

> I would try not to introduce new syntax if it is not necessary. How about 
> this:
> 
> [[wikidata:Berlin]] - links to en.wikidata.org/wiki/Berlin
> [[wikidataid:q1234]] - links to en.wikidata.org/id/q1234
> {{canonicalurl:wikidata:Berlin|action=edit}} - links to edit page
> 
> All of this syntax already exists, is widely used and could be introduced
> without additional coding :)

You'd have to do [[wikidata:id/{{#property:id}}|the data item]].
That would be possible, I guess, and it would go to the correct table (iwlinks).

But this doesn't work for edit links. Especially not if the edit link is
supposed to invoke the on-site ajax editing interface. How do you generate a
link/button for doing that?

-- daniel

-- 
Daniel Kinzler, Softwarearchitekt

Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
http://wikimedia.de  | Tel. (030) 219 158 260

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt
für Körperschaften I Berlin, Steuernummer 27/681/51985.



Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-23 Thread Nikola Smolenski

On 22/05/12 15:49, Daniel Kinzler wrote:

On 22.05.2012 15:15, Nikola Smolenski wrote:

Rather, I would include a template normally and use a parser function within the
template to access the data.


So, that would be option (2) from above.


So, instead of {{{data}}} there would be {{#data:}}, instead of {{{data.color}}}
there would be {{#data:color}}, instead of {{{data.color(ACME_SURVEY_2010)}}}
there would be {{#data:color|ref=ACME_SURVEY_2010}} and so on.


How would you specify which item {{#data:}} refers to, in case it's not the
article's "default" item? How would you pass multiple items to a template?


I see multiple possibilities:

{{#data:color|item=Blah}} - this uses item linked to "Blah" in the local 
language.
{{#data:color|item=en:Blah}} - this uses item linked to "Blah" in 
English language.

{{#data:color|id=q123}} - this uses item with ID q123.
{{#data:Blah->color}} - we can do this since -> can't appear in a page 
name - this is my favorite of course :)
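
To make the four addressing forms concrete, here is a minimal Python sketch that turns each of them into the same small record. It is an illustration only: `ItemRef`, `parse_data_call` and the rule that a two- or three-letter prefix before a colon is a language code are assumptions of this sketch, not an agreed design (a real implementation would have to deal with namespaces and titles containing colons).

```python
import re
from typing import NamedTuple, Optional


class ItemRef(NamedTuple):
    """Where a {{#data:...}} call should look for its property (hypothetical)."""
    prop: str
    kind: str               # "default", "page" or "id"
    target: Optional[str]   # page title or item id; None for the page's own item
    lang: Optional[str]     # language prefix for page lookups; None = local


def parse_data_call(call: str) -> ItemRef:
    match = re.fullmatch(r"\{\{#data:(.*)\}\}", call.strip())
    if match is None:
        raise ValueError(f"not a #data call: {call!r}")
    head, *args = match.group(1).split("|")

    # {{#data:Blah->color}}: page title and property in one argument.
    if "->" in head:
        page, prop = head.split("->", 1)
        return ItemRef(prop.strip(), "page", page.strip(), None)

    named = dict(arg.split("=", 1) for arg in args if "=" in arg)
    if "id" in named:                      # {{#data:color|id=q123}}
        return ItemRef(head, "id", named["id"], None)
    if "item" in named:                    # {{#data:color|item=[lang:]Title}}
        prefix, sep, title = named["item"].partition(":")
        if sep and re.fullmatch(r"[a-z]{2,3}", prefix):
            return ItemRef(head, "page", title, prefix)
        return ItemRef(head, "page", named["item"], None)
    return ItemRef(head, "default", None, None)   # the page's own item
```

All four spellings end up as "property plus a way to find the item", which suggests the choice between them is mostly about readability for template authors.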


Every template or article could read any item without the need to pass it.


If this is needed, one can always use the {{#data-value}} function with
source=ACME_SURVEY_2010 (perhaps 'ref' is better than 'source').


By the way, in some cases a single assertion might have multiple
sources; also, a single source might support multiple assertions. This
needs to be taken into account.



The parser function should be able to override itself by template parameters - I
believe it is possible to do this.


That makes the hair on my neck stand up :)


From the user point of view or the implementation point of view? :)


I see that there is need to also select desired content language (for example, a
lot of infoboxes display name of the topic in the content language and in the
topic's language(s)). This has the potential to introduce additional problems,
of course.


The parameter syntax is the simple way to access property values. For all fancy
needs, like picking the language, use {{#data-value}} instead. Basically, the
{{{data.foo}}} syntax is just a shorthand for the more powerful {{#data-value}}
stuff.


The problem is that we are going to pre-load all the data of an item 
before the article renders, right? So now we need to pre-load all the 
data in all the languages.


If there will be no pre-loading, disregard this.


Except that it's not wrapped in any HTML. Perhaps there should be an option to
{{#data-value}} to turn that off completely, using form=plain or some such.


Now shorten #data-value to #data, use form=plain as the default and 
that's it :)



 {{#data-link:data|the data item}}

Why not the usual interwiki syntax of [[wikidata:data|the data item]]?


Because "data" is not an identifier for that item on the wikidata repo. You
would have to use something like:

[[wikidata:/id/{{{data.id}}}|the data item]]

But constructing edit links this way is very cumbersome. The main intent of
this function is to make it easy to provide an edit link in the infobox.

But perhaps a different syntax should be used for this. After all, the edit link
would, with javascript enabled, not really be a link, but a button to activate
the on-site editing feature.


I would try not to introduce new syntax if it is not necessary. How 
about this:


[[wikidata:Berlin]] - links to en.wikidata.org/wiki/Berlin
[[wikidataid:q1234]] - links to en.wikidata.org/id/q1234
{{canonicalurl:wikidata:Berlin|action=edit}} - links to edit page

All of this syntax already exists, is widely used and could be 
introduced without additional coding :)






Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Gabriel Wicke
On 05/22/2012 03:49 PM, Daniel Kinzler wrote:
> On 22.05.2012 15:15, Nikola Smolenski wrote:
> 2) alternatively, don't pass the item from the article to the template at all.
> Instead, use parser functions for every access to an item property, and
> specify the item id as a parameter if necessary. That would create a lot of
> redundancy and wouldn't allow "normal" template parameter syntax to be used to
> access data properties.

You could still pass the data item to the template:

{{Infobox|di=q556677}}

and inside Template:Infobox:

{{#data:{{{di}}}|color}}

This is admittedly slightly more verbose.
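
The two-step evaluation this relies on (first substitute the template argument, then evaluate the parser function) can be mimicked with a toy Python expander. This is a sketch under assumptions: the function names, the regular expressions and the `ITEMS` table with its "green" value are made up for illustration, and the real MediaWiki preprocessor is far more involved.

```python
import re
from typing import Dict

ITEMS = {"q556677": {"color": "green"}}  # made-up repository content


def expand_params(wikitext: str, params: Dict[str, str]) -> str:
    """Step 1: replace {{{name}}} with the argument passed to the template."""
    return re.sub(r"\{\{\{(\w+)\}\}\}", lambda m: params[m.group(1)], wikitext)


def eval_data_calls(wikitext: str) -> str:
    """Step 2: replace {{#data:ITEM|PROP}} with the stored value."""
    return re.sub(
        r"\{\{#data:([^|{}]+)\|([^|{}]+)\}\}",
        lambda m: ITEMS[m.group(1)][m.group(2)],
        wikitext,
    )


# Body of Template:Infobox, called as {{Infobox|di=q556677}}.
template_body = "Color: {{#data:{{{di}}}|color}}"
rendered = eval_data_calls(expand_params(template_body, {"di": "q556677"}))
print(rendered)
```

Nothing in the second step depends on hidden page state: the item id is visible in the call itself, which is what makes this form friendly to alternate parsers and fragment caching.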

On the other hand, it would also work in articles, and supports multiple
items in a single template. More importantly (to me), it is familiar to
current editors, compatible with alternate parsers (e.g. Parsoid) or a
generic Lua -> parser function API and makes it easier to implement
generic fragment caching.

Overall I'd prefer any magic-free and forward-compatible solution even
if it comes at the cost of small syntactic inconveniences.

Gabriel




Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Daniel Kinzler
On 22.05.2012 18:11, Helder Wiki wrote:
> On Tue, May 22, 2012 at 10:49 AM, Daniel Kinzler
>  wrote:
>> For the most part, I tried to keep the normal syntax for template parameters,
>> just introducing structured names (dots and colons). The syntax for
>> selection-by-reference-id is indeed a bit awkward. Perhaps we can just drop it.
>> If this is needed, one can always use the {{#data-value}} function with
>> source=ACME_SURVEY_2010 (perhaps 'ref' is better than 'source').
> 
> Whatever the choice is, wouldn't it need to be translated to other languages?

Yes, insofar as MediaWiki supports the localization of parameter names for
parser functions and tag extensions.

> Here it seems appropriate to ask the same as Jérémie:
> On Tue, May 22, 2012 at 10:08 AM, Jérémie Roquet  wrote:
>> v. What about lua? :)

There will be a Lua interface for calling parser functions. Using this, all
functionality is covered. Once we know more about the Lua bindings, we can come
up with a nicer interface.

-- daniel




Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Helder Wiki
On Tue, May 22, 2012 at 10:49 AM, Daniel Kinzler
 wrote:
> For the most part, I tried to keep the normal syntax for template parameters,
> just introducing structured names (dots and colons). The syntax for
> selection-by-reference-id is indeed a bit awkward. Perhaps we can just drop it.
> If this is needed, one can always use the {{#data-value}} function with
> source=ACME_SURVEY_2010 (perhaps 'ref' is better than 'source').

Whatever the choice is, wouldn't it need to be translated to other languages?

>> form
>>     Specifies in what form the value is rendered, that is, in which HTML
>> element it should be wrapped.
>>
>>         span: wrap in  tags, use  tags for parts
>>         div: wrap in  tags, use  tags for parts
>>         li: wrap in  tags, use  tags for parts
>>
>> I don't like this at all, since it limits the number of possibilities,
>> introduces yet another syntax parallel to HTML.
>>
>> Yet, I don't see anything much better. A half-baked idea is to leave it to
>> the client wikis to create their data display templates that could be used
>> to format data appropriately.
>
> Yes, this is kind of a nasty compromise.
>
> On the one hand, it would be extremely cumbersome to have to re-implement the
> table (and other) structures for representing property values with various
> qualifiers, not to speak of the rendering logic for these qualifiers themselves.
>
> On the other hand, we want to allow template authors to integrate property
> values in different kinds of structures, like tables and lists.
>
> Note that with the help of the show=... parameters, template authors can still
> implement their own rendering of the value (or rather, statement) with all its
> parts, using one {{#data-value}} call for each one.

Here it seems appropriate to ask the same as Jérémie:
On Tue, May 22, 2012 at 10:08 AM, Jérémie Roquet  wrote:
> v. What about lua? :)

Best regards,
Helder



Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Daniel Kinzler
On 22.05.2012 15:47, Jeroen De Dauw wrote:
> Hey,
> 
> Great writeup, after doing a quick read, I agree with most stuff, but have
> some minor remarks:
> 
>>  {{{data.color(ACME_SURVEY_2010)}}}
> 
> This suggests that we do not want property names with brackets in them.
> Sometimes an item might have 2 distinct properties with the same name (I can't
> think of any example right now, but I think this does occur), in which case you
> need to add some extra stuff in their name to distinguish them. WP often uses
> brackets for this when it happens with page names, so it does not seem too
> far-fetched that people would want to use that here as well.

Yea, as I said in my reply to Nikola: it's probably best to just drop that 
syntax.

>> {{#data-link:|action=edit|the data item}}
> 
> I really don't like providing an empty value to have it use the default.
> Should be possible to just omit the parameter altogether. Also, I think it's
> nice to have the arguments be order independent. So using parameter names for
> everything except the identifier might be good. Right now it's for example
> impossible to have action=edit or similar at its start.

Well, that would mean that the link text can't be a positional parameter, so
we'd have to use {{#data-link:action=edit|text=the data item}}. A bit un-pretty,
but then, this stuff will only show up in templates anyway.
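
A small Python sketch of what "named parameters only" buys: once nothing is positional, the order of the arguments stops mattering. The helper `parse_named_args` and its error handling are assumptions for illustration, not a description of how MediaWiki parses parser-function arguments.

```python
from typing import Dict


def parse_named_args(call: str, function: str = "#data-link") -> Dict[str, str]:
    """Parse {{#function:name=value|name=value}} into a dict of named args."""
    prefix = "{{" + function + ":"
    if not (call.startswith(prefix) and call.endswith("}}")):
        raise ValueError(f"not a {function} call: {call!r}")
    args: Dict[str, str] = {}
    for part in call[len(prefix):-2].split("|"):
        if not part.strip():
            continue  # nothing given at all: every parameter uses its default
        name, sep, value = part.partition("=")
        if not sep:
            # In this sketch positional arguments are rejected outright.
            raise ValueError(f"positional argument not allowed: {part!r}")
        args[name.strip()] = value.strip()
    return args


a = parse_named_args("{{#data-link:action=edit|text=the data item}}")
b = parse_named_args("{{#data-link:text=the data item|action=edit}}")
```

Both calls yield the same dictionary, and `{{#data-link:}}` with no arguments yields an empty one, so omitting a parameter and using its default are the same thing.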

So yea, agreed.

-- daniel




Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Daniel Kinzler
On 22.05.2012 15:15, Nikola Smolenski wrote:
> Including Items in an Article:
> 
> {{#data-template:Infobox
> |data_item=q332211
> |data_param=stuff
> |foo=some value
> |stuff.color=green
> }}
> 
> I see no reason for creating a new parser function for inclusion of templates.

Basically, we need a way to pass a complex data object as a parameter to a
template. If we don't have that, we have two alternative choices:

1) assign per-page variables to items, e.g. make {{{thingy}}} available in the
article after a call to {{#load-data|thingy}}. Then you can use
{{Infobox|data={{{thingy}}}}}, as usual. But I think this "global variables"
approach is ugly and confusing, at least when stuff is loaded from inside
templates.

2) alternatively, don't pass the item from the article to the template at all.
Instead, use parser functions for every access to an item property, and specify
the item id as a parameter if necessary. That would create a lot of redundancy
and wouldn't allow "normal" template parameter syntax to be used to access data
properties.

To me, this seems natural, because it behaves like a function call. Being able
to pass complex objects as parameters directly would of course be better still.

Actually, if the template in question wants to pass the data object on to
another template, we kind of need this...

> Also, this syntax would not allow a template to draw data from more than one
> item

Correct. This could be simple enough to overcome, though I can't think of a
*nice* way off hand. Off the top of my head, I'd do something like this:

 ...
 |data_item_2=q556677
 |data_param_2=other_stuff
 |data_item_3=q114488
 |data_param_3=more_stuff
 ...

or even:

 |data_item=q332211 as stuff
 |data_item_2=q556677 as other_stuff
 |data_item_3=q114488 as more_stuff


> which would probably be a requirement in phase3.

No, it's not. phase3 is about automatic listings, which require a different
syntax anyway. Also, in a list, all items would use the same template and the
same rendering options. No need for the ability to pass multiple items to a
single template.

> Rather, I would include a template normally and use a parser function within
> the template to access the data.

So, that would be option (2) from above.

> So, instead of {{{data}}} there would be {{#data:}}, instead of
> {{{data.color}}} there would be {{#data:color}}, instead of
> {{{data.color(ACME_SURVEY_2010)}}} there would be
> {{#data:color|ref=ACME_SURVEY_2010}} and so on.

How would you specify which item {{#data:}} refers to, in case it's not the
article's "default" item? How would you pass multiple items to a template?

> One advantage is that commonly used syntax is always used, instead of
> inventing new syntax (as for the reference in this example).

For the most part, I tried to keep the normal syntax for template parameters,
just introducing structured names (dots and colons). The syntax for
selection-by-reference-id is indeed a bit awkward. Perhaps we can just drop it.
If this is needed, one can always use the {{#data-value}} function with
source=ACME_SURVEY_2010 (perhaps 'ref' is better than 'source').

> Another advantage, this way would make data usable directly in article text,
> if that is wanted.

Yes, that's also something that is bothering me. I would propose for this case
to implement option (1) from above: use {{#load-data|stuff}} to make the item
available as {{{stuff}}} in the current scope (preprocessor frame).

That's kind of like saying "pass the item to me" instead of "pass the item to
the template".

> The parser function should be able to override itself by template parameters
> - I believe it is possible to do this.

That makes the hair on my neck stand up :)

> Unrelated to the above,
> 
> "This will return the value of the color property, in the page's content
> language, as plain text."
> 
> I see that there is need to also select desired content language (for
> example, a lot of infoboxes display name of the topic in the content language
> and in the topic's language(s)). This has the potential to introduce
> additional problems, of course.

The parameter syntax is the simple way to access property values. For all fancy
needs, like picking the language, use {{#data-value}} instead. Basically, the
{{{data.foo}}} syntax is just a shorthand for the more powerful {{#data-value}}
stuff.

Except that it's not wrapped in any HTML. Perhaps there should be an option to
{{#data-value}} to turn that off completely, using form=plain or some such.

> form
> Specifies in what form the value is rendered, that is, in which HTML element
> it should be wrapped.
> 
> span: wrap in  tags, use  tags for parts
> div: wrap in  tags, use  tags for parts
> li: wrap in  tags, use  tags for parts
> 
> I don't like this at all, since it limits the number of possibilities,
> introduces yet another syntax parallel to HTML.
> 
> Yet, I don't see any

Re: [Wikidata-l] [wikidata-intern] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)

2012-05-22 Thread Jeroen De Dauw
Hey,

Great writeup, after doing a quick read, I agree with most stuff, but have
some minor remarks:

>  {{{data.color(ACME_SURVEY_2010)}}}

This suggests that we do not want property names with brackets in them.
Sometimes an item might have 2 distinct properties with the same name (I
can't think of any example right now, but I think this does occur), in which
case you need to add some extra stuff in their name to distinguish them. WP
often uses brackets for this when it happens with page names, so it does
not seem too far-fetched that people would want to use that here as well.

> {{#data-link:|action=edit|the data item}}

I really don't like providing an empty value to have it use the default.
Should be possible to just omit the parameter altogether. Also, I think
it's nice to have the arguments be order independent. So using parameter
names for everything except the identifier might be good. Right now it's
for example impossible to have action=edit or similar at its start.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--