Re: [Wikitech-l] Now live: Shared structured data

2016-12-28 Thread Yuri Astrakhan
The 400-char limit is there to stay in sync with Wikidata, which has the same
limitation. The origin of this limit is to encourage storage of "values"
rather than full strings (sentences). It also discourages storage of wiki
markup.
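
To make the two limits discussed in this thread concrete, here is a minimal sketch of a pre-upload check. It is not the validation Commons itself runs: the function names are hypothetical, counting the 400 limit in characters and the 2 MB limit in UTF-8 bytes of the serialized JSON is an assumption, and so is reading "2mb" as 2 * 1024 * 1024 bytes.

```python
import json

MAX_STRING_CHARS = 400             # per-string limit mentioned in the thread
MAX_TOTAL_BYTES = 2 * 1024 * 1024  # "2mb"; the exact byte count is an assumption


def iter_strings(value):
    """Yield every string value found inside a nested JSON-like structure.

    Dictionary keys are not checked, only values.
    """
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def check_limits(dataset):
    """Return a list of problems; an empty list means no limit was exceeded."""
    problems = []
    for text in iter_strings(dataset):
        if len(text) > MAX_STRING_CHARS:
            problems.append(
                f"string of {len(text)} chars exceeds {MAX_STRING_CHARS}: {text[:30]!r}..."
            )
    # Size is measured on the UTF-8 encoded JSON serialization (an assumption).
    size = len(json.dumps(dataset, ensure_ascii=False).encode("utf-8"))
    if size > MAX_TOTAL_BYTES:
        problems.append(f"total size of {size} bytes exceeds {MAX_TOTAL_BYTES}")
    return problems
```

Calling check_limits() on a parsed dataset before saving it would flag over-long definitions early, which is where a glossary with translated definitions is most likely to run into trouble.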

On Wed, Dec 28, 2016, 16:45 mathieu stumpf guntz <
psychosl...@culture-libre.org> wrote:

> Thank you Yuri. Is there some rationale behind these limits? I
> understand a limit motivated by performance concerns, and 2 MB already
> seems very large for the intended glossaries. But 400 chars might be
> problematic for some definitions, I guess, especially since translations
> can lead to varying length needs.
>
>
> On 25/12/2016 at 17:03, Yuri Astrakhan wrote:
> > Hi Mathieu, yes, I think you can totally build up this glossary in a
> > dataset. Just remember that each string can be no longer than 400 chars,
> > and the total size must be under 2 MB.
> >
> > On Sun, Dec 25, 2016, 10:45 mathieu stumpf guntz <
> > psychosl...@culture-libre.org> wrote:
> >
> >> Hi Yuri,
> >>
> >> Seems very interesting. Am I wrong in thinking this could help to create
> >> a multilingual glossary as drafted in
> >> https://phabricator.wikimedia.org/T150263#2860014 ?
> >>
> >>
> >> On 22/12/2016 at 20:30, Yuri Astrakhan wrote:
> >>> Gift season! We have launched structured data on Commons, available from
> >>> all wikis.
> >>>
> >>> TL;DR: One data store. Use everywhere. Upload table data to Commons, with
> >>> localization, and use it to create wiki tables, lists, or use directly in
> >>> graphs. Works for GeoJSON maps too. Must be licensed as CC0. Try this
> >>> per-state GDP map demo, and select multiple years. More demos at the bottom.
> >>> US Map state highlight
> >>> 
> >>>
> >>> Data can now be stored as *.tab and *.map pages in the data namespace on
> >>> Commons. That data may contain localization, so a table cell could be in
> >>> multiple languages. And that data is accessible from any wiki, by Lua
> >>> scripts, Graphs, and Maps.
> >>>
> >>> Lua lets you generate wiki tables from the data by filtering, converting,
> >>> mixing, and formatting the raw data. Lua also lets you generate lists. Or
> >>> any wiki markup.
> >>>
> >>> Graphs can use both .tab and .map directly to visualize the data and let
> >>> users interact with it. The GDP demo above uses a map from Commons, and
> >>> colors each segment with the data based on a data table.
> >>>
> >>> Kartographer (/) can use the .map data as an extra layer
> >>> on top of the base map. This way we can show endangered species' habitat.
> >>>
> >>> == Demo ==
> >>> * Raw data example
> >>> 
> >>> * Interactive Weather data
> >>> 
> >>> * Same data in Weather template
> >>> 
> >>> * Interactive GDP map
> >>> 
> >>> * Endangered Jemez Mountains salamander - habitat
> >>> 
> >>> * Population history
> >>> 
> >>> * Line chart 
> >>>
> >>> == Getting started ==
> >>> * Try creating a page at data:Sandbox/.tab on Commons. Don't forget
> >>> the .tab extension, or it won't work.
> >>> * Try using some data with the Line chart graph template
> >>> A thorough guide is needed, help is welcome!
> >>>
> >>> == Documentation links ==
> >>> * Tabular help 
> >>> * Map help 
> >>> If you find a bug, create a Phabricator ticket with the #tabular-data
> >>> tag, or comment on the documentation talk pages.
> >>>
> >>> == FAQ ==
> >>> * Relation to Wikidata: Wikidata is about "facts" (small pieces of
> >>> information). Structured data is about "blobs" - large amounts of data like
> >>> the historical weather or the outline of the state of New York.
> >>>
> >>> == TODOs ==
> >>> * Add a nice "table editor" - editing JSON by hand is cruel. T134618
> >>> * "What links here" should track data usage across wikis. Will allow
> >>> quicker auto-refresh of the pages too. T153966
> >>> * Support data redirects. T153598
> >>> * Mega epic: Support external data feeds.
> >>> ___
> >>> Wikitech-l mailing list
> >>> Wikitech-l@lists.wikimedia.org
> >>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Now live: Shared structured data

2016-12-28 Thread mathieu stumpf guntz
Thank you Yuri. Is there some rationale behind these limits? I
understand a limit motivated by performance concerns, and 2 MB already
seems very large for the intended glossaries. But 400 chars might be
problematic for some definitions, I guess, especially since translations
can lead to varying length needs.



On 25/12/2016 at 17:03, Yuri Astrakhan wrote:

Hi Mathieu, yes, I think you can totally build up this glossary in a
dataset. Just remember that each string can be no longer than 400 chars,
and the total size must be under 2 MB.

On Sun, Dec 25, 2016, 10:45 mathieu stumpf guntz <
psychosl...@culture-libre.org> wrote:


Hi Yuri,

Seems very interesting. Am I wrong in thinking this could help to create
a multilingual glossary as drafted in
https://phabricator.wikimedia.org/T150263#2860014 ?


On 22/12/2016 at 20:30, Yuri Astrakhan wrote:

Gift season! We have launched structured data on Commons, available from
all wikis.

TL;DR: One data store. Use everywhere. Upload table data to Commons, with
localization, and use it to create wiki tables, lists, or use directly in
graphs. Works for GeoJSON maps too. Must be licensed as CC0. Try this
per-state GDP map demo, and select multiple years. More demos at the bottom.

US Map state highlight


Data can now be stored as *.tab and *.map pages in the data namespace on
Commons. That data may contain localization, so a table cell could be in
multiple languages. And that data is accessible from any wiki, by Lua
scripts, Graphs, and Maps.
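
As an illustration of a table cell that exists in several languages, here is a small sketch of what the JSON behind a glossary-style .tab page might look like, built as a Python dictionary. The field names ("license", "description", "schema", "fields", "data") and the "localized" column type are assumptions based on my reading of the tabular data help page and should be checked against it; the glossary entry and the localized_value helper are made up for the example.

```python
import json

# Hypothetical glossary dataset; the key names and the "localized" type are
# assumptions, not a confirmed schema.
glossary = {
    "license": "CC0-1.0",
    "description": {"en": "Example glossary"},
    "schema": {
        "fields": [
            {"name": "term", "type": "string"},
            {"name": "definition", "type": "localized"},
        ]
    },
    "data": [
        [
            "dataset",
            {
                "en": "A collection of structured values.",
                "fr": "Un ensemble de valeurs structurées.",
            },
        ],
    ],
}


def localized_value(cell, lang, fallback="en"):
    """Pick the text for `lang` from a localized cell, else the fallback language."""
    return cell.get(lang, cell.get(fallback))


# Serialize the way the page content would be stored, keeping non-ASCII text readable.
print(json.dumps(glossary, ensure_ascii=False, indent=2))
```

A consumer asking for a language that has no translation yet would get the fallback text from localized_value, which is one simple way to handle partially translated glossaries.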

Lua lets you generate wiki tables from the data by filtering, converting,
mixing, and formatting the raw data. Lua also lets you generate lists. Or
any wiki markup.
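
The filter-then-format step described above can be sketched as follows. On a wiki this logic would live in a Lua module; Python is used here only to illustrate the transformation, and the function name, the keep parameter, and the sample rows are hypothetical placeholders rather than real data.

```python
def to_wikitable(headers, rows, keep=lambda row: True):
    """Render rows as wikitable markup, skipping rows for which keep(row) is false."""
    lines = ['{| class="wikitable"']
    lines.append("! " + " !! ".join(headers))
    for row in rows:
        if not keep(row):
            continue
        lines.append("|-")  # row separator in wikitable syntax
        lines.append("| " + " || ".join(str(cell) for cell in row))
    lines.append("|}")
    return "\n".join(lines)


# Placeholder values, not real figures.
sample_rows = [["A", 1], ["B", 5]]
print(to_wikitable(["State", "Value"], sample_rows, keep=lambda r: r[1] > 2))
```

The same row list could just as easily be formatted as a bulleted list or any other markup by swapping the formatting lines.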

Graphs can use both .tab and .map directly to visualize the data and let
users interact with it. The GDP demo above uses a map from Commons, and
colors each segment with the data based on a data table.

Kartographer (/) can use the .map data as an extra layer
on top of the base map. This way we can show endangered species' habitat.

== Demo ==
* Raw data example

* Interactive Weather data

* Same data in Weather template

* Interactive GDP map

* Endangered Jemez Mountains salamander - habitat

* Population history

* Line chart 

== Getting started ==
* Try creating a page at data:Sandbox/.tab on Commons. Don't forget
the .tab extension, or it won't work.
* Try using some data with the Line chart graph template
A thorough guide is needed, help is welcome!

== Documentation links ==
* Tabular help 
* Map help 
If you find a bug, create a Phabricator ticket with the #tabular-data tag, or
comment on the documentation talk pages.

== FAQ ==
* Relation to Wikidata: Wikidata is about "facts" (small pieces of
information). Structured data is about "blobs" - large amounts of data like
the historical weather or the outline of the state of New York.

== TODOs ==
* Add a nice "table editor" - editing JSON by hand is cruel. T134618
* "What links here" should track data usage across wikis. Will allow
quicker auto-refresh of the pages too. T153966
* Support data redirects. T153598
* Mega epic: Support external data feeds.
