New task https://phabricator.wikimedia.org/T195435

On Thu, May 24, 2018 at 9:38 AM, John Mark Vandenberg <[email protected]> wrote:
> cliff notes: Wikidata has a new entity type which is quite different
> from existing entity types.
>
> https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model
>
> ---------- Forwarded message ----------
> From: Léa Lacroix <[email protected]>
> Date: Wed, May 23, 2018 at 7:33 PM
> Subject: [Wikidata] First experiment of lexicographical data is out
> To: "Discussion list for the Wikidata project." <[email protected]>
>
>
> Hello all,
>
> After several years of discussion, and one year of development and
> discussion with the communities, the development team has now
> released the first version of lexicographical data support on
> Wikidata.
>
> Since the start of Wikidata in 2012, the multilingual knowledge base
> was mainly focused on concepts: Q-items are related to a thing or an
> idea, not to the word describing it. Starting now, Wikidata stores a
> new type of data: words, phrases and sentences, in many languages,
> described in many languages. This information will be stored in new
> types of entities, called Lexemes, Forms and Senses. It will allow
> editors to describe precisely all words in all languages, and will be
> reusable, just like the whole content of Wikidata, by multiple tools
> and queries, everything that the community creates to play with words.
> Lexicographical data can be reused inside and outside the Wikimedia
> projects, and can provide support for Wiktionary.
>
> The first release
>
> A new namespace and several new entity types have been created in
> order to model words and phrases. If you’re new to this project, you
> can learn more by looking at the documentation, which briefly
> describes the data model and the interface. The technical structure
> is set, but the editors remain free to model and organize data as
> they prefer, with the usual open discussions and community processes
> that we apply on Wikidata. Some discussions about new properties to
> create have already started: if you want to be involved in the early
> stage of the project to shape it, please participate!
>
> Please note that the version that is now deployed is a first
> experiment that will be continuously improved in the future. Some
> features are missing, and bugs are likely to occur. Here are the
> features that are included in the first release:
>
> - Add, edit and delete Lexemes, Forms, statements, qualifiers,
>   references
> - Link between the different entity types (Item to Lexeme, Form to
>   Item, etc.)
> - Entity suggestion when adding a property or a value
>
> And the following features will not be included in the first version,
> but are planned for the future:
>
> - Find Lexemes and Forms via Special:Search
> - RDF support (which also means: the ability to query it with
>   query.wikidata.org)
> - Support for Senses
> - Merging of Lexemes
> - Including the data on other Wikimedia projects, such as Wiktionary
>
> How to try it?
>
> The features described above are now deployed on Wikidata.org. Here
> are some suggestions of what you can do to explore this new territory:
>
> - If you’re not familiar with the structure of Lexemes, have a look
>   at the documentation
> - Look at what already exists. Please note that Special:Search and
>   the search bar in the top right corner of pages do not support
>   Lexemes yet. We’re working on this.
> - Create a new Lexeme with Special:NewLexeme
> - If a property that you need is missing, you can suggest it here
> - Discuss how to model words and ask questions on Wikidata
>   talk:Lexicographical data
> - Report bugs or issues that you may encounter: either on the talk
>   page or on Phabricator, if you’re comfortable using it (create a
>   task, add the tag Lexicographical data, and add Lea_Lacroix_(WMDE)
>   as a subscriber)
>
> About mass imports and tools
>
> We kindly ask you not to plan any mass import from any source for the
> moment. There are several reasons behind that. First of all, as
> mentioned above, the release is a first version and we need to observe
> how our system reacts to manual edits before we start considering
> automatic ones. The system may not be ready for massive imports at
> the beginning. The second reason is legal. Lexicographical data in
> Wikidata is released under CC0, and it is each editor's
> responsibility to make sure that the data they add is compatible with
> CC0. For more information, you can have a look at the advice of the
> WMF Legal team. Finally, we strongly encourage you to discuss with the
> communities before considering any import from the Wiktionaries.
> Wiktionary editors have put a lot of effort over the years into
> building definitions, and we should be respectful of this work, and
> discuss with them to find common solutions to work on lexicographical
> data and enjoy the use of it together.
>
> We also suggest that you wait a bit before building tools or scripts
> on top of lexicographical data. The interface and its API are
> probably going to evolve during the next months, and the system may
> not be stable enough to support such tools. We will inform you as soon
> as it is possible.
>
> Next steps
>
> After this first release, improvements will be made on a very
> regular basis (new deployments every week). Once you have tried
> playing with the new data, feel free to give us feedback. We
> especially want to know which features are most important for you to
> be worked on next.
>
> - What did you experience while editing lexicographical data? What
>   went wrong or was unexpected?
> - What bugs or troubles did you encounter during the process?
> - What are the features that are, in your opinion, the most
>   important? Which one should we work on next?
>
> If you’re interested in following the discussions and further
> announcements about lexicographical data, I encourage you to follow
> Wikidata:Lexicographical data and its talk page, where we will discuss
> how to organize and structure data, new features to be added, ideas
> of tools and queries, and a lot of other things.
>
> Additional note: with this new kind of data enabled on Wikidata, we
> expect some new editors to take an interest in it, edit Lexemes,
> suggest properties or ask questions. They may not be familiar with all
> of our community processes and our ways to organize content. They will
> need help and support as well as links to useful resources to
> understand how the Wikidata community works. I hope that we will all
> be kind and patient, both with other editors and with the software
> that may not work exactly as we want it to at the beginning :)
>
> Thanks to the people who tested the model and the interface before the
> release, who showed support and curiosity about lexicographical data
> on Wikidata!
>
> If you have any questions or ideas, feel free to write on Wikidata
> talk:Lexicographical data or contact me.
>
> --
> Léa Lacroix
> Project Manager Community Communication for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Registered in the register of associations of the Amtsgericht
> Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable
> by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
>
> _______________________________________________
> Wikidata mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>
>
>
> --
> John Vandenberg

--
John Vandenberg

_______________________________________________
pywikibot mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikibot
