Re: [Wiki-research-l] [Wikitech-l] URL-addressable Predicate Calculus

2018-10-18 Thread Daniel Kinzler
On 18.10.2018 at 06:09, Adam Sobieski wrote:
> Leila,
> 
> I’m hoping to share some new knowledge representation techniques which could 
> be of use to a number of projects for purposes of brainstorming. A number of 
> new projects could be made possible with the new techniques; one could, for 
> instance, envision a “wiki knowledgebase” project where predicate calculus 
> expressions are hyperlinks to wiki experiences for users.

Have you tried applying that technique to the Wikidata Query Service?
<https://query.wikidata.org/>

In particular, your approach seems quite similar to Linked Data Fragments as
described at <http://linkeddatafragments.org/>.

Wikidata's LDF interface can be found at https://query.wikidata.org/bigdata/ldf


-- 
Daniel Kinzler
Principal Software Engineer, Core Platform
Wikimedia Foundation

___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


[Wiki-research-l] Registration open for the Hackathon in Berlin, May 13-15

2011-03-23 Thread Daniel Kinzler
Hi all!

Wikimedia Germany invites anyone interested in improving MediaWiki to come and
join us at our third developer meet-up. Like the last two years, it's going to be
awesome! Unlike the last two years, there will be more hacking and less talking
- it'll be a Hackathon, not a BarCamp.

We'll meet on May 13 to 15, in Berlin, on the 4th floor of the betahaus
coworking space <http://betahaus.de/>.

There will not be an entrance fee, but registration is mandatory and now open:

<http://de.amiando.com/hackathon2011>.

Registration will close on April 10. If you would like to attend, please register in
time!

More information can be found at

<http://www.mediawiki.org/wiki/Berlin_Hackathon_2011>.

The Berlin Hackathon 2011 is an opportunity for MediaWiki hackers to come
together, squash bugs and write crazy new features. Our main focus this time
around will probably be:

* Improving usability / accessibility
* Interactive Maps
* Fixing the parser
* WMF Ops (new data center, virtualization)
* Supporting the Wiki Loves Monuments image hunt
* Squashing bugs

If you have different ideas, please let us know:

<http://www.mediawiki.org/wiki/Berlin_Hackathon_2011#Topics>

The Hackathon will be hosting the Language committee and Wiki loves Monuments
group. There is a limited number of seats reserved for these groups and if you
belong to one of them, you should receive an invitation code soon.

If you have any doubts or questions, contact us at .

We're excited to see you in Berlin -- your Hackathon Team

Daniel Kinzler (Program Coordinator)
Nicole Ebber (Logistics)
Cornelius Kibelka (Assistant)





Re: [Wiki-research-l] searching for specific publications

2010-09-30 Thread Daniel Kinzler
Hi Pasko

There's a research bibliography on zotero:
http://www.zotero.org/groups/wikipedia_research/items

sadly, it's not well organized, but there are quite a few gems there.

-- daniel

pa...@irmo.hr wrote:
> Dear fellow researchers,
> 
> I am new to the list so I will first introduce myself. My name is Pasko
> Bilic and I am doing PhD research in sociology on media events and
> Wikipedia editing practices. I am working as a research assistant at the
> Institute for International relations, Department for Culture and
> Communication, Zagreb, Croatia.
> I have already read numerous publications on Wikipedia which are mostly
> interdisciplinary. Currently I am looking for previous publications on
> various metrics in analyzing article content or in analyzing editor
> networks around single Wikipedia articles.
> Preferable research areas include: sociology, communication science,
> social-psychology, computer-mediated communication.
> Any kind of information, link, advice would be most helpful.
> 
> Thank you in advance and with kind regards,
> 
> Pasko
> 
> 
> 




Re: [Wiki-research-l] Demo of RC-Events over XMPP

2010-08-20 Thread Daniel Kinzler
Felipe Ortega wrote:
> Hello.
> 
> Nice work, Daniel.
> 
> The question may be weird, but it's something we have been thinking here for
> a while. Could we use recent changes to recreate the same info one gets from
> full XML dumps? Could it be the starting point for some alternative to the
> current dump process?

In theory, one could process a dump covering revisions up to date X, and then
process recentchanges starting from that date. However, this would require the
RC event to include the full revision text. This is currently not possible, due
to technical limitations (in the case of XMLRC, the limitation is imposed by the
limited size of UDP packets).

This can, however, be simulated, even if not as efficiently: for every RC event,
pull the corresponding revision text from the API. This would be a decent way to
keep up to date, comparable to the deprecated OAI interface.
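The pull-from-the-API workaround can be sketched in a few lines. The event shape (a dict with a "revid" field) and the target wiki URL are assumptions modeled on the MediaWiki API's recentchanges output, not XMLRC's actual event format:

```python
# Sketch: turn one recent-changes event into the api.php query that
# fetches the full revision text. The event dict shape is an assumption.
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"  # hypothetical target wiki


def revision_query_url(rc_event):
    """Build the api.php URL that fetches the full text of the revision
    referenced by a single RC event."""
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": rc_event["revid"],
        "rvprop": "content",
        "format": "xml",
    }
    return API_ENDPOINT + "?" + urlencode(params)
```

A consumer would call this once per incoming event and apply the fetched text to its local copy of the wiki.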

-- daniel



Re: [Wiki-research-l] Web 2.0 recent changes patrol tool demo (WPCVN)

2010-08-20 Thread Daniel Kinzler
Hi Dmitry,

Dmitry Chichkov wrote:
> Some time ago as a Python/Django/JQuery/pywikipedia exercise I've hacked
> a web based recent changes patrol tool. An alpha version can be seen at
> the: http://www.wpcvn.com
> 
> It includes a few interesting features that may be useful to the
> community (& researchers designing similar tools):
> 1. tool uses editors ratings, primarily based on user counters (includes
> reverted revisions counters) calculated using the wiki dump;

Perhaps have a look at the WikiTrust API: 


> WPCVN aggregates recent changes IRC feed, IRC feed from the
> MiszaBot and WPCVN user actions. 

I'm currently prototyping an XMPP based RC feed, which has much more detailed
info, and is more reliable, than the IRC feed:


> It also uses pre-calculated Wikipedia
> users "karma" (based on the recent en-wiki dump analysis) to separate
> edits made by users with clearly good or bad reputation. 

Now *this* definitely sounds like WikiTrust, though I'm not sure if they expose
this info via the API.

-- daniel



[Wiki-research-l] Demo of RC-Events over XMPP

2010-08-19 Thread Daniel Kinzler
(I sent this to a couple of lists already, but I thought it might also be
interesting for the research community.)

Hi all! For a long time I wanted a decent push interface for
RecentChanges-Events, so it becomes easy to follow changes on a wiki. Parsing
messages on IRC is unreliable, and polling the API sucks (and is also
unreliable, see Bug 24782).

So, I have written XMLRC  and set
up a prototype on the Toolserver - have a look at
 for details. Basically,
you point any Jabber client to the chat room
 to see the change events, like on IRC.
However, if you use a client aware of the extra data attached to the messages,
like , you will get
all the information you can get from the API (in fact, you can get the exact
same XML tag).
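Since the payload is the same XML tag the API delivers, consuming it is straightforward. A minimal sketch of parsing one event; the sample element is modeled on the API's recentchanges output, and the exact attribute set delivered by XMLRC is an assumption:

```python
# Sketch: parsing the <rc> element attached to each XMPP message.
# The attribute set shown here is an assumption modeled on api.php output.
import xml.etree.ElementTree as ET

sample = ('<rc type="edit" title="Berlin" revid="42" '
          'user="Example" timestamp="2010-08-19T12:00:00Z"/>')


def parse_rc(xml_text):
    """Extract the interesting fields of one recent-changes event."""
    elem = ET.fromstring(xml_text)
    return {
        "type": elem.get("type"),
        "title": elem.get("title"),
        "revid": int(elem.get("revid")),
        "user": elem.get("user"),
        "timestamp": elem.get("timestamp"),
    }
```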

Try it out and let me know what you think!

-- daniel




[Wiki-research-l] What data would you like Wikimedia to provide?

2010-07-23 Thread Daniel Kinzler
Hi all

At WikiSym, there were two sessions about what data researchers would like to
have for their work:  and
.

From that, Ariel Glenn compiled a wiki page
 and asks researchers
for input about what data they would need.

So, go ahead, add to the wish list!

-- daniel




Re: [Wiki-research-l] [Foundation-l] WikiCite - new WMF project? Was: UPEI's proposal for a "universal citation index"

2010-07-21 Thread Daniel Kinzler
> Hey Daniel,
> 
> Bibsonomy seems to suffer from the same problem as CiteULike - urls
> which convey no meaning. An example url id from CiteULike is 2434335,
> and one from Bibsonomy is 29be860f0bdea4a29fba38ef9e6dd6a09. I hope to
> continue to steer the conversation away from that direction. These IDs
> guarantee uniqueness, but I believe that we can create keys that both
> guarantee uniqueness and convey some meaning to humans. Consider that
> this key will be embedded in wiki articles any time a source is cited.
> It's important that it make some sense.

Oh, I didn't mean we should use hashes or IDs as keys or identifiers in the URL.
I meant we can employ the hashing technique to detect dupes, because you will
inadvertently get information about the same thing under two different keys,
due to issues with transliteration, etc.
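The dupe-detection idea can be sketched simply: normalize the record's fields, then hash the result, so transliteration variants collapse to the same hash. The normalization rules below (lowercase, fold umlauts and ß, strip punctuation) are illustrative assumptions, not any particular project's algorithm:

```python
# Sketch: hash-based duplicate detection for bibliographic records.
# The folding table is an illustrative assumption.
import hashlib
import re

FOLD = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def record_hash(authors, year, title):
    """Return a hash that is stable across common transliteration variants."""
    text = " ".join(authors) + " " + str(year) + " " + title
    text = text.lower()
    for src, dst in FOLD.items():
        text = text.replace(src, dst)
    text = re.sub(r"[^a-z0-9 ]", "", text)   # drop punctuation and leftovers
    text = " ".join(text.split())            # collapse whitespace
    return hashlib.sha1(text.encode("ascii")).hexdigest()
```

Two records entered as "Voß" and "Voss" then hash identically and can be flagged for merging.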

-- daniel



Re: [Wiki-research-l] [Foundation-l] WikiCite - new WMF project? Was: UPEI's proposal for a "universal citation index"

2010-07-21 Thread Daniel Kinzler
Jodi Schneider wrote:
> On 21 Jul 2010, at 09:42, Daniel Kinzler wrote:
>>> Kang+Hsu+Krajbich+2009+the+wick+in
> 
> This seems best to me of what's proposed so far. 
>> Both seem good, though I would suggest forming a convention to ignore any
>> leading "the" and "a", to get a more distinctive three-word suffix.
> 
> While that's a good idea, then we'd have to know all "indistinctive" words in 
> all languages. (Die, Der, La, L', ...)

Stopword lists for major languages exist, and where they don't, they are easily
created, even automatically. Word frequency analysis on a few megabytes of text
is cheap these days :)
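To illustrate how cheap this is: a frequency-based stopword list falls out of a corpus in a few lines. The corpus and cutoff below are toy assumptions; a real list would come from a few megabytes of text per language:

```python
# Sketch: deriving a stopword list automatically from word frequencies.
# The cutoff (top_n most frequent words) is an illustrative assumption.
from collections import Counter


def derive_stopwords(corpus, top_n=3):
    """Treat the top_n most frequent words in the corpus as stopwords."""
    words = corpus.lower().split()
    return {w for w, _ in Counter(words).most_common(top_n)}
```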

-- daniel




Re: [Wiki-research-l] [Foundation-l] WikiCite - new WMF project? Was: UPEI's proposal for a "universal citation index"

2010-07-21 Thread Daniel Kinzler
>> 1) The first three author names separated by slashes
> why not separate by pluses? they don't form part of names either, and
> don't cause problems with wiki page titles.

I like this... however, how would you represent this in a URL? Also note that
using plus signs in page names doesn't work with all server configurations, since
plus has a special meaning in URLs.

>> 3) Some or all of the date. For instance, if there is only one source by
>> this set of authors that year, we can just use the year. However, once another
>> source by that set of authors is added, the key should change to YYYYMMDD
>> or similar.
> I don't think it is a good idea to change one key as a function of
> updates on another, except for a generic disambiguation tag.

I agree. And if you *have* to use the full date, use YYYYMMDD, not the other
way around, please.

>> Since the slashes are somewhat cumbersome, perhaps we can not make them
>> mandatory, but similarly use them only when they are necessary in order to
>> "escape" a name. In the case that one of the authors does not have a slash
>> in their name - the dominant case - we can stick to the easily legible and
>> nicely compact CamelCase format.
>>
>> Example keys generated by this algorithm:
>>
>> KangHsuKrajbichEtAl2009
> Kang+Hsu+Krajbich+2009+the+wick+in
> or
> Kang+Hsu+Krajbich+2009+twi

Both seem good, though I would suggest forming a convention to ignore any
leading "the" and "a", to get a more distinctive three-word suffix.

> Of course, it does not have to be _exactly_ three authors, nor three
> words from the title, and it does not solve the John Smith (or Zheng
> Wang) problem.

It also doesn't solve issues with transliteration: Erik Möller may become
"Moeller" or "Moller", Jakob Voß may become "Voss" or "Vosz" or even "VoB",
etc. In the case of Chinese names, it's often not easy to decide which part is
the last name.

To avoid this kind of ambiguity, I suggest automatically applying some type of
normalization and/or hashing. There is quite a bit of research on this kind of
normalization out there, generally with the aim of detecting duplicates.
Perhaps we can learn from bibsonomy.org; have a look at how they do it:
.

Gotta love open source university research projects :)
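The key scheme discussed in this thread can be sketched end to end: plus-separated author surnames and year, followed by a title suffix that skips stopwords. The stopword list here is an illustrative assumption, not an agreed convention:

```python
# Sketch of the plus-separated citation key discussed above.
# STOPWORDS is an illustrative assumption.
STOPWORDS = {"the", "a", "an", "der", "die", "das", "la", "le"}


def citation_key(surnames, year, title, suffix_len=3):
    """Build a key like Kang+Hsu+Krajbich+2009+wick+in+candle from at most
    three surnames, the year, and the first non-stopword title words."""
    words = [w for w in title.lower().split() if w not in STOPWORDS]
    return "+".join(surnames[:3] + [str(year)] + words[:suffix_len])
```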

-- daniel





Re: [Wiki-research-l] WikiCite - new WMF project? Was: UPEI's proposal for a "universal citation index"

2010-07-20 Thread Daniel Kinzler
Hi all

A central place for managing bibliographic data for use with citations is
something that has been discussed by the German community for a long time. To
me, it consists of two parts: a project for managing the structured data, and a
mechanism for using that data on the wikis.

I have been working on the latter recently, and there's a working prototype: on
  you
can see how data records can be included from external sources. A demo of the
actual on-wiki use can be found at
, where
{{ISBN|0868400467}} is used to show the bibliographic info for that book. (Side
note: the prototype wikis are slow; sorry about that.)

Fetching and showing the data is done using
. Care has been taken
to make this secure and scalable.

For a first demo, I'm using the ISBN as the key, but any kind of key could be
used to reference resources other than books.
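When the key is an ISBN, it can be sanity-checked before any lookup is attempted. This is the standard ISBN-10 checksum; treating hyphens and spaces as ignorable is a simplifying assumption:

```python
# Sketch: validating an ISBN-10 before using it as a lookup key.
# Standard checksum: sum of digits weighted 10..1 must be divisible by 11.
def valid_isbn10(isbn):
    isbn = isbn.replace("-", "").replace(" ", "")
    if len(isbn) != 10:
        return False
    total = 0
    for i, ch in enumerate(isbn):
        if ch in "Xx" and i == 9:
            value = 10            # 'X' stands for 10, last position only
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += (10 - i) * value
    return total % 11 == 0
```

The ISBN from the demo above, 0868400467, passes this check.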

For a demo of managing the data ourselves, I have set up an SMW instance. An
example bib record is at
, it's used across
wikis at
. Note
that changes will show delayed, as the data is cached for a while.


When discussing these things, please keep in mind that there are two components:
fetching and displaying external data records, and managing structured data in a
wiki style. The former is much simpler than the latter. I think we should really
aim at getting both, but we can start off with transcluding external data much
faster if we allow not-so-wiki data sources. For ISBN-based queries, we could
simply fetch information from http://openlibrary.org - or the Open Knowledge
Foundation's http://bibliographica.org, once it's working.

In the context of bibdex, I recommend to also have a look at
http://bibsonomy.org - it's a university research project, open source, and is
quite similar to bibdex (and to what citeulike used to be).

As to managing structured data ourselves: I have talked a lot with Erik Möller
and Markus Krötzsch about this, and I'm in touch with the people who make DBpedia
and OntoWiki. Everyone wants this. But it's not simple at all to get it right
(efficient versioning of multilingual data in a document-oriented database,
anyone? want inference? reasoning, even? yay...). So the plan is currently to
hatch a concrete plan for this. And I imagine that bibliographic and
biographical info will be among the first use cases.

cheers,
daniel




[Wiki-research-l] Technology demos at WikiSym - one week left to submission deadline

2010-03-21 Thread Daniel Kinzler
Hi all

Technology demonstrations are a great way to show off your research and
engineering projects - no research paper is required, just a short description
of what you'd like to present.

The submission deadline for technology demos at WikiSym has been extended until
March 28. For details, see .

So, if you have a pet project you want to share, I invite you to submit it to
WikiSym!

Regards,
Daniel



[Wiki-research-l] Only one week left to register for the Developers' Workshop in April!

2010-03-15 Thread Daniel Kinzler
Hi all!

This is a quick reminder that registration for the Wikimedia Developers'
Workshop (in Berlin, April 14-16) will end on Sunday, March 21. Most places are
already taken, but we have room for 16 more people to attend.

So, if you want to come, sign up now at 
<http://www.amiando.com/WMCON10DEV.html>!

More information about the workshop is available at 
<http://tinyurl.com/wmdev10>.

-- Daniel Kinzler




[Wiki-research-l] Registration open for the Developer Workshop in Berlin!

2010-03-05 Thread Daniel Kinzler
Wikimedia Germany invites all MediaWiki developers, Toolserver users, Gadget
hackers, and other people interested in the technical side of Wikimedia projects
to come to the Developers' Workshop in Berlin on April 14-16. We have a very
nice venue and a cool option for accommodation, details to be announced soon.

Registration is now open at <http://www.amiando.com/WMCON10DEV.html>.
Registration is *required* and will be open until March 21, but there are only
50 places available. So, sign up soon!


For updates and more information, watch
<http://meta.wikimedia.org/wiki/Wikimedia_Conference_2010/Developers%27_Workshop>.
If you have questions, please contact us at .

-- Daniel Kinzler






[Wiki-research-l] DEADLINE for WMDE contract applications

2009-04-16 Thread Daniel Kinzler
Hello all

At the developer meetup, I announced that Wikimedia Deutschland is offering
contracts for a couple of projects we feel are important. We again invite anyone
to apply NOW for any project that interests you.

  The DEADLINE for applying is SUNDAY, APRIL 19!

We did not receive any offer for the most urgent project:  Evaluate the impact
of using flagged revisions on the German Wikipedia, see
.

We feel that it would be very helpful to run a full analysis on this before the
English language Wikipedia decides on how to implement flagged revisions. It's a
powerful tool, and we should make sure we use it to its full potential.

Below, the other projects are listed again:

* Rewrite CatScan, a tool for finding pages in a set of categories recursively,
based on various criteria -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Rewrite_CatScan

* Store interwiki-links in the database, just like we already store
interlanguage-links -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Store_interwiki-links_in_the_database

* Improve the Gadgets extension to allow for gadgets to be enabled per default,
be restricted to specific user groups, etc -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Improve_the_Gadets_extension

* Implement full support for TIFF files, including multi-page TIFFs, similar to
how DjVu is handled -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Implement_full_support_for_TIFF_files
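The CatScan rewrite above boils down to collecting all pages in a category and its subcategories up to a depth limit. A minimal sketch of that traversal; the in-memory graph stands in for database queries and is purely an assumption for illustration:

```python
# Sketch: recursive category search, the core of a CatScan-style tool.
# subcats maps category -> subcategories, pages maps category -> member pages;
# both dicts are stand-ins for the real category-link tables.
def pages_in_category(root, subcats, pages, max_depth=5):
    """Collect all pages in root and its subcategories, up to max_depth."""
    found, seen, frontier = set(), {root}, [(root, 0)]
    while frontier:
        cat, depth = frontier.pop()
        found.update(pages.get(cat, []))
        if depth < max_depth:
            for sub in subcats.get(cat, []):
                if sub not in seen:          # guard against category cycles
                    seen.add(sub)
                    frontier.append((sub, depth + 1))
    return found
```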

If you would like to help with any of the above, please contact me at
 and provide the following information:
* Your real name and country of residence
* How you plan to go about implementing the desired function
* Any experience working with MediaWiki
* How many working hours you would spend on it, and how much you ask for it
* In what time frame you would be able to do the job

This information is also available at
http://www.mediawiki.org/wiki/WMDE_contract_offers

Thank you all for your interest!

-- daniel





[Wiki-research-l] Wikimedia Deutschland offering short term contracts

2009-04-07 Thread Daniel Kinzler
Wikimedia Deutschland is offering contracts for a couple of projects we feel are
important. If you are interested in earning some money by helping Wikimedia to
improve our Wikis, have a look at these projects:

* Evaluate the impact of using flagged revisions on the German Wikipedia -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Evaluate_the_impact_of_using_flagged_revisions

* Rewrite CatScan, a tool for finding pages in a set of categories recursively,
based on various criteria -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Rewrite_CatScan

* Store interwiki-links in the database, just like we already store
interlanguage-links -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Store_interwiki-links_in_the_database

* Improve the Gadgets extension to allow for gadgets to be enabled per default,
be restricted to specific user groups, etc -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Improve_the_Gadets_extension

* Implement full support for TIFF files, including multi-page TIFFs, similar to
how DjVu is handled -
http://www.mediawiki.org/wiki/WMDE_contract_offers/Implement_full_support_for_TIFF_files

If you would like to help with any of the above, please contact me at
 and provide the following information:
* Your real name and country of residence
* How you plan to go about implementing the desired function
* Any experience working with MediaWiki
* How many working hours you would spend on it, and how much you ask for it
* In what time frame you would be able to do the job

This information is also available at
http://www.mediawiki.org/wiki/WMDE_contract_offers

Thanks for helping us make the web a better place!

-- daniel




[Wiki-research-l] developer meet-up is out of room

2009-03-17 Thread Daniel Kinzler
We have been completely overrun by registrations for the developer meet-up in
Berlin. That's exhilarating, but it forces on me the sad duty to tell you: we
are out of room, and we are closing registration early.

So: if you have not yet sent a registration mail, you will not be able to
attend!

Sorry. We may even have to reject some registrations we have already received.

There's some good news too, though: anyone interested may join us at the c-base
for the party on Saturday, April 4, starting at 8pm. The developers will be
there, and people from the chapter and board meeting will also come. This will
be a good opportunity to get to know Wikimedians from all over the world.

Regards,
Daniel




[Wiki-research-l] MediaWiki developer meeting is drawing close

2009-03-06 Thread Daniel Kinzler
The meet-up[1] is drawing close now: between April 3 and 5 we meet at the
c-base[2] in Berlin to discuss MediaWiki development, extensions, toolserver
projects, wiki research, etc. Registration[3] is open until March 20 (required
even if you already pre-registered).

The schedule[4] is slowly becoming clear now: On Friday, we'll start at noon
with a who-is-who-and-does-what session and in the evening there will be an
opportunity to get to know Berlin a bit. On Saturday we have all day for
presentations and discussions, and in the evening we will have a party together
with all the folks from the chapter and board meetings. On Sunday there will be
a wrap-up session and a big lunch for everyone.

We have also organized affordable accommodation: we have reserved rooms in the
Apartmenthaus am Potsdamer Platz[5]. Staying there is a recommended way of
getting to know your fellow Wikimedians!

I'm happy that so many of you have shown interest, and I'm sure we'll have a
great time in Berlin!

Regards,
Daniel

[1] http://www.mediawiki.org/wiki/Project:Developer_meet-up_2009
[2] http://en.wikipedia.org/wiki/C-base
[3] http://www.mediawiki.org/wiki/Project:Developer_meet-up_2009/Registration
[4] http://www.mediawiki.org/wiki/Project:Developer_meet-up_2009#Outline
[5]
http://www.mediawiki.org/wiki/Project:Developer_meet-up_2009#Apartmenthaus_am_Potsdamer_Platz



[Wiki-research-l] MediaWiki developer meet-up in Berlin, April 3-5

2009-01-19 Thread Daniel Kinzler
Hello All

I'm happy to announce that the MediaWiki Developer Meet-Up will happen April 3-5
in Berlin, at the c-base. The event is for everyone who works on MediaWiki, writes
extensions, builds bots, writes scripts for the toolserver, or is otherwise
interested in the technical aspects of Wikimedia. We are happy that we can now
have the meet-up after our plans for 25C3 and FOSDEM failed. If you want to come
to the Developer Meetup, please sign up at
.

The event will take place in parallel to the Wikimedia Foundation's board
meeting and chapter meeting, so there will be a lot of Wikimedians in Berlin at
the time. We plan to have a party to bring everyone together and give an
opportunity for developers, board members and chapter people to mingle.

The meet-up will be a loose BarCamp-like event, so topics and schedule are
largely up to you. The goal is to get to know new aspects of MediaWiki and
Wikimedia and to develop ideas on how we can make things even better. And of
course to have a lot of fun with wiki hackers from around the world!

-- daniel




Re: [Wiki-research-l] A small Commons research project

2009-01-15 Thread Daniel Kinzler
I'm happy to tell you: Commonist is now on Betawiki; translation is in progress.

See http://translatewiki.net/wiki/Translating:Commonist

-- daniel



Re: [Wiki-research-l] A small Commons research project

2009-01-14 Thread Daniel Kinzler
Gerard Meijssen wrote:
> Hoi,
> Is Commonist localised ? Is it usable in a Persian context ?

Ok, I just checked: it does indeed use the standard property file format, and
currently supports en, de, fr and sk. I don't know how well it will handle an RTL
language -- Java/Swing does have support built in, but you have to take care to
consider it when building the UI.

But ask the author about it, he's quite competent, and a nice guy.

-- daniel



Re: [Wiki-research-l] A small Commons research project

2009-01-14 Thread Daniel Kinzler
Gerard Meijssen wrote:
> Hoi,
> Is Commonist localised ? Is it usable in a Persian context ?

Commonist is GPL. It's Java-based, and Java has pretty good Unicode support. As
to localization, I'm not sure, but I think it uses the standard Java mechanism,
that is, "resource bundles" in the form of "property files". The syntax is pretty
simple; it would be nice for Betawiki to support it.
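The format really is simple: one "key=value" pair per line, with "#" or "!" starting a comment. A minimal parser sketch (escapes and continuation lines, which the full Java format allows, are deliberately ignored here):

```python
# Sketch: parsing a Java-style .properties translation file.
# Escape sequences and line continuations are ignored (a simplification).
def parse_properties(text):
    """Return a dict of key -> translated string."""
    messages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue                          # skip blanks and comments
        key, _, value = line.partition("=")
        messages[key.strip()] = value.strip()
    return messages
```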

I haven't checked what languages it supports so far. At least German and English,
I expect.

-- daniel



Re: [Wiki-research-l] "Regular contributor"

2008-11-17 Thread Daniel Kinzler
Desilets, Alain wrote:
> Interesting. So, in summary:
> 
> - Most edits done by a small core
> - But, most of the text created by the long tail
> - However, most of the text that people actually read, was created by
> the small core
> 
> Is that a good summary of what we know about this question?

Oh... that's pretty, I want to show that around! Care to, err, blog it?

-- daniel



Re: [Wiki-research-l] WikiScience announce

2008-10-02 Thread Daniel Kinzler

> I'm right now in Germany, with not very good connectivity, so I'll try to
> expand the main page and sections to provide more information.

If you happen to be in Berlin, come to the C-Base tonight, there's a Wikipedia
party going on :)

- daniel



Re: [Wiki-research-l] WikiScience announce

2008-10-01 Thread Daniel Kinzler
>> To clarify: it's incorrect that Wikiversity is not focused on
>> research; research is one of the core focuses of Wikiversity,
>> complementing learning resources and learning activities. Just like
>> other learning 'institutions', research is seen as an educational
>> activity, and as a driver of further knowledge and resources. It's
>> always been a core focus to get researchers (including, but not
>> limited to, on wikis) together on Wikiversity - to share and discuss
>> each other's work, and to collaborate on further work. However, I
>> admit this isn't so clear from the main page, so I'll edit that now. :-)
>>
>> Conversely, I'd also like a bit more clarity on what WikiScience aims
>> to do, and how it aims to do it. For example, it seems that there is
>> an explicit aim to integrate other tools alongside a wiki in this work
>> - is that right? It would be good, in general, to expand on its scope
>> and vision - as well as an indication of the
>> activity/people/organisation behind it so far. That might help set a
>> foundation for further development and collaboration. ;-)
>>
>> Cheers,
>> Cormac

Ok now... I didn't mean to say that research has no place at Wikiversity - I
was just under the impression that it's not the main focus. But I may be wrong
there.

In my mind, comparing the proposed research wiki to Wikiversity is odd, because
Wikiversity is so much broader in scope. Comparing it to Wikiversity's
"Wikimedia Studies" portal
 makes more sense to
me. And there is indeed some overlap.

Now, I don't know exactly what Filipe has in mind, but from what I understand of
Wikiversity, and what I would like WikiScience to be, here are the main points I
see for WikiScience:

* It would be a place to write about individual projects, wiki engines,
frameworks, etc.
* WikiScience would be a place to collect, share and document individual tools
and libraries.
* WikiScience would offer additional services, such as a dedicated news feed or
a mailing list (though perhaps Wiki-research-l is sufficient, if we replace
"Wikimedia" in the title with "Wiki"). Maybe there could even be a repository for
code and data.
* Bibliographies would of course also be important -- though I also see that for
Wikiversity. Maybe they could be federated somehow? (btw, cooperation with the
openlibrary is currently being pondered)

In short, I imagine WikiScience would be the place for me to go when I want to
find a tool for a specific task, or to discuss experiences with specific
problems with fellow researchers, while Wikiversity is where I would go when I
want to get an overview of the results of research done on wikis, to "learn
about wikis". Also, by nature, Wikiversity focuses on Wikimedia, while
WikiScience would not automatically -- though of course Wikipedia is the most
common research subject.

Anyway... I don't think the effort is redundant, just like Wookieepedia is not
redundant to Portal:Star_Wars. But we should of course be careful not to
fragment community and content. So, let's cooperate and work on making these
things *complementary*.

-- daniel



Re: [Wiki-research-l] WikiScience announce

2008-10-01 Thread Daniel Kinzler
A technical issue:

I created an account, but I'm unable to confirm my email address. I'm not
getting any mail.

Now, I know that my mail host is VERY strict about the mail protocol, and
rejects anything that diverges even slightly from the spec. Can you please
check your logs to see whether a mail to brightbyte.de got bounced?

-- daniel

PS: doing this via the list because others may have similar problems



Re: [Wiki-research-l] WikiScience announce

2008-10-01 Thread Daniel Kinzler
Michael Reschke schrieb:
> a virtual space promoting interaction and interdisciplinary research
> among wiki researchers around the world
> 
> http://en.wikiversity.org
> 
> so what?

Uh... from the front page: "Wikiversity is a Wikimedia Foundation project
devoted to learning resources and learning projects for all levels, types, and
styles of education from pre-school to university, including professional
training and informal learning."

That is something completely different. The focus is on learning/teaching, not on
research. And it's for learning about anything, not focused on researching
wikis. Cooperation with Wikiversity is of course a good idea, but it's by no
means redundant.

-- daniel




Re: [Wiki-research-l] WikiScience announce

2008-10-01 Thread Daniel Kinzler
Felipe Ortega schrieb:
> Hello.
> 
> We would like to announce the release of WikiScience, an initiative to create
> a virtual space promoting interaction and interdisciplinary research among
> wiki researchers around the world.

Great, I hope it takes off! Maybe it would help to say what should go on the
wiki. Is it a place to write about your research project, or any research
project people know? About tools? Papers? People? Wiki engines? Defining the
scope is really important - and inviting people to add this stuff, now :)

> You can access the wiki at http://wikiscience.libresoft.es/wiki, and help us
> with your ideas and contributions. If you also want to collaborate in
> developing services for the new demo, please let us know too.

One thing I would suggest is a "WikiResearchPlanet" aggregated RSS feed. It
could be based on a wiki page where people can add (and remove) feeds to 
aggregate.
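A "planet"-style aggregator of this kind is simple to sketch. The following is a
minimal illustration in Python using only the standard library; the feed titles,
URLs and the `aggregate` function are invented for the example, not an existing
tool, and a real planet would fetch the feed documents from the URLs listed on
the wiki page:

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

def aggregate(feed_xml_documents):
    """Merge items from several RSS 2.0 documents into one list, newest first."""
    items = []
    for xml_doc in feed_xml_documents:
        channel = ET.fromstring(xml_doc).find("channel")
        source = channel.findtext("title", default="(unknown feed)")
        for item in channel.findall("item"):
            items.append({
                "source": source,
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "date": parsedate_to_datetime(item.findtext("pubDate")),
            })
    # One combined timeline across all research blogs, newest entry first.
    return sorted(items, key=lambda i: i["date"], reverse=True)

# Two tiny example feeds, standing in for documents fetched from URLs
# collected on the coordination wiki page (hypothetical content).
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Old post</title><link>http://a.example/1</link>
<pubDate>Mon, 01 Sep 2008 10:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>New post</title><link>http://b.example/1</link>
<pubDate>Tue, 02 Sep 2008 10:00:00 +0000</pubDate></item>
</channel></rss>"""

merged = aggregate([FEED_A, FEED_B])
print([i["title"] for i in merged])  # newest first
```

The wiki page itself would then only need to hold the list of feed URLs, which
anyone could edit.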

-- daniel



[Wiki-research-l] Put your name and project on meta:Research

2008-09-16 Thread Daniel Kinzler
Hi all

apparently as a side effect of the "about wikipedia projects" thread, some
people (including myself) have started to put their names and projects on
.

I encourage everyone to do the same. It's a great way to get an overview and to
find people to talk to.

-- Daniel



Re: [Wiki-research-l] About Wikipedia projects

2008-09-16 Thread Daniel Kinzler
Hello Anne, hello Hanteng

It was fun to stay with you for WikiSym - I hope we'll meet again soon. We'll
probably just bump into each other at some conference :)

Anyway - Anne, you asked about "litterature about Wikipedia projects (and
portals) and their status in regards with the Wikipedia construction" -- I'm not
sure what you mean. To get an overview of the projects run by Wikimedia, see
. A more or less complete list
is at . Or
maybe you are looking for something entirely different? "Wikipedia" is a
specific project (it's even a trademark). Not sure if a good overview of "wiki
portal" in general exists... there is
 but that's mostly about
places where you can start your own wiki.
 could also be a good
starting point. I'm not aware of any scientific research comparing wiki
communities -- would be interesting :) The WikiTracer thing may go into that
direction
,
though it seems to me mostly about metrics for community dynamics.

Hanteng, thanks for the link to meta, I didn't even know that page :) which
shows that it's probably not very comprehensive -- or I'm not well informed.
 seems to be a good
starting point for looking for bibliographies, I have added a few links to
"folksonomic" collections. I have tried to use Zotero before, but didn't quite
get into it. It lacks the social, wiki-ish aspect. Bibsonomy is a bit better for
that, though not great. I have ranted about this a while ago here:
. Even sparked some 
discussion.

In the long run, all this should be tied into a "wiki research planet" type
thing, independent of Wikimedia. I look forward to exchanging more ideas with
more people :)

Han-Teng Liao (OII) schrieb:
> Hello,
>The 'Research' page on the Meta is the first place to go.
>  http://meta.wikimedia.org/wiki/Research
> 
>I am looking forward to using Zotero for future collaboration.  Even 
> now we can start using Zotero and share the references easily by 
> exporting the individual's local reference database onto the any wiki 
> pages via Wikipedia citation format. 
>  http://www.zotero.org/
> It has been speculated that Zotero will be mature and ready for real 
> collaborative citation.  Hence I feel it might be a great way to start 
> your own citation database while contributing to the research community 
> via exporting them periodically.
>  
> After I finished my proposal exam next Monday, I will begin to write 
> how-to pages.  For the moment, may I suggest we use the Meta page as 
> portal? 
> 
> Best regards,
> hanteng
> 
> Anne GOLDENBERG wrote:
>> Hi all,
>> Hi Andrea,
>>
>> I'm looking for litterature about Wikipedia projects (and portals) and
>> their status in regards with the Wikipedia construction.
>>
>> I know that Andrea Forte mentionned that there's been a focus on
>> Wikipedia projects, I think it was during her presentation at Wikisym.
>> But any other references about this is also welcome :).
>>
>> Anne Goldenberg
>>
> 
> 




Re: [Wiki-research-l] Inviting researchers to wiki-research-l

2008-06-04 Thread Daniel Kinzler
As a folow-up to my suggestion to invite researchers to this mailing list, here
are some I was thinking of, along with some titles of the papers they wrote, to
give an impression:

* Andrew Gregorowicz and Mark A. Kramer (MITRE): "Mining a Large-Scale
Term-Concept Network from Wikipedia", 2006
* Kotaro Nakayama, Takahiro Hara, and Shojiro Nishio (University of Osaka):
"Wikipedia Mining for an Association Web Thesaurus Construction", 2007
* Rada Mihalcea (University of North Texas): "Using Wikipedia for Automatic Word
Sense Disambiguation", 2007
* E. Gabrilovich and S. Markovitch (Israel Institute of Technology): "Computing
Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis", 2007
* Rainer Hammwöhner (Uni Regensburg): "Qualitätsaspekte der Wikipedia", 2007
* Sören Auer and Jens Lehmann (Uni Leipzig): "DBpedia: A Nucleus for a Web of
Open Data", 2007
* Michael Strube and Simone Ponzetto (EML Research): "Creating a Knowledge Base
from a Collaboratively Generated Encyclopedia", 2007
* Torsten Zesch and Iryna Gurevych (TU Darmstadt): "Analysis of the Wikipedia
Category Graph for NLP Applications"

I have omitted several of whom I believe that they are already on this list,
including Jakob Voss, Markus Krötzsch, Denny Vrandecic, Gerard Meijssen and
Barend Mons. I believe the people mentioned above could have an interest in this
mailing list, and it would sure be great to hear the latest about their research
from them.

Can you think of any other people you would like to have on this list? Should we
just ask them?

-- Daniel



[Wiki-research-l] Inviting researchers to wiki-research-l

2008-06-03 Thread Daniel Kinzler
Hello all

From my thesis research, I have a list of about 15 people who have done some
interesting research with and about Wikipedia in the last couple of years. I
would suggest "officially" inviting them to this mailing list. Do you think
this is a good idea? And who should do this "officially"?

This also raises once more the question: how and where can we build a common
workspace for wiki research? A mailing list is a good thing to have, but far
from ideal. A wiki somewhere... maybe? Or should we integrate with Wikiversity?

Perhaps we could set up a Bibsonomy group for wiki research too -- there are
already a lot of papers there: . A
system with support for tagging, BibTeX, etc. seems better to me than the flat
list at WP:ACST.

So, what do you think?
-- Daniel



[Wiki-research-l] Outline of a method for building a multilingual thesaurus from Wikipedia -- now in english

2008-06-02 Thread Daniel Kinzler
Hello all

Due to popular demand, I have translated into English some key chapters of my
thesis about creating a multilingual thesaurus from Wikipedia. See here:

  

I hope this text gives a good impression of what I'm doing. I will soon add
translations of the appendices that describe the usage of the WikiWord program,
as well as the individual source bundles and the data files used for evaluation.

Regards,
Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
> Well, I'm almost done splitting the TXT file into chapters using an
> EMACS macro. I'll post that.

Cool! And I got started with translating. I guess I can have it ready
tomorrow... or the day after that.

I'm off to bed now.

Thanks to everyone for all the comments
-- Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
> Hum... If I go to the above translation link, only the first bit is
> actually translated. I guess Google gives up after a while and leaves
> the rest in German.
> 
> If you can split it into separate HTML pages, it would make it easier
> for people to read it with Google translate.

I guess before spending a day messing with bad converters, I should rather spend
that day translating the important bits myself :)

-- Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
>> PS: this must be the fastest mail exchange i have had in a while. Do
>> you use IRC? I'm on freenode.net as Duesentrieb frequently (and right
>> now).
> 
> I use Skype myself. I am alain_desilets there.
> 

Hm, I dislike Skype... it doesn't play too well with (my) linux, and they have
shifty policies. Anyway, mail will do for now :)

-- Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingualthesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
>>> Is there documentation about how to use this code?
>> Some -- in german, in the thesis. Frankly, finishing 30k lines of code
>> and 220 pages of thesis in 7 monthes proved to be a bit tight :)
> 
> Which pages of the thesis? 

Well, it depends on what you mean by "use the code". The command line interface
is described on pages 142ff (numbers as shown on the pages; it's actually
152/220 i think). Some of the core classes and methods are described all over
part III of the thesis.

>>> Are you planning to do more work on this, or are you moving on to
>> other things?
>>
>> If I can, I would try to continue working on this. Currently I'm
>> planning to finish university by the end of the year, and I don't know
>> yet how i'll be earning my living then. Preferrably, by working on
> this
>> -- or something similarly wiki-related, my head is full of ideas :)
> 
> Might be hard to earn a living working on this but who knows.

It's worth a try :)

-- Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
> BTW Daniel,
> 
> Would you be interested in submitting a position paper for the BabelWiki 08
> workshop which I am co-organising:
> 
> Babelwiki.notlong.com
> 
> It will be at the same time as WikiSym 2008, in Porto, Portugal, Sept 8-10.
> We also had full papers, but the deadline for those is now passed.

That sounds very interesting. I saw the WikiSym CFP and was sorely tempted, but
had to let it pass in order to get my thesis done in time.

> I hope you can make it. Your research is smack down in the middle of the
> workshop theme.

I'll try, but it's a question of money, family and exams. I'm a bit in a tight
spot right now. I'd sure like to come :)

-- Daniel

PS: this must be the fastest mail exchange I have had in a while. Do you use
IRC? I'm on freenode.net as Duesentrieb frequently (and right now).



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
> I was able to convert the PDF file to .txt. Not very readable, but should be 
> good enough to allow me to gist the content through Google translate.
> 
> But in order to do that, it would be useful if I posted the .txt file 
> somewhere on the web.
> 
> Do you mind if I do that? 
> 
> Alain

Go right ahead, it's GFDL :) Post the link for the benefit of others too.
Hm... I guess we should be careful about one point: please make sure this is
clearly credited to me. I only handed it in yesterday. Someone is going to check
if I stole the text from somewhere. So when it does show up, it had better have
my name on it :)

Anyway, I'll try to provide a readable HTML version soon, and an English
translation of some selected chapters.

-- Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingualthesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
> One of my goals in the next year or two is to participate in the creation of
> large, open, wiki-like terminology databases for translators. We call this
> concept a WikiTerm.

That sounds quite interesting.

...

> So, let's talk! Do you have contacts at OmegaWiki? If not, I can put you in
> touch with them.

Yes, I have talked with Gerard Meijssen about this several times, and he seemed
quite excited :) I have also talked to Barend Mons of Knewco about this; we seem
to have pretty similar ideas. So, yes, let's talk :) Ideally, you, me, and them.
Too bad I'll probably not make it to WikiSym or Wikimania this year, that would
be an ideal opportunity.

> Is it unit tested?
> 
> If so, then I forgive you  ;-) .

Yes. Not every single method, but all the important bits are unit tested.

> Is there documentation about how to use this code?

Some -- in German, in the thesis. Frankly, finishing 30k lines of code and 220
pages of thesis in 7 months proved to be a bit tight :)

> Are you planning to do more work on this, or are you moving on to other 
> things?

If I can, I would try to continue working on this. Currently I'm planning to
finish university by the end of the year, and I don't know yet how I'll be
earning my living then. Preferably by working on this -- or something
similarly wiki-related; my head is full of ideas :)

Regards,
Daniel






Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Luca de Alfaro wrote:
> 
> This looks very interesting!
> Is this a thesaurus that can be used for translation of words across
> languages?

Yes, in the sense that it (potentially) contains labels in different languages
for the same concept.

> Is there some way to quickly have a demo or view the data?

Sadly, no. I started to implement a web based query interface, but there was no
time to finish it while working on the thesis. Maybe I'll get it up one day.

On the other hand: if you find a decent viewer/explorer for SKOS (I didn't), you
should be able to explore the contents without problems. It's a standard RDF
vocabulary.

> I browsed some files, and I see entries of the kind:
> 
> :xf5bfa ww:displayLabel "de:Feliner_Diabetes_mellitus" .
> :xf5bfa ww:type wwct:OTHER .
> :xf5bfa rdf:type skos:Concept .
> :xf5bfa skos:inScheme
>  
> which tells me that Diabetes Mellitus of a feline is a concept... I was
> interested in the animal thesaurus as a way to translate animal names
> across languages... there are a lot of files, and I don't know if I am
> looking at the right ones.  Perhaps if you pointed us to the most
> interesting / understandable datasets, it would be very useful.

the animals:thesaurus dataset, as found in
, *should* contain
what you are looking for, namely different names for the same animal in
different languages. However, due to the way the sample was taken, the overlap
of pages analyzed from the different wikis is not as good as it should be, and
the English Wikipedia is missing entirely from this dataset. This is due to the
fact that the categories dealing with domesticated animals appear to be
structured in very different ways in the different Wikipedias. This is why it's
a bit hard to find a working example of a trans-language concept in that
dataset. One example would be x4c4b45, the entry for domestic cattle, providing
information for German and French (English is, as I said, missing from that
dataset).


A better example for seeing this WORK is probably colors:thesaurus as found in
 (or, if you want
plain SKOS, ).
Here's an excerpt for the color green (xa7d8c5):

:xa7d8c5 ww:displayLabel
"de:Grün|en:Green|fr:Vert|nl:Groen_(kleur)|no:Grønn|simple:Green" .
:xa7d8c5 ww:type wwct:OTHER .
:xa7d8c5 rdf:type skos:Concept .
...
:xa7d8c5 skos:definition "Grün ist jener Farbreiz der wahrgenommenen wird, wenn
Licht mit einer spektralen Verteilung ins Auge fällt bei dem das Maximum im
Wellenlängenintervall zwischen 520 und 565 nm liegt."@de .
:xa7d8c5 skos:altLabel "Blassgrün"@de .
:xa7d8c5 skos:altLabel "Dunkelgrün"@de .
:xa7d8c5 skos:altLabel "Grün"@de .
:xa7d8c5 skos:altLabel "Grüne"@de .
:xa7d8c5 skos:altLabel "Grünliche"@de .
...
:xa7d8c5 skos:definition "Green is a color, the perception of which is evoked by
light having a spectrum dominated by energy with a wavelength of roughly
520–570 nm."@en .
:xa7d8c5 skos:altLabel "Avacado"@en .
:xa7d8c5 skos:altLabel "Avocado"@en .
:xa7d8c5 skos:altLabel "Dark green"@en .
:xa7d8c5 skos:altLabel "Dark pastel green"@en .
:xa7d8c5 skos:altLabel "Dark spring green"@en .
:xa7d8c5 skos:altLabel "GREEN"@en .
:xa7d8c5 skos:altLabel "Green"@en .
:xa7d8c5 skos:altLabel "Green (HTML/CSS green)"@en .
:xa7d8c5 skos:altLabel "Greenness"@en .
...
:xa7d8c5 skos:definition "Le vert est une couleur complémentaire correspondant à
la lumière qui a une longueur d'onde comprise entre 490 et 570 nm."@fr .
:xa7d8c5 skos:altLabel "Couleur vert"@fr .
:xa7d8c5 skos:altLabel "Vert"@fr .
:xa7d8c5 skos:altLabel "Verte"@fr .
:xa7d8c5 skos:altLabel "Viridis"@fr .
:xa7d8c5 skos:altLabel "green"@fr .
:xa7d8c5 skos:altLabel "vert"@fr .
:xa7d8c5 skos:altLabel "verte"@fr .
...
:xa7d8c5 skos:definition "Groen is een secundaire kleur bij de subtractieve
kleurmenging."@nl .
:xa7d8c5 skos:altLabel "Groen"@nl .
:xa7d8c5 skos:altLabel "groen"@nl .
:xa7d8c5 skos:altLabel "groenachtige"@nl .
:xa7d8c5 skos:altLabel "groenblauw"@nl .
:xa7d8c5 skos:definition "Grønn er en farge som inngår i fargespekteret."@no .
:xa7d8c5 skos:altLabel "Grønn"@no .
:xa7d8c5 skos:altLabel "green"@no .
:xa7d8c5 skos:altLabel "grønn"@no .
:xa7d8c5 skos:altLabel "grønne"@no .
:xa7d8c5 skos:altLabel "grønt"@no .
:xa7d8c5 skos:definition "Green is one of the colors of the rainbow."@simple .
:xa7d8c5 skos:altLabel "Green"@simple .
:xa7d8c5 skos:altLabel "green"@simple .
:xa7d8c5 skos:altLabel "greenish"@simple .

I hope this gives an impression of the labels and glosses available in different
languages. In addition to this, there are the relations "broader"/"narrower",
"similar" and "related" for navigating the structure, as well as cross-links to
the respective Wikipedia pages, etc.

x89548b (First-order logic) from the dataset logic:thesaurus may also be a good
example.
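To give an idea of how an excerpt like the one above can be consumed
programmatically, here is a tiny Python sketch that groups skos:altLabel values
per concept and language tag. It is purely illustrative: it matches the
line-oriented Turtle-style serialization shown above with a regular expression,
whereas a real consumer should of course use a proper RDF/SKOS toolkit on the
full dumps.

```python
import re
from collections import defaultdict

# Matches lines of the form:  :xa7d8c5 skos:altLabel "Green"@en .
TRIPLE = re.compile(r'^:(\w+)\s+skos:altLabel\s+"(.*)"@(\w+)\s*\.\s*$')

def labels_by_language(lines):
    """Group skos:altLabel values per concept and per language tag."""
    concepts = defaultdict(lambda: defaultdict(set))
    for line in lines:
        m = TRIPLE.match(line)
        if m:
            concept, label, lang = m.groups()
            concepts[concept][lang].add(label)
    return concepts

# A few lines from the excerpt above, as sample input.
sample = [
    ':xa7d8c5 skos:altLabel "Green"@en .',
    ':xa7d8c5 skos:altLabel "Dark green"@en .',
    ':xa7d8c5 skos:altLabel "Vert"@fr .',
]
table = labels_by_language(sample)
print(sorted(table["xa7d8c5"]["en"]))  # ['Dark green', 'Green']
```

With all labels of one concept grouped by language this way, the thesaurus can
be read directly as a crude translation table.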


Regards,
Daniel

Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Han-Teng Liao (OII) wrote:
>  Dear Mr. Kinzler,
>   Could you give me an indication if your code is ready for other
> languages as well? I am asking particularly about the Unicode processing
> because I am really interested in trying it out in East Asian context
> (e.g. Chinese, Japanese, and Korean)

The code should be fully Unicode-capable, at least as far as the encoding is
concerned. The methods and algorithms I used are designed to be mostly
language-independent, but some of them will probably have to be adapted for CJK
languages. Especially the code for word- and sentence-splitting as well as for
measuring lexicographic similarity/distance would have to be looked at closely.
However, providing a suitable implementation for different languages or scripts
should be possible without problems, due to the modular design I used for the
text processing classes.
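The modular design described here could look roughly like the following Python
sketch. The class names and the per-language registry are illustrative
inventions, not WikiWord's actual API; the point is only that word-splitting is
a pluggable strategy, so a CJK-aware segmenter can be dropped in per language:

```python
import re

class WhitespaceSplitter:
    """Default splitter for languages that delimit words with spaces."""
    def words(self, text):
        return re.findall(r"\w+", text)

class NaiveCJKSplitter:
    """Placeholder for scripts without word delimiters: one token per
    character. A real implementation would plug in a dictionary- or
    statistics-based segmenter here instead."""
    def words(self, text):
        return [ch for ch in text if not ch.isspace()]

# Hypothetical per-language registry of splitter implementations.
SPLITTERS = {
    "en": WhitespaceSplitter(),
    "de": WhitespaceSplitter(),
    "zh": NaiveCJKSplitter(),
    "ja": NaiveCJKSplitter(),
}

def tokenize(lang, text):
    # Fall back to whitespace splitting for unknown languages.
    return SPLITTERS.get(lang, WhitespaceSplitter()).words(text)

print(tokenize("en", "green color"))  # ['green', 'color']
print(tokenize("zh", "绿色"))          # ['绿', '色']
```

Adapting the system to a new script then means implementing one splitter class
and registering it, without touching the rest of the pipeline.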

Applying my code to CJK languages would be a great challenge to my design, and I
would be very interested to see how it works out. I did not test it, simply
because I know next to nothing about those languages. I would be happy to assist
you in trying to adapt it to CJK languages and scripts.


Regards,
Daniel

PS: I have to apologize in advance to anyone trying to understand the code. I
tried to keep the design clean, but the code is not always pretty, and worst of
all, there are close to no comments. The thesis explains the most important
bits, but if you don't read German, that does you little good, I'm afraid. I
hope I will be able to improve on this over time.





Re: [Wiki-research-l] thesis: automatically building a multilingualthesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Desilets, Alain wrote:
> Really interesting! Can you post an HTML or text only version so I could read
> it using Google Translate?

I don't know a good way offhand to get that from TeX, but I'll see what I can
do.

> At WikiMania 07, I presented a paper that looked at how useful wiki resources
> like wikipedia, wiktionary and OmegaWiki might be for the needs of
> translators.
> 
> http://wikimania2007.wikimedia.org/wiki/Proceedings:AD1

Oh, interesting. Too bad I didn't find your publications before; I could have
nicely cited them :)

> One of the things we found was that in isolation, each of those resources at
> best covered ~30% of the translation difficulties typically encountered by
> professional translators for the English-French pair. But combined, they were
> able to cover ~50%. We also found that the presentation of information on
> Wikipedia and Wiktionary was not suited for the needs of translators.

I have not looked at Wiktionary. Thinking about it, I'm afraid I failed to
mention my reasons for that in the thesis (losing some points there, I guess).
Some of the methods I used are surely applicable there too, though the pages are
a bit more difficult to parse, and we quickly get into "structured record"
territory, which I generally avoided (Auer and Lehmann did some interesting
research there with their DBpedia project).

> Based on those two findings, we proposed the idea of a robot capable of
> pulling cross-lingual information from those resources and presenting in a
> way that is better suited for the needs of translators. Sounds like you may
> have just done this!

Well... I hope I did :) The aim was to automatically generate a multilingual
thesaurus, which is surely a good tool for translators. However, the quality of
the results is not what a translator would expect from a traditional lexicon or
thesaurus. The data is probably most useful directly when used in the context of
information retrieval and language processing, that is, as a basis for computers
to link text to conceptual world knowledge ("common sense"). I hope however that
the data I generate could be useful to translators anyway, as an addition or
extension to traditional, manually maintained dictionaries and thesauri. One
point that would very much interest me is to try to use my work as a basis for
building a bridge between Wikipedia and OmegaWiki.

> Is there a web interface to this multilingual resource that I could try?

Sadly, no. I was planning one and started to implement it, but there was no time
to get it up and running. Maybe there will be such an interface in the future.
But I really see the place of my code more as a backend -- a good interface to
that data may in fact be OmegaWiki, if we find a way to integrate it nicely. But
I do feel the urge to provide a simple query interface to my data for testing,
so maybe it'll still happen :)

Regards,
Daniel



Re: [Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
Magnus Manske wrote:
> Congratulations! Looks nice (quickly scanning over the pages).
> 
thanks!

Oh, I should have mentioned: to get a good impression without reading 200 pages,
read pages 26-31. They contain a good overview.

cheers
Daniel



[Wiki-research-l] thesis: automatically building a multilingual thesaurus from wikipedia

2008-05-30 Thread Daniel Kinzler
My diploma thesis about a system to automatically build a multilingual thesaurus
from Wikipedia, "WikiWord", is finally done. I handed it in yesterday. My
research will hopefully help to make Wikipedia more accessible for automatic
processing, especially for applications in natural language processing, machine
translation and information retrieval. What this could mean for Wikipedia is:
better search and conceptual navigation, tools for suggesting categories, and
more.

Here's the thesis (in German, I'm afraid): 
<http://brightbyte.de/DA/WikiWord.pdf>

  Daniel Kinzler, "Automatischer Aufbau eines multilingualen Thesaurus durch
  Extraktion semantischer und lexikalischer Relationen aus der Wikipedia",
  Diplomarbeit an der Abteilung für Automatische Sprachverarbeitung, Institut
  für Informatik, Universität Leipzig, 2008.

For the curious, http://brightbyte.de/DA/ also contains source code and data.
See <http://brightbyte.de/page/WikiWord> for more information.

Some more data is for now available at
<http://aspra27.informatik.uni-leipzig.de/~dkinzler/rdfdumps/>. This includes
full SKOS dumps for en, de, fr, nl, and no covering about six million concepts.

The thesis ended up being rather large... 220 pages of thesis and 30k lines of
code. I'm planning to write a research paper in English soon, which will give an
overview of WikiWord and what it can be used for.

The thesis is licensed under the GFDL, WikiWord is GPL software. All data taken
or derived from wikipedia is GFDL.


Enjoy,
Daniel



Re: [Wiki-research-l] Research-Wiki

2008-02-07 Thread Daniel Kinzler
Andrew Krizhanovsky wrote:
> I think the aclwiki has potential to cover papers and software related 
> to Wiki and science.

While that wiki looks quite relevant for the type of wiki(pedia) research *I* am
doing (thanks for the link), I don't think it would be the right place for
general "Wikipedia studies". Wikipedia studies can be about the social and
psychological aspects of working on a wiki, or about technical issues, like
syntax-independent storage of wikitext or decentralizing wiki infrastructure,
among other things. So, a wiki focused on linguistics wouldn't be the right
place, IMHO.

-- Daniel



Re: [Wiki-research-l] Research-Wiki

2008-02-07 Thread Daniel Kinzler
Hello All

As I'm currently working on my major thesis about extracting a multilingual
thesaurus from wikipedia data, I have collected quite a bit of research
resources about wikipedia. Here are a few links:

* My Wikipedia Research link collection:
http://del.icio.us/brightbyte/wikipedia%2Bresearch
* Wikipedia tag on CiteULike: http://www.citeulike.org/tag/wikipedia
* Wikipedia group on CiteULike: http://www.citeulike.org/group/382/library
* My own wikipedia stuff there:
http://www.citeulike.org/user/brightbyte/tag/wikipedia
* Overview page for my thesis work: http://brightbyte.de/page/WikiWord

I hope this will be useful to someone. I mainly focused on Wikipedia as a
resource for linguistic and semantic analysis.

As to having a central place to coordinate and discuss research: yes, that would
be great. Though I'm also not sure of the best form. A good bibliography system
would sure help, and wiki-style flexible creation of topic pages, and some sort
of discussion system, and perhaps a "planet" style aggregated news feed?
Ideally, all this could be provided by a single system - I have discussed my
dreams about a Bibliography Thing / research platform a few weeks ago here:
http://brightbyte.de/page/The_Bibliography_Thing


Regards,
Daniel Kinzler,
aka Duesentrieb,
aka BrightByte
