[Wikitech-l] Call for participation in OpenSym 2015, Aug 19-20, San Francisco!

2015-07-04 Thread Dirk Riehle

Call for participation in OpenSym 2015!

Aug 19-20, 2015, San Francisco, http://opensym.org



FOUR FANTASTIC KEYNOTES

Richard Gabriel (IBM) on Using Machines to Manage Public Sentiment on Social 
Media

Peter Norvig (GOOGLE) on Applying Machine Learning to Programs

Robert Glushko (UC BERKELEY) on Collaborative Authoring, Evolution, and 
Personalization


Anthony Wasserman (CMU SV) on Barriers and Pathways to Successful Collaboration

More at 
http://www.opensym.org/category/conference-contributions/keynotes-invited-talks/




GREAT RESEARCH PROGRAM

All core open collaboration tracks, including

- free/libre/open source
- open data
- Wikipedia
- wikis and open collaboration, and
- open innovation

More at 
http://www.opensym.org/2015/06/25/preliminary-opensym-2015-program-announced/




INCLUDING OPEN SPACE

The facilities provide room and space for your own working groups.



AT A WONDERFUL LOCATION

OpenSym 2015 takes place Aug 19-20 at the Golden Gate Club in San 
Francisco, smack in the middle of the Presidio, with a wonderful view of the 
Golden Gate Bridge.


More at http://www.opensym.org/os2015/location/



REGISTRATION

Is simple, subsidized, and all-encompassing.

Find it here: http://www.opensym.org/os2015/registration/

Prices will go up after July 12th, so be sure to register early!



We would like to thank our sponsors: the Wikimedia Foundation, Google, TJEF, 
and the ACM.







[Wikitech-l] WikiSym + OpenSym 2013: Less than 2 weeks for Community Track Submissions

2013-05-07 Thread Dirk Riehle
Demo submissions should include the purpose of the demo, a specific 
description of what you plan to demo, what you hope to get out of demoing, 
and how the audience will benefit. A short note of any special technical 
requirements should be included.


Demo submissions will be reviewed based on their relevance to the community. 
All accepted demos will be given space at a joint demo session (90 minutes) 
during the conference.


Tutorials

Tutorials are half-day classes, taught by experts, designed to help 
professionals rapidly come up to speed on a specific technology or 
methodology. Tutorials can be lecture-oriented or participatory. Tutorial 
attendees deserve the highest standard of excellence in tutorial preparation 
and delivery. Tutorial presenters are typically experts in their chosen topic 
and experienced speakers skilled in preparing and delivering educational 
presentations. When selecting tutorials, we will consider the presenter’s 
knowledge of the proposed topic and past success at teaching it.




SUBMISSION INFORMATION AND INSTRUCTIONS

There are two submission deadlines, an early and a regular one. The early 
deadline is for those who need to know early that their community track 
submission has been accepted. This mostly applies to workshops that require a 
program committee and their own paper submission and review process (as 
opposed, for example, to walk-in workshops). Also, some may need the 
additional time to raise funds and acquire a visa.


Submissions should follow the standard ACM SIG proceedings format. For advice 
and templates, please see 
http://www.acm.org/sigs/publications/proceedings-templates. All papers must 
conform at the time of submission to the formatting instructions and must not 
exceed the page limits, including all text, references, appendices, and 
figures. All submissions must be in PDF format.


All papers and proposals should be submitted electronically through EasyChair 
using the following URL: 
https://www.easychair.org/conferences/?conf=opensym2013community




SUBMISSION AND NOTIFICATION DEADLINES

* Early submission deadline: March 17, 2013
* Notification for early submissions: March 31, 2013
* Regular submission deadline: May 17, 2013
* Notification for regular submissions: May 31, 2013
* Camera-ready for both rounds: June 9, 2013

As long as it is May 17 somewhere on earth, your submission will be accepted.



COMMUNITY TRACK PROGRAM COMMITTEE

Chairs

Regis Barondeau (Université du Québec à Montréal)
Dirk Riehle (Friedrich-Alexander University Erlangen-Nürnberg)

--
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550



Re: [Wikitech-l] programmatically extracting lists from list pages on Wikipedia

2011-11-22 Thread Dirk Riehle
Try the Sweble parser for extracting structured data from Wikitext: 
http://sweble.org
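
If you need something quick and dirty before wiring up a real parser, a 
naive line-oriented pass over the raw wikitext gets you surprisingly far on 
simple list pages. Here is a minimal Java sketch (illustration only: it 
ignores templates, nested lists, and tables, which is exactly where a real 
parser like Sweble earns its keep):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Naive sketch, not a real parser: pulls the targets of
    // [[Target|Label]] links out of top-level list items ("*" or "#")
    // in raw wikitext.
    public class ListItemLinks {

        private static final Pattern LINK =
                Pattern.compile("\\[\\[([^\\]|#]+)(?:\\|[^\\]]*)?\\]\\]");

        public static List<String> extract(String wikitext) {
            List<String> titles = new ArrayList<>();
            for (String line : wikitext.split("\n")) {
                // Wikitext list items start with "*" (bullets) or "#" (numbers).
                if (!line.startsWith("*") && !line.startsWith("#")) continue;
                Matcher m = LINK.matcher(line);
                while (m.find()) titles.add(m.group(1).trim());
            }
            return titles;
        }

        public static void main(String[] args) {
            String sample = "* [[George Washington]] (1789-1797)\n"
                          + "* [[John Adams|Adams, John]] (1797-1801)\n"
                          + "Some prose that is not a list item.";
            System.out.println(extract(sample));
            // -> [George Washington, John Adams]
        }
    }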

http://dirkriehle.com, +49 157 8153 4150, +1 650 450 8550
On Nov 22, 2011 9:35 PM, "Fred Zimmerman"  wrote:

> hi,
>
> I want to programmatically extract lists from list pages on Wikipedia. That
> is to say, if there is a page that mostly consists of a list (list of
> episodes, list of presidents, etc.) I want to be able to extract the list
> from the page, with article names/links. Has anyone already done this? Can
> anyone suggest a good strategy?
>
> FredZ


Re: [Wikitech-l] Announcing Wikihadoop: using Hadoop to analyze Wikipedia dump files

2011-09-14 Thread Dirk Riehle
Hello everyone!

Wikihadoop sounds like a great project!

I wanted to point out that you can make it even more powerful for many 
research applications by combining it with the Sweble Wikitext parser.

Doing so, you could enable Wikipedia dump processing not only at the raw XML 
dump level, but at the fine-grained level of individual elements (bold spans, 
headings, paragraphs, categories, pages, etc.).
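
To make that concrete: Hadoop Streaming hands input records to the mapper 
on stdin and expects tab-separated key/value pairs on stdout. The sketch 
below assumes, purely for illustration, that records arrive as plain lines 
of revision wikitext (check the WikiHadoop wiki for the actual record 
layout); a real element-level job would hand the text to Sweble and walk 
the resulting AST instead of the crude prefix test used here.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    // Hedged sketch of a Hadoop Streaming mapper: emits ("heading", 1)
    // for every wikitext heading line seen on stdin; a trivial reducer
    // would sum the counts. Assumption (not verified): each stdin line
    // is a line of revision wikitext.
    public class HeadingCountMapper {
        public static void main(String[] args) throws Exception {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(System.in, StandardCharsets.UTF_8));
            for (String line; (line = in.readLine()) != null; ) {
                // Wikitext section headings start and end with '=' signs.
                String t = line.trim();
                if (t.startsWith("==") && t.endsWith("==") && t.length() > 4) {
                    System.out.println("heading\t1");
                }
            }
        }
    }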

You can learn more about Sweble here: http://sweble.org

Cheers,
Dirk


On 08/17/2011 06:58 PM, Diederik van Liere wrote:
> Hello!
>
> Over the last few weeks, Yusuke Matsubara, Shawn Walker, Aaron Halfaker and
> Fabian Kaelin (who are all Summer of Research fellows)[0] have worked hard
> on a customized stream-based InputFormatReader that allows parsing of both
> bz2 compressed and uncompressed files of the full Wikipedia dump (dump file
> with the complete edit histories) using Hadoop. Prior to WikiHadoop and the
> accompanying InputFormatReader it was not possible to use Hadoop to analyze
> the full Wikipedia dump files (see the detailed tutorial / background for an
> explanation why that was not possible).
>
> This means:
> 1) We can now harness Hadoop's distributed computing capabilities in
> analyzing the full dump files.
> 2) You can send either one or two revisions to a single mapper so it's
> possible to diff two revisions and see what content has been added /
> removed.
> 3) You can exclude namespaces by supplying a regular expression.
> 4) We are using Hadoop's Streaming interface which means people can use this
> InputFormat Reader using different languages such as Java, Python, Ruby and
> PHP.
>
> The source code is available at: https://github.com/whym/wikihadoop
> A more detailed tutorial and installation guide is available at:
> https://github.com/whym/wikihadoop/wiki
>
>
> (Apologies for cross-posting to wikitech-l and wiki-research-l)
>
> [0] http://blog.wikimedia.org/2011/06/01/summerofresearchannouncement/
>
>
> Best,
>
> Diederik

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] WYSIWYG and parser plans (was What is wrong with Wikia's WYSIWYG?)

2011-05-03 Thread Dirk Riehle


On 05/03/2011 08:28 PM, Neil Harris wrote:
> On 03/05/11 19:44, MZMcBride wrote:
...
>> The point is that the wikitext and its parsing should be completely separate
>> from MediaWiki/PHP/HipHop/Zend.
>>
>> I think some of the bigger picture is getting lost here. Wikimedia produces
>> XML dumps that contain wikitext. For most people, this is the only way to
>> obtain and reuse large amounts of content from Wikimedia wikis (especially
>> as the HTML dumps haven't been re-created since 2008). There needs to be a
>> way for others to be able to very easily deal with this content.
>>
>> Many people have suggested (with good reason) that this means that wikitext
>> parsing needs to be reproducible in other programming languages. While
>> HipHop may be the best thing since sliced bread, I've yet to see anyone put
>> forward a compelling reason that the current state of affairs is acceptable.
>> Saying "well, it'll soon be much faster for MediaWiki to parse" doesn't
>> overcome the legitimate issues that re-users have (such as programming in a
>> language other than PHP, banish the thought).
>>
>> For me, the idea that all that's needed is a faster parser in PHP is a
>> complete non-starter.
>>
>> MZMcBride
>>
>
> I agree completely.
>
> I think it cannot be emphasized enough that what's valuable about
> Wikipedia and other similar wikis is the hard-won _content_, not the
> software used to write and display it at any given time, which is merely a
> means to that end.
>
> Fashions in programming languages and data formats come and go, but the
> person-centuries of writing effort already embodied in Mediawiki's
> wikitext format needs to have a much longer lifespan: having a
> well-defined syntax for its current wikitext format will allow the
> content itself to continue to be maintained for the long term, beyond
> the restrictions of its current software or encoding format.
>
> -- Neil

+1 to both MZMcBride and Neil.

So relieved to see things put so eloquently.

Dirk


-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] Announcing the Open Source Sweble Wikitext Parser v1.0

2011-05-01 Thread Dirk Riehle
>>> You should identify whether you mean "MediaWikitext", or some other
>>> dialect -- MediaWiki Is Not The Only Wiki...
>>>
>>> and you should post to wikitext-l as well. The real parser maniacs
>>> hang out over there, even though traffic is low.
>>
>> It is MediaWiki's Wikitext; elsewhere it is usually called wiki
>> markup.
>
> Improperly and incompletely, perhaps, yes.
>
> I'm a MW partisan, and think it's better than nearly all its competitors,
> for nearly all uses... but even I try not to be *that* partisan.

Hmm, never viewed it that way. IMO, MediaWiki's developers invented a wiki 
markup language and called it Wikitext; other engines just call theirs wiki 
markup or whatnot. For me, Wikitext was always the particular markup of 
MediaWiki, much like PHP or C++ are particular language names.

Is there any other engine that calls its markup Wikitext? I'd be surprised. 
Even for WikiCreole (wikicreole.org) we used "wiki markup".

Cheers,
Dirk

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] Announcing the Open Source Sweble Wikitext Parser v1.0

2011-05-01 Thread Dirk Riehle
> You should identify whether you mean "MediaWikitext", or some other
> dialect -- MediaWiki Is Not The Only Wiki...
>
> and you should post to wikitext-l as well.  The real parser maniacs hang
> out over there, even though traffic is low.

It is MediaWiki's Wikitext; elsewhere it is usually called wiki markup.

Cheers,
Dirk

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] WikiCreole (was Re: What would be a perfect wiki syntax? (Re: WYSIWYG))

2011-01-04 Thread Dirk Riehle

> As long as we're hung up on details of the markup syntax, it's going to be
> very very hard to make useful forward motion on things that are actually
> going to enhance the capabilities of the system and put creative power in
> the hands of the users.
>
> Forget about syntax -- what do we want to *accomplish*?

I think you got this sideways. The concrete syntax doesn't matter, but the 
abstract syntax does. Without a clear specification there can be no competing 
parsers, no interoperability, no decoupled APIs, no independently evolving 
components.

(Abstract syntax here means an "XML representation", a structured 
representation, or a DOM tree, i.e. an abstract syntax tree. But for that you 
need a specification of the language, i.e. of Wikitext; a parser 
implementation, which is all we have today, doesn't do that job.)

> worrying about memorizing ASCII code points, it's let us go beyond
> fixed-width ASCII text (a monitor emulating a teletype, which was really a
> friendlier version of punch cards) to have things like _graphics_. Text can
> be in different sizes, different styles, and different languages. We can see
> pictures; we can draw pictures; we can use colors and shapes to create a far
> richer, more creative experience for the user.
>
> GUIs didn't come about from a better, more universal way of encoding text --
> Unicode came years after GUI conventions were largely standardized in
> practice.

In order to have a visual editor or three, combined with a plain text editor, 
combined with some fancy other editor we have yet to invent, you will still 
need that specification that tells you what a valid wiki instance is. This is 
the core data; only if you have a clear spec of that can you have tool and UI 
innovation on top of that.

Cheers,
Dirk

-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




Re: [Wikitech-l] WikiCreole (was Re: What would be a perfect wiki syntax? (Re: WYSIWYG))

2011-01-04 Thread Dirk Riehle
>> (Note that I think any conversation about parser changes should consider
>> the GoodPractices page from http://www.wikicreole.org/wiki/GoodPractices.)
>>
>> If nothing else, perhaps there would be some use for the EBNF grammar
>> that was developed for WikiCreole.
>> http://dirkriehle.com/2008/01/09/an-ebnf-grammar-for-wiki-creole-10/
>
> WikiCreole used to not be parsable by a grammar, either. And it has
> inconsistencies like "italic is // unless it appears in a url".
> Good to see they improved.

WikiCreole only had a prose specification, hence it was ambiguous. Our syntax 
definition improved that so that in theory (and practice) you could now have 
multiple competing parser implementations. The issue with WikiCreole now is 
that it is simply too small---lots of stuff that it can't do but that any wiki 
engine will want.

The real reason to care about a precise specification (one that is not, as in 
the case of MediaWiki, simply the implementation) is the option to evolve 
faster. The relevant paper is 
http://dirkriehle.com/2008/07/19/a-grammar-for-standardized-wiki-markup/ - 
wouldn't it be nice if we could be innovating on top of a wiki platform?

Cheers,
Dirk


-- 
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550




[Wikitech-l] Alternative Mediawiki (Parser) Implementations (was: Re: Extension of Wikitext)

2008-11-18 Thread Dirk Riehle
One option might be to use one of the alternative MediaWiki (parser)
implementations.

I know of JAMwiki and Bliki.

JAMwiki is a mostly complete Java implementation. The parser can be
taken out and is reasonably well factored, based on a grammar for
JFlex, a lexer generator (if I remember this correctly).

Bliki is purely a MediaWiki parser implementation, not a full-blown
wiki engine, also done in Java.
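
If I remember Bliki's API correctly (treat the exact class and method 
names as an assumption and check them against the version you use), 
rendering a snippet of wikitext to HTML is nearly a one-liner:

    import info.bliki.wiki.model.WikiModel;

    // Hedged sketch: Bliki's WikiModel exposes a static toHtml() helper
    // for simple cases; for resolving links and images you would
    // construct a WikiModel with base URLs instead.
    public class BlikiDemo {
        public static void main(String[] args) {
            String wikitext = "This is '''bold''' and [[Pizza|a link]].";
            System.out.println(WikiModel.toHtml(wikitext));
        }
    }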

I'm generally interested in finding a well-factored, non-PHP MediaWiki
syntax parser, ideally written in Java, that I can use for my own
projects.

Are there new alternatives? Does anyone have opinions/insights into
the state of the tools mentioned above? It seems pretty tough to track
the MediaWiki syntax...

Cheers,
Dirk

On Mon, Nov 17, 2008 at 5:34 AM, Alex Bernier <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I hope it is the right place to ask my question...
>
> I work on a "collaborative correction of books" project. I know there are
> already some projects related to this subject, like Wikisource. The main
> difference between my project and Wikisource is that my books are stored
> as text using DAISY (see http://www.daisy.org/), a format based on XML. I
> have some questions:
>
> 1) Are there tools to import XML files into a wiki?
>
> 2) Are there tools to export a wiki page to XML?
>
> 3) I will have to extend Wikitext (I want to import DAISY XML files into
> my wiki and export them from the wiki to DAISY XML after correction,
> without losing information). I think it would be easy for the majority of
> the new tags I want to add, but it would be more difficult for some of them.
> For example, I need to improve the heading possibilities of Wikitext.
> For the moment, it is limited to 5 levels. I need potentially unlimited
> nesting, like this:
> <section>
>   Title 1
>   <section>
>     Title 2
>     ...
>     <section>
>       Title n
>     </section>
>   </section>
> </section>
>
> Is it possible to add this kind of thing to Wikitext? If so, is it
> possible to do this with an extension, or is it necessary to make
> "low-level" modifications to the Wikitext parser?
>
> Best regards,
>
> Alex Bernier
>



-- 
Phone: +1 650 215 3459
Weblog: http://www.riehle.org
Twitter: http://twitter.com/driehle
