Dirk Riehle wrote:
> Hi,
>
> looking through the API: http://en.wikipedia.org/w/api.php I can't find
> any way to get at the actual page contents. Is this correct?
>

As someone else said before me, you can get unparsed wikitext through
prop=revisions&rvprop=content. If you want HTML without the sidebar and
all that, you can get it with index.php?action=render&title=Foo

> Finally, and that's why I'm sending this email to this mailing list: How
> does Powerset do this: Go to powerset.com, search for something you
> might find in Wikipedia, and see how it provides an up-to-date
> (click-through) copy of the Wikipedia page. My hunch is that they
> use a database dump for search and then screen-scrape, or is there a
> better explanation?

You can actually search Wikipedia through the API:

http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=Foo&srwhat=text&srlimit=5
(gets a list of 5 pages containing 'Foo')

http://en.wikipedia.org/w/api.php?action=query&generator=search&gsrsearch=Foo&gsrwhat=text&gsrlimit=5&prop=revisions&rvprop=content
(gets the contents of those pages)

You can also get search suggestions in the OpenSearch format with

http://en.wikipedia.org/w/api.php?action=opensearch&search=Te

Roan Kattouw (Catrope)
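For anyone who wants to script the requests mentioned in this message, here is a minimal Python sketch that only builds the URLs. The helper names (build_url, page_content_url, and so on) are made up for illustration. The /w/index.php path, the titles= parameter and format=json are not in the message itself; they are assumptions on my part about how a full request would be completed. Treat this as a starting point under those assumptions, not as the one right way to do it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in the message.
API = "http://en.wikipedia.org/w/api.php"
# Assumption: index.php lives next to api.php under /w/.
INDEX = "http://en.wikipedia.org/w/index.php"


def build_url(base, params):
    """Append URL-encoded query parameters to a base URL."""
    return base + "?" + urlencode(params)


def page_content_url(title):
    """Unparsed wikitext of one page via prop=revisions&rvprop=content.

    The titles= parameter is an assumption; the message only names
    the prop/rvprop part.
    """
    return build_url(API, {"action": "query", "prop": "revisions",
                           "rvprop": "content", "titles": title})


def render_url(title):
    """HTML of a page without the sidebar (action=render)."""
    return build_url(INDEX, {"action": "render", "title": title})


def search_url(term, limit=5):
    """List of up to `limit` pages whose text contains `term`."""
    return build_url(API, {"action": "query", "list": "search",
                           "srsearch": term, "srwhat": "text",
                           "srlimit": limit})


def search_content_url(term, limit=5):
    """Contents of the pages found by the same search, in one request."""
    return build_url(API, {"action": "query", "generator": "search",
                           "gsrsearch": term, "gsrwhat": "text",
                           "gsrlimit": limit, "prop": "revisions",
                           "rvprop": "content"})


def opensearch_url(prefix):
    """Search suggestions in the OpenSearch format."""
    return build_url(API, {"action": "opensearch", "search": prefix})


def fetch_json(url):
    """Fetch an api.php URL and decode the reply as JSON.

    Assumption: format=json is appended here; the URLs in the message
    do not specify an output format. The server may also expect a
    descriptive User-Agent header, which this sketch does not set.
    """
    with urlopen(url + "&format=json") as resp:
        return json.load(resp)
```

Calling search_url("Foo") gives back the first search URL from the message, and fetch_json(search_url("Foo")) would issue the request. Note that render_url returns HTML, so it should not be passed to fetch_json.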
