Thank you. This will give me the bios; however, I still want the associated Wikipedia links. Previously, someone gave me a query that returned the English Wikipedia link along with another property. You can see it below:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

SELECT ?item ?twitter ?article
WHERE
{
  ?item wdt:P2002 ?twitter
  OPTIONAL {?item rdfs:label ?item_label filter (lang(?item_label) = "en") .}
  ?article schema:about ?item .
  ?article schema:inLanguage "en" .
  FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/")
}
ORDER BY ASC (?article)

I tried to take the PREFIX header and this portion and append them to some of your queries:

  ?article schema:about ?item .
  ?article schema:inLanguage "en" .
  FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/")

The first one, which seems to be for only 1 record, I ran just as a test, but it gave me an ERROR:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

SELECT *
WHERE
{
  <http://www.wikidata.org/entity/Q1652291> <http://schema.org/description> ?o .
  filter(lang(?o)='en').
  ?article schema:about ?item .
  ?article schema:inLanguage "en" .
  FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/")
}

So I assume the other queries like this one would not work either (it would time out on query.wikidata.org, so I can't test it):

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

SELECT *
WHERE
{
  ?s <http://schema.org/description> ?o .
  filter(lang(?o)='en').
  ?article schema:about ?item .
  ?article schema:inLanguage "en" .
  FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/")
}

So am I doing something wrong in the syntax of these combined queries?

Thanks again in advance, and for the help thus far!

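One observation on the combined queries, offered as a sketch rather than a confirmed diagnosis: the appended patterns refer to ?item, while the description pattern uses a fixed entity or ?s, so the two halves share no variable and every matching description would be paired with every English Wikipedia article. Assuming that is what makes the queries fail or time out, the Python snippet below builds a query in which the same subject appears in both patterns. The helper name build_query and its entity_id parameter are hypothetical, and the resulting query has not been run against query.wikidata.org.

```python
# Sketch only: build_query and entity_id are hypothetical names, not part of
# any Wikidata tooling. The function just assembles a SPARQL string.

WIKIPEDIA_PREFIX = "https://en.wikipedia.org/"


def build_query(entity_id=None):
    """Return a SPARQL query for English descriptions plus English Wikipedia links.

    If entity_id (e.g. "Q1652291") is given, both patterns use that entity as
    the subject. Otherwise both patterns use the variable ?item, which joins
    them but may still time out on a public endpoint.
    """
    if entity_id:
        subject = f"<http://www.wikidata.org/entity/{entity_id}>"
        select = "?description ?article"
    else:
        subject = "?item"
        select = "?item ?description ?article"

    lines = [
        "PREFIX schema: <http://schema.org/>",
        f"SELECT {select} WHERE {{",
        # Description pattern (the "bio"), restricted to English.
        f"  {subject} schema:description ?description .",
        '  FILTER (lang(?description) = "en")',
        # Sitelink pattern: note it reuses the SAME subject as above.
        f"  ?article schema:about {subject} .",
        '  ?article schema:inLanguage "en" .',
        f"  FILTER (SUBSTR(str(?article), 1, {len(WIKIPEDIA_PREFIX)}) = "
        f'"{WIKIPEDIA_PREFIX}")',
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query("Q1652291"))
```

Calling build_query() with no argument produces the all-items variant; for the full dataset the dump-based route suggested in the quoted reply may still be the more practical option.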
On Mon, Feb 1, 2016 at 1:19 AM, Edgard Marx <m...@informatik.uni-leipzig.de> wrote:

> Yep,
>
> Please note that RDFSlice will take the subset.
> That is, the triples that contain the property that you are looking for.
> Here are three examples of SPARQL queries:
>
> ps: you can try them here https://query.wikidata.org.
>
> * For your example:
>
> SELECT *
> WHERE
> {
> <http://www.wikidata.org/entity/Q1652291> <http://schema.org/description> ?o .
> filter(lang(?o)='en').
> }
>
>
> * For all English bios:
>
> SELECT *
> WHERE
> {
> ?s <http://schema.org/description> ?o .
> filter(lang(?o)='en').
> }
>
> * For all language bios:
>
> SELECT *
> WHERE
> {
> <http://www.wikidata.org/entity/Q1652291> <http://schema.org/description> ?o .
> }
>
>
> best,
> Edgard
>
>
>
> On Mon, Feb 1, 2016 at 4:34 AM, Hampton Snowball <hamptonsnowb...@gmail.com> wrote:
>
>> Thanks. I see it requires constructing a query to extract only the data
>> you want. E.g. the graph pattern:
>>
>> <graphPatterns> - desired query, e.g. "SELECT * WHERE {?s ?p ?o}" or
>> graph pattern e.g. "{?s ?p ?o}"
>>
>> Since I don't know about constructing queries, would you be able to tell
>> me what the proper query would be to extract from all the pages the short
>> bio, the English Wikipedia link, and maybe other Wikipedias?
>>
>> For example from: https://www.wikidata.org/wiki/Q1652291
>>
>> "Turkish female given name"
>> https://en.wikipedia.org/wiki/H%C3%BClya
>> and optionally https://de.wikipedia.org/wiki/H%C3%BClya
>>
>> Thanks in advance!
>>
>>
>> On Sun, Jan 31, 2016 at 3:53 PM, Edgard Marx <m...@informatik.uni-leipzig.de> wrote:
>>
>>> Hey,
>>> you can simply use RDFSlice (https://bitbucket.org/emarx/rdfslice/overview)
>>> directly on the dump file
>>> (https://dumps.wikimedia.org/wikidatawiki/entities/20160125/)
>>>
>>> best,
>>> Edgard
>>>
>>> On Sun, Jan 31, 2016 at 7:43 PM, Hampton Snowball <hamptonsnowb...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am interested in a subset of Wikidata and I am trying to find the
>>>> best way to get it without getting a larger dataset than necessary.
>>>>
>>>> Is there a way to get just the "bios" that appear on the Wikidata pages
>>>> below the name of the person/organization, as well as the link to the
>>>> English Wikipedia page, or to all Wikipedia pages?
>>>>
>>>> For example from: https://www.wikidata.org/wiki/Q1652291
>>>>
>>>> "Turkish female given name"
>>>> https://en.wikipedia.org/wiki/H%C3%BClya
>>>> and optionally https://de.wikipedia.org/wiki/H%C3%BClya
>>>>
>>>> I know there is SPARQL, and this list previously helped me construct a
>>>> query, but I know some requests seem to time out when looking at a large
>>>> amount of data, so I am not sure this would work.
>>>>
>>>> I know the dumps are the full dataset, but I am not sure if there are any
>>>> other subset dumps available or a better way of grabbing this data.
>>>>
>>>> Thanks in advance,
>>>> HS
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikidata mailing list
>>>> Wikidata@lists.wikimedia.org
>>>> https://lists.wikimedia.org/mailman/listinfo/wikidata