Available approaches for keyword based querying RDF federations
Hi,

I would like to know the available approaches for keyword-based querying of RDF federations. I found the following approach:

FedSearch: Efficiently Combining Structured Queries and Full-Text Search in a SPARQL Federation, by Andriy Nikolov http://link.springer.com/search?facet-author=%22Andriy+Nikolov%22, Andreas Schwarte http://link.springer.com/search?facet-author=%22Andreas+Schwarte%22, Christian Hütter http://link.springer.com/search?facet-author=%22Christian+H%C3%BCtter%22

I would like to know whether there are any other approaches.

Regards,
Thilini Cooray
Re: Available approaches for keyword based querying RDF federations
I would tend to stick up for a non-federated approach, in the sense of gathering a lot of federated data into a centralized knowledge base and then querying that. This is akin to how Google or Bing does web search by crawling the web and forming a distributed index.

I can point to a number of reasons for this, but some major ones are:

* Many of the better IR algorithms depend on corpus-wide statistics, topic modeling, and other methods that need a global view (or at least a good sample of a global view).

* Even distributed search systems such as Solr (which, in contrast to federated search, are well controlled because the machines are in the same data center, there is a deliberate approach to dealing with failures, etc.) are not terribly scalable, for the following reason. If you run a query against N shards, the time it takes to complete the query is at least the maximum of the shards' response times. As N gets bigger, the probability that some glitch happens gets bigger and bigger. Specifically, when N > 10 it is pretty hard to maintain an acceptable response time for interactive use.

I'd say that, practically, centralized search engines like Google and Bing have won the internet search war. For various reasons, meta-search, deep web search and similar services haven't really caught on.

On Wed, Aug 13, 2014 at 8:12 AM, Thilini Cooray thilinicooray.u...@gmail.com wrote:

Hi, I would like to know available approaches for keyword based querying RDF federations. I found the following approach : FedSearch: Efficiently Combining Structured Queries and Full-Text Search in a SPARQL Federation by Andriy Nikolov http://link.springer.com/search?facet-author=%22Andriy+Nikolov%22, Andreas Schwarte http://link.springer.com/search?facet-author=%22Andreas+Schwarte%22, Christian Hütter http://link.springer.com/search?facet-author=%22Christian+H%C3%BCtter%22 I would like to know whether there are any other approaches.

Regards, Thilini Cooray

--
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254
paul.houle on Skype
ontolo...@gmail.com
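
The shard fan-out argument in the reply above can be sketched numerically. This is only a minimal sketch, assuming every shard independently has the same small chance of answering slowly; the function names and the 1% figure are illustrative assumptions, not measurements from Solr or from any RDF federation.

```python
def fanout_latency(shard_latencies):
    """A fan-out query cannot complete before its slowest shard answers,
    so its latency is at least the maximum shard latency."""
    return max(shard_latencies)


def prob_any_slow(n_shards, p_slow):
    """Probability that at least one of n_shards responds slowly,
    assuming (hypothetically) independent shards with equal p_slow."""
    return 1.0 - (1.0 - p_slow) ** n_shards


if __name__ == "__main__":
    # Hypothetical per-shard chance of a slow response: 1%.
    for n in (1, 10, 50, 100):
        print(n, round(prob_any_slow(n, 0.01), 3))
```

Under these assumptions the chance of hitting at least one slow shard grows quickly with N, which is the effect described for interactive queries. Real shards are rarely independent, so the numbers should be read as a trend rather than a prediction.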
Re: Available approaches for keyword based querying RDF federations
Thank you, Paul.

On Wed, Aug 13, 2014 at 8:19 PM, Paul Houle ontolo...@gmail.com wrote:

I would tend to stick up for a non-federated approach [...]
(groan, not again): OGC Temporal DWG. Was: space and time
Hi Chris,

FWIW. While commerce depends upon Product Release Dates and Versioning, geographic information can default to the Julian Calendar harmonics. Not having to deal with this administrative detail is a real, pardon the expression, time saver. So, I did the math and made a spreadsheet (FODS or EXCEL), a prototype generic version Release Clock. This calendar is not in harmony with astronomical calculations, which use the Winter Solstice as an anchor rather than New Year's. I apologize for this shameless attempt to curry favour with Champagne Manufacturers ;-)

The calendar is by year, with anchors at New Year's and New Year+1. There are three arc control points. These are arbitrary but can be recognized holidays - and the key word is recognized. For example, I used Easter ~ Passover ~ Jerusalem Tourist Season. The control point labels (identity) have no Controlling Authority, that is, they have no effect on the graph (timeline).

http://www.rustprivacy.org/2014/balance/gts/utct.zip

Overseas Banks and Government Mint Printing Presses work overnight. Human Resources and a Retail Store's safe in the back room do not. Neither do many Cultural Heritage resources. Work-Life Balance gets, um, unbalanced.

--Gannon

On Tue, 7/29/14, Gannon Dick gannon_d...@yahoo.com wrote:

Subject: RE: OGC Temporal DWG. Was: space and time
Date: Tuesday, July 29, 2014, 12:45 PM

On Tue, 7/29/14, Little, Chris chris.lit...@metoffice.gov.uk wrote:

And I agree that transparency about calendar algorithms is an issue, not just in their book. This is one thing that I hope that an OGC Best Practice document could help, in however a small way.

Hi Chris,

Maybe it is time to go big - Universal Coordinated Calendar Time (UTCT). In the near term, (this Julian Century) the Calendar has no unidentified shifts. We know about Leap Days and the Calendar is ignorant of Leap Seconds. So, it is possible.
This presents a problem for Linked Data because even though Personal Identity is coupled to Occupation and Occupation is coupled to the Location of the Workplace, these are couplings, not correlations. Mid-day, Noon, is a mean value, but one can't assume regression to the mean. At the Equator the Authority - Solar Noon - has a whopping 7 1/2 minute time shift. This is not hidden, but it is overwhelmed by the Equation of Time. The shifts, on a day-to-day basis, do not accumulate to significance on a year-to-year basis. To determine coupling constants is a fool's errand.

e.g. http://www.rustprivacy.org/2014/balance/utct.jpg

When people triangulate in their heads they use 3,4,5 triangles to keep the math easy. For this reason, the Axis length is 500%. All shifts (events which impact Work Life Balance) are vertical. Sorry, the Day indicator can't update automatically - it's a PDF.

WDYT?

Best,
--Gannon (J.) Dick ;-) I'm not a commuter, I have a funny name.

-----Original Message-----
From: Gannon Dick [mailto:gannon_d...@yahoo.com]
Sent: Thursday, July 24, 2014 5:24 PM
To: andrea.per...@jrc.ec.europa.eu; frans.kni...@geodan.nl; simon@csiro.au; Chris Beer; Little, Chris
Cc: public-loc...@w3.org; public-egov...@w3.org; public-lod; tempo...@lists.opengeospatial.org; Piero Campalani; Matthias Müller
Subject: Re: OGC Temporal DWG. Was: space and time

Hi Chris, who wrote:

One concern that I have is that we do not re-invent the wheel, and do nugatory work, hence this email. I do not envisage that we will need to do much with Calendars, which have been covered so well by Dershowitz and Reingold.

No question the quality of the issue coverage (Calendars) is first rate. However, the computations are not transparently self-evident and the references you cite in the Wiki are not available on-line - or are they?

3. Calendrical Tabulations 1900-2200, Edward M. Reingold, Nachum Dershowitz. Hardcover: 636 pages.
Publisher: Cambridge University Press (16 Sep 2002) Language: English ISBN-10: 0521782538 ISBN-13: 978-0521782531

4. Calendrical Calculations, Nachum Dershowitz, Edward M. Reingold. Paperback: 512 pages. Publisher: Cambridge University Press; 3 edition (10 Dec 2007) Language: English ISBN-10: 0521702380 ISBN-13: 978-0521702386

Accessibility to Wheels known to have been invented is a Wiki issue, I think.

--Gannon

On Thu, 7/24/14, Little, Chris chris.lit...@metoffice.gov.uk wrote:

Subject: OGC Temporal DWG. Was: space and time
To: Gannon Dick gannon_d...@yahoo.com, andrea.per...@jrc.ec.europa.eu andrea.per...@jrc.ec.europa.eu, frans.kni...@geodan.nl frans.kni...@geodan.nl, simon@csiro.au simon@csiro.au, Chris Beer