Available approaches for keyword based querying RDF federations

2014-08-13 Thread Thilini Cooray
Hi,

I would like to know what approaches are available for keyword-based
querying of RDF federations.

I found the following approach:
FedSearch: Efficiently Combining Structured Queries and Full-Text Search in
a SPARQL Federation by
Andriy Nikolov
http://link.springer.com/search?facet-author=%22Andriy+Nikolov%22,
Andreas Schwarte
http://link.springer.com/search?facet-author=%22Andreas+Schwarte%22,
Christian Hütter
http://link.springer.com/search?facet-author=%22Christian+H%C3%BCtter%22

I would like to know whether there are any other approaches.

Regards,
Thilini Cooray


Re: Available approaches for keyword based querying RDF federations

2014-08-13 Thread Paul Houle
I would tend to stick up for a non-federated approach, in the sense of
gathering a lot of federated data into a centralized knowledge base and
then querying that. This is akin to how Google or Bing does web search, by
crawling the web and forming a distributed index.

I can point to a number of reasons for this, but some major ones are:

* Many of the better IR algorithms depend on corpus-wide statistics, topic
modeling, and other methods that need a global view (or at least a good
sample of a global view).
* Even distributed search systems such as Solr (which, in contrast to
federated search, are well controlled because the machines are in the same
data center, there is a deliberate approach to dealing with failures,
etc.) are not terribly scalable, for the following reason. If you run a
query against N shards, the time it takes to complete the query is
greater than the maximum response time of any individual shard. As N gets
bigger, the probability that some glitch happens on at least one shard gets
bigger and bigger. Specifically, when N > 10 it is pretty hard to maintain
an acceptable response time for interactive use.
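
As a rough illustration of the fan-out argument above (not part of the
original message): if we assume each shard independently has some small
probability p of a slow response, the chance that at least one of N shards is
slow, and therefore that the whole query is slow, is 1 - (1 - p)^N. The
function names and the value p = 0.01 below are hypothetical, chosen only for
the sketch.

```python
def fanout_latency(shard_latencies):
    """A fan-out query must wait for every shard, so its latency is
    bounded below by the slowest shard's response time."""
    return max(shard_latencies)


def prob_any_slow(n_shards, p_slow):
    """Probability that at least one of n_shards is slow, assuming each
    shard is slow independently with probability p_slow (an assumption)."""
    if n_shards < 0:
        raise ValueError("n_shards must be non-negative")
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be between 0 and 1")
    return 1.0 - (1.0 - p_slow) ** n_shards


if __name__ == "__main__":
    # Hypothetical per-shard glitch rate of 1%.
    for n in (1, 10, 50, 100):
        print(f"N={n:3d}  P(any shard slow) = {prob_any_slow(n, 0.01):.3f}")
```

Under this independence assumption a 1% per-shard glitch rate already means
roughly one query in ten is affected at N = 10, which fits the rule of thumb
above.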

I'd say that, in practice, centralized search engines like Google and Bing
have won the internet search war. For various reasons, meta-search, deep web
search and similar services haven't really caught on.






-- 
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254paul.houle on Skype   ontolo...@gmail.com


Re: Available approaches for keyword based querying RDF federations

2014-08-13 Thread Thilini Cooray
Thank you, Paul.





(groan, not again): OGC Temporal DWG. Was: space and time

2014-08-13 Thread Gannon Dick
Hi Chris,

FWIW.

While commerce depends upon Product Release Dates and Versioning, geographic 
information can default to the Julian Calendar harmonics.  Not having to deal 
with this administrative detail is a real, pardon the expression, time saver.  
So, I did the math and made a spreadsheet (FODS or EXCEL), a prototype generic 
version Release Clock. This calendar is not in harmony with astronomical 
calculations, which use the Winter Solstice as an anchor rather than New 
Year's.  I apologize for this shameless attempt to curry favour with Champagne 
Manufacturers ;-)

The calendar is by year, with anchors at New Year's and New Year+1.  There are 
three arc control points.  These are arbitrary but can be recognized holidays - 
and the key word is recognized.  For example I used Easter ~ Passover ~ 
Jerusalem Tourist Season.  The control point labels (identity) have no 
Controlling Authority, that is, they have no effect on the graph (timeline).

http://www.rustprivacy.org/2014/balance/gts/utct.zip

Overseas Banks and Government Mint Printing Presses work overnight.  Human 
Resources and a Retail Store's safe in the back room do not.  Neither do many 
Cultural Heritage resources.  Work-Life Balance gets, um, unbalanced.

--Gannon

On Tue, 7/29/14, Gannon Dick gannon_d...@yahoo.com wrote:

 Subject: RE: OGC Temporal DWG. Was: space and time

 Date: Tuesday, July 29, 2014, 12:45 PM
 
 
 
 On Tue, 7/29/14, Little, Chris chris.lit...@metoffice.gov.uk wrote:

  And I agree that transparency about calendar algorithms is an issue, not
  just in their book. This is one thing that I hope that an OGC Best Practice
  document could help, in however small a way.
  
 
 
 
 Hi Chris,

 Maybe it is time to go big - Universal Coordinated Calendar Time (UTCT).  In
 the near term (this Julian Century), the Calendar has no unidentified shifts.
 We know about Leap Days and the Calendar is ignorant of Leap Seconds.  So, it
 is possible.

 This presents a problem for Linked Data because even though Personal Identity
 is coupled to Occupation and Occupation is coupled to the Location of the
 Workplace, these are couplings not correlations.

 Mid-day, Noon, is a mean value, but one can't assume regression to the mean.
 At the Equator the Authority - Solar Noon - has a whopping 7 1/2 minute time
 shift.  This is not hidden, but it is overwhelmed by the Equation of Time.
 The shifts, on a day-to-day basis, do not accumulate to significance on a
 year-to-year basis. To determine coupling constants is a fool's errand.

 e.g. http://www.rustprivacy.org/2014/balance/utct.jpg

 When people triangulate in their heads they use 3,4,5 triangles to keep the
 math easy.  For this reason, the Axis length is 500%.  All shifts (events
 which impact Work Life Balance) are vertical. Sorry, the Day indicator can't
 update automatically - it's a PDF.

 WDYT?

 Best,

 --Gannon (J.) Dick ;-) I'm not a commuter, I have a funny name.
 
 
  
 
 -Original Message-
  From: Gannon Dick [mailto:gannon_d...@yahoo.com]
  Sent: Thursday, July 24, 2014 5:24 PM
  To: andrea.per...@jrc.ec.europa.eu; frans.kni...@geodan.nl;
      simon@csiro.au; Chris Beer; Little, Chris
  Cc: public-loc...@w3.org; public-egov...@w3.org; public-lod;
      tempo...@lists.opengeospatial.org; Piero Campalani; Matthias Müller
  Subject: Re: OGC Temporal DWG. Was: space and time
  
 
  Hi Chris,

  who wrote:
   One concern that I have is that we do not re-invent the wheel, and do
   nugatory work, hence this email. I do not envisage that we will need to
   do much with Calendars, which have been covered so well by Dershowitz and
   Reingold.

  =
  No question the quality of the issue coverage (Calendars) is first rate.

  However, the computations are not transparently self-evident and the
  references you cite in the Wiki are not available on-line - or are they?

  3. Calendrical Tabulations 1900-2200, Edward M. Reingold, Nachum
  Dershowitz. Hardcover: 636 pages. Publisher: Cambridge University Press
  (16 Sep 2002) Language: English ISBN-10: 0521782538 ISBN-13: 978-0521782531

  4. Calendrical Calculations, Nachum Dershowitz, Edward M. Reingold.
  Paperback: 512 pages. Publisher: Cambridge University Press; 3 edition
  (10 Dec 2007) Language: English ISBN-10: 0521702380 ISBN-13: 978-0521702386

  Accessibility to Wheels known to have been invented is a Wiki issue, I
  think.

  --Gannon
 
 
  
  
  
 
 
  On Thu, 7/24/14, Little, Chris chris.lit...@metoffice.gov.uk
  wrote:
  
  
 Subject: OGC
  Temporal DWG. Was: space and
 time
   To:
  Gannon
 Dick gannon_d...@yahoo.com,
  andrea.per...@jrc.ec.europa.eu
  andrea.per...@jrc.ec.europa.eu,
  frans.kni...@geodan.nl
  frans.kni...@geodan.nl,
  simon@csiro.au
  simon@csiro.au,
  Chris Beer