Re: [Virtuoso-users] Making REGEX filters fast

2016-02-22 Thread Hugh Williams
Hi Daniel,

I would not expect STRSTARTS to be any faster than REGEX, as neither is 
indexed. The Virtuoso bif:contains function, which is available in both the 
open source and commercial editions, is backed by a Full Text index, which 
makes it much faster for string searches. So you can use something like:

filter(bif:contains(?name1, '"Insurance, Health/utilization"'))
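
As a rough sketch of how that filter might sit inside a complete query: the 
rdfs:label predicate, the variable names and the LIMIT are assumptions made 
for illustration only, and the query presumes the Full Text index has been 
enabled for the literals being searched.

```sparql
# Sketch only: rdfs:label is an assumed predicate; substitute whichever
# predicate actually carries the label / name values in your data.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?s ?label
WHERE {
  ?s rdfs:label ?label .
  # Phrase search answered from the Full Text index. The inner double
  # quotes group the words into a single phrase.
  FILTER (bif:contains(?label, '"Insurance, Health/utilization"'))
}
LIMIT 100
```

Since bif:contains matches indexed words rather than arbitrary patterns, the 
results may not be identical to those of the original REGEX filter.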

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.  //  http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers



> On 22 Feb 2016, at 22:36, Davis, Daniel (NIH/NLM) [C] wrote:
> 
> Is there a way to make FILTER(REGEX(?label, 'Insurance, Health/utilization', 
> 'i')) queries fast?
>  
> I can see that REGEX is going to require a linear scan of the data, and that 
> there are stored functions that do the matching.
>  
> Will STRSTARTS be able to work quickly?   Does the database collation allow 
> case insensitive matching with STRSTARTS?
>  
> Does one require a license for Virtuoso to use Full-text bif:contains (which 
> is not quite the same as REGEX, of course)?
>  
> Dan Davis, Systems/Applications Architect (Contractor),
> Office of Computer and Communications Systems,
> National Library of Medicine, NIH
>  


Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users


Re: [Virtuoso-users] infrequent errors on parallel querying

2016-02-22 Thread Hugh Williams
Hi Andreas,

I missed that you were running 4 Virtuoso instances. How is the query workload 
split across the 4 instances?

I noticed the following errors in all the logs during one attempt to restart:

09:23:02 ERROR: Unable to lock file ../var/lib/virtuoso/db/virtuoso.lck 
(Resource temporarily unavailable).
09:23:02 ERROR: Virtuoso is already runnning (pid 1264)
09:23:02 ERROR: This probably means you either do not have permission to start
09:23:02 ERROR: this server, or that virtuoso-t is already running.
09:23:02 ERROR: If you are absolutely sure that this is not the case, please try
09:23:02 ERROR: to remove the file ../var/lib/virtuoso/db/virtuoso.lck and 
start again.
09:26:41 INFO: ERRS_0 01V01 QW004 :2891: 
WS.WS.SPARQL_ENDPOINT_GENERATE_FORM: Incompatible types INTEGER (189) and 
VARCHAR (182) in = for debug and 
09:34:13 DEBUG: missed delete of name id cache 
benchset/d4e1a90786ac524b9f96f9d2f0506a12 0 (0x20e975b )
09:34:16 DEBUG: missed delete of name id cache 
benchset/265b6be68f2030063c4c17e3778dbbe1 0 (0x2122444 )
09:35:50 DEBUG: missed delete of name id cache 
benchset/8d0299f2a49cf965ac7c98101f55afaa 0 (0x2181ac6 )

These did not occur on the following startup. Do you know what happened here? 
It seems there was an attempt to start an already running server, but I have 
not seen the "DEBUG: missed delete of name id cache …" messages before.

In the status_8895 output I see many occurrences of:

Pending:
  1126400: IER 141.87.4.9 
  156: IER 141.87.4.9 
  152: IER 141.87.4.9 
  148: IER 141.87.4.9 
  144: IER 141.87.4.9
  .
  .
  .

Do these remain indefinitely, or do they get cleaned up after a while?

Also, you indicated that these errors had not been seen with the 3215 build 
upgrade. Is this still the case?


Best Regards
Hugh Williams

> On 21 Feb 2016, at 13:46, Nolle, Andreas wrote:
> 
> Hi Hugh,
>  
> typical queries that are executed (each at any Virtuoso instance) are like 
> the following:
>  
> SELECT ?x ?P0src ?P1src
> WHERE {
>{
>   SERVICE > 
> {
>  ?x rdf:type  > .
>   } .
>   BIND (> 
> AS ?P0src)
>}
>UNION
>{
>   ?x rdf:type  > .
>   BIND (> 
> AS ?P0src)
>}
>UNION
>{
>   SERVICE > 
> {
>  ?x rdf:type  > .
>   } .
>   BIND (> 
> AS ?P0src)
>}
>UNION
>{
>   SERVICE > 
> {
>  ?x rdf:type  > .
>   } .
>   BIND (> 
> AS ?P0src)
>} .
>?x rdf:type  > .
>BIND (> AS 
> ?P1src)
> }
>  
> Since it is not possible to use the keyword OFFSET for queries, i.e. query 
> parts, that would return more than 1048576 results, there are also queries 
> like:
>  
> SELECT ?x ?a ?b ?P0src ?P1src
> WHERE {
>{
>   SERVICE > 
> {
>  ?x  > ?a .
>  FILTER ( ?x >= 
>  > ) .
>  FILTER ( ?x < 
>  > ) .
>   } .
>   BIND (> 
> AS ?P0src)
>}
>UNION
>{
>   SERVICE 

[Virtuoso-users] Making REGEX filters fast

2016-02-22 Thread Davis, Daniel (NIH/NLM) [C]
Is there a way to make FILTER(REGEX(?label, 'Insurance, Health/utilization', 
'i')) queries fast?

I can see that REGEX is going to require a linear scan of the data, and that 
there are stored functions that do the matching.

Will STRSTARTS be able to work quickly?   Does the database collation allow 
case insensitive matching with STRSTARTS?

Does one require a license for Virtuoso to use Full-text bif:contains (which is 
not quite the same as REGEX, of course)?

Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH
