but still there was a Thrift object that
apparently didn't follow it. How was it created? Or was it created,
serialized to disk, somehow corrupted on-disk and then read back?
-Osma
penjdk version "11.0.18" 2023-01-17 LTS
Cheers,
Osma
[1] https://gist.github.com/osma/d61281160e84ea74e9d7dbc155ffaf69
[2] https://gist.github.com/jeffreycwitt/e7c270aae46f403845c87aa57e4b82af
[3] https://issues.apache.org/jira/browse/IMPALA-8252
Replying to myself, as I did some follow-up tests.
Osma Suominen wrote on 4.12.2020 at 18.42:
Now this turned into a rather interesting exercise in using git bisect.
I was able to track down the change that caused the slowdown. It's this
merge c
" to be "less optimized"; it
currently does nothing (very efficiently), it could consume silently the
query results.
The warmup would also be better if it wrote the required format
to /dev/null.
I see that you already did this - great work!
-Osma
-
wo years ago, so it's
been a while...
-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suomi...@helsinki.fi
http://www.nationallibrary.fi
corrected soon in the RDF data set as well when we regenerate it
in the next few days.
-Osma
[1] https://github.com/NatLibFi/Skosmos/pull/1098
[2] https://jena.apache.org/tutorials/sparql_datasets.html
ly I don't have the skills to work directly on the ARQ
optimizer or TDB2 code bases. But I'd be happy to test other variations
and potential fixes to these performance problems.
Cheers,
Osma
[1] https://finto.fi/rest/v1/finaf/data?format=text/turtle
-1834
Thanks a lot Andy, that was quick! I've checked the most recent SNAPSHOT
builds and the problem is now gone, the scripts are again executable.
Sorry for forgetting to mention that I used the .tar.gz distribution,
thankfully you and Lorenz found that out quickly.
-Osma
difficult to start a
temporary instance of Fuseki...
-Osma
because I think
Jena creates index entries as triples are added.
So the question is what is the most efficient way to re-index existing data
or do I have to re-import all data again each time I add a new field?
Thanks a lot
Best regards
hemas to get all string properties
(property's range is xsd:string) etc.
On 28/02/2018 20:16, Osma Suominen wrote:
Hi Jim!
Your observation is correct. jena-text only indexes the RDF properties
you have explicitly configured. The configuration for each property
may be different. T
ocker image as its database:
https://github.com/NatLibFi/Skosmos/blob/master/docker-compose.yml
-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suomi...@helsinki.fi
ly via the HTTP Graph Store API.
You can use the s-put tool that comes with Fuseki to access that API.
-Osma
for the move.
Why do you need to move the temporary graph? The PUT operation is atomic
- the data being loaded will only be visible to queries after the whole
operation is complete.
-Osma
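As a rough illustration of the atomic replacement described above, the following Python sketch builds a Graph Store Protocol PUT request. The endpoint URL, graph IRI and sample triple are assumptions made up for this example, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_graph_put(endpoint: str, graph_iri: str, turtle_data: bytes) -> Request:
    """Build (but do not send) a Graph Store Protocol PUT request.

    PUT replaces the whole content of the target graph with the payload.
    `endpoint` is an assumed Graph Store URL; adjust it to your own setup.
    """
    url = endpoint + "?" + urlencode({"graph": graph_iri})
    return Request(
        url,
        data=turtle_data,
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


req = build_graph_put(
    "http://localhost:3030/ds/data",   # hypothetical endpoint
    "http://example.org/graph1",       # hypothetical graph IRI
    b"<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Because the new data arrives in a single PUT, there is no separate delete step during which queries could see an empty graph.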
ut you're right, a
combination of text:query + regex or contains is very fast (see example
below).
Great that you tried this approach as well and it is fast.
-Osma
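The example referred to in the message is not part of this excerpt. A minimal sketch of the general idea, assuming a jena-text index configured on rdfs:label (an assumption, not taken from the thread), could build such a query like this: the text index narrows the candidates first, and the string FILTER then runs only on those.

```python
def text_then_filter_query(term: str) -> str:
    """Return a SPARQL query that narrows candidates with the jena-text
    index and then applies a substring FILTER to the remaining labels.

    rdfs:label as the indexed property is an assumption; use whichever
    property your text index is configured for.
    """
    if not term.isalnum():
        # Keep the sketch safe: no quoting or Lucene special characters.
        raise ValueError("term must be alphanumeric")
    needle = term.lower()
    return (
        "PREFIX text: <http://jena.apache.org/text#>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?s ?label WHERE {\n"
        f"  ?s text:query (rdfs:label '{needle}*') .\n"
        "  ?s rdfs:label ?label .\n"
        f"  FILTER(CONTAINS(LCASE(?label), \"{needle}\"))\n"
        "}"
    )


query = text_then_filter_query("Helsinki")
print(query)
```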
>
>> text:storeValues true ;
>> text:queryParser text:AnalyzingQueryParser ;
>> text:map (
>> [ text:field "title" ; text:predicate dcterms:title ;
>> text:analyzer [ a text:ConfigurableAnalyzer ;
>> te
d "familyName" ; text:predicate foaf:familyName ;
text:analyzer [ a text:ConfigurableAnalyzer ;
text:tokenizer text:KeywordTokenizer ;
text:filters (text:ASCIIFoldingFilter
text:LowerCaseFilter)
] ]
[ text:field "givenName"
8, at 5:54 AM, Laura Morales wrote:
If I have this node
:Alice :name "Alice", "Alice Smith" ;
:age 25 .
how can I return *only one* of the ":name" properties with SPARQL? For example return
("Alice", 25)
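One common way to approach this kind of question is the SPARQL 1.1 SAMPLE aggregate together with GROUP BY. The sketch below is not the answer given in the thread; the prefix IRI is a made-up placeholder, and the small Python function only mimics the effect on in-memory rows by keeping one name per group.

```python
# SPARQL 1.1 form: SAMPLE returns one (arbitrary) value per group.
# The prefix IRI is a hypothetical placeholder.
QUERY = """
PREFIX : <http://example.org/>
SELECT (SAMPLE(?name) AS ?oneName) ?age
WHERE { :Alice :name ?name ; :age ?age . }
GROUP BY ?age
"""


def sample_per_group(rows):
    """Keep one name per age group, mimicking SAMPLE + GROUP BY.

    SPARQL does not say which value SAMPLE picks; this sketch simply
    keeps the first one it sees.
    """
    picked = {}
    for age, name in rows:
        picked.setdefault(age, name)
    return [(name, age) for age, name in picked.items()]


rows = [(25, "Alice"), (25, "Alice Smith")]
result = sample_per_group(rows)
print(result)
```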
nce but elsewhere they seem to,
maybe because some file process runs on the real hardware (docker), or
maybe file locking can be interfered with.
Is vg0-root shared in any way?
Andy
On 16/04/18 15:21, Osma Suominen wrote:
Hi Andy!
Forgot to answer the VM part - yes, this is a VM ru
and I will send them to you.
-Osma
ue to the large number of tracebacks after 10:01), though the
rest are much smaller.
If you really want I can send these to you e.g. via the Funet Filesender
service, which is OK for moving around large files. You will get a
download link by e-mail.
There are no secrets here, as this is intend
e and created a new one.
But I thought I'd report the problem here in case someone else has seen
the same.
-Osma
tps://jena.apache.org/documentation/fuseki2/fuseki-run.html#fuseki-service could
clarify: "For newer systemd based Linux systems, the script 'fuseki.service' is
provided. Please adapt to your paths and settings."
Cheers, Joachim
-Original Message-
From: Osma
HS. Naturally
the chosen directory layout also affects the systemd unit file.
-Osma
for the "fuseki" init.d script in /etc/rc.local, which
works after making the latter executable. But perhaps the problem has been solved in a
better way.
Cheers, Joachim
, Data Operations
Tetherless World Constellation
Department of Computer Science
Rensselaer Polytechnic Institute
, the
prefixes need to be in the file itself, otherwise it won't even parse.
For JSON-LD the context could also be separate, but often it's inlined.
But no, the libraries don't do that.
-Osma
e argued, I think it makes sense to do
prefix handling on the client side and keep the SPARQL protocol "simple
but stupid". Then everything needed to answer a query (well, except for
the RDF data set of course) will be contained in the HTTP request.
-Osma
/documentation/query/text-query.html#uid-field-and-automatic-document-deletion
rable.
Nothing is said in these 2 pages:
https://jena.apache.org/documentation/notes/typed-literals.html
https://jena.apache.org/documentation/query/text-query.html
nfiguration.
-Osma
ially loading to TDB via tdbloader
and tdbloader2.
-Osma
rdflib in-memory store, IOMemory)
emental, it's a replacement in a single atomic
operation, so perhaps somewhat simpler than "deleting the old graph and
loading the triples of the .nt file into the graph afterwards" that you
suggested.
-Osma
y (at least not efficiently) to compare
blank nodes in two graphs.
-Osma
ate Lucene installation. First
querying documents from Lucene index, then filtering the result sets
with additional meta fields using Jena. This setup is quite complicated
so was hoping a tighter integration to Jena would make things easier.
Br,
Mikael
On 22.11.2017 22:40, Osma Suominen wrote:
supported by jena-text. What's your use
case? How would you like to use it if it existed?
-Osma
Osma Suominen wrote on 22.11.2017 at 22:37:
Hi Mikael!
Fuzzy search is a basic Lucene feature, just like prefix searches. You
should be able to use it directly via jena-text using a query
eady works right now.
-Osma
Mikael Pesonen wrote on 22.11.2017 at 15:44:
Are there any plans on implementing similar text search for Jena?
Until similarity is implemented, is it possible to query similar texts
using Lucene directly, bypassing Jena, but with the same data set?
Br,
in
Apache SVN - from CVS at SF, SVN at SF and SVN at Apache.
Andy
--
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
union default graph functionality [1]. It could be added
of course, just hasn't been.
-Osma
[1] https://github.com/rdfhdt/hdt-java/issues/3
taset into a named graph, the size doesn't change. It's
still around 5GB.
-Osma
named graphs. More than
twice the space just because I decide to put the data in a named graph
instead of the default graph? And that seems to be the case both for
TDB1 (both tdbloader and tdbloader2) and TDB2.
-Osma
sing a named graph
My larger goal is to decide whether to use TDB1 or TDB2 (or something
else, like HDT or Blazegraph...) for a new bibliographic Linked Data
service. Disk space is a factor (though not the most important one) in
the calculation.
-Osma
5:35:
Hi Osma!
I'm currently running jena with param -Xmx3600M. What I read "GC
overhead limit exceeded" relates to java garbage collection, so maybe
upping memory is not the right solution here?
Br,
On 11.9.2017 15:29, Osma Suominen wrote:
Hi Mikael,
How much memory have
quot;;)) AS ?newURI)
}
}
and get this
Error 400: GC overhead limit exceeded
Fuseki - version 3.4.0 (Build date: 2017-07-17T11:43:07+)
There are fewer than a million triples that this should affect. Is there
a solution other than using LIMIT?
Br,
> I have a large TDB-stored dataset with multiple graphs. Can I move one
> graph from one dataset to another? What would be the command using the HTTP
> protocol?
> Thank you!
> Andrew
tion here:
https://www.quora.com/Is-there-a-SPARQL-endpoint-or-triple-store-running-on-a-raspberry-pi
-Osma
:label ]
) .
What happens in this case? Are the "uri" or "label" fields from both map_graph1 and map_graph2
merged? Do I have to call them with different names like "graph1_uri" and "graph2_uri"? Or are they
distinct? Because if they are merged, it's
phs (from
the same dataset) will be stored in one Lucene index, with the graph
IRIs of individual entities (triples/quads in practice) stored in the
graphField so that queries can be efficiently restricted to a particular
graph. You cannot set different jena-text options per graph.
-Osma
-
graph.
-Osma
tand if this is only for full-text searches. Or should I use one of these
indexes every time I use one of the string functions
(https://www.w3.org/TR/sparql11-query/#func-strings) such as CONTAINS, LCASE,
etc.?
- Is "/query" the only endpoint that users need to know if they want to query
my graph?
Either that or "/sparql", in the default configuration they are defined
exactly the same way. Both accept SPARQL queries (not updates).
-Osma
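For illustration, a SPARQL Protocol query against such an endpoint can be sent as a plain HTTP GET with the query text in the `query` parameter. The sketch below only builds the URL; the host, port and dataset name are assumptions for this example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def query_url(endpoint: str, sparql: str) -> str:
    """Return the GET URL for a SPARQL Protocol query request."""
    return endpoint + "?" + urlencode({"query": sparql})


SPARQL = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
# Hypothetical endpoint; "/ds/sparql" would be used the same way.
url = query_url("http://localhost:3030/ds/query", SPARQL)
print(url)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Fetching the URL (for example with urllib.request.urlopen) would return the results in the server's default format unless an Accept header asks for another one.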
e you
wouldn't worry about namespaces and such.
-Osma
never tried.
There is little reason to use anything other than Turtle, since that's
the most convenient syntax for human beings.
-Osma
05.04.2017, 09:44, Laura Morales wrote:
Thanks a lot, that fixed the problem! I guess this is a bug?
I'd rather call it a missing feature. There is no autodetection of RDF
syntax based on file content, the code uses the filename extension to
determine it.
-Osma
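To make the behaviour concrete, here is a hypothetical sketch of extension-based syntax detection. It is not Jena's actual code; the table just lists the usual file extensions for common RDF syntaxes, and the function name is invented for this example.

```python
import os

# Conventional file extensions for common RDF syntaxes (illustrative only).
EXTENSION_TO_SYNTAX = {
    ".ttl": "Turtle",
    ".nt": "N-Triples",
    ".nq": "N-Quads",
    ".trig": "TriG",
    ".rdf": "RDF/XML",
    ".jsonld": "JSON-LD",
}


def guess_syntax(filename: str):
    """Guess the RDF syntax from the filename extension only.

    The file content is never inspected, so a Turtle file saved as
    'data.txt' is not recognised and None is returned.
    """
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_SYNTAX.get(ext)


print(guess_syntax("dump.ttl"))
print(guess_syntax("dump.txt"))
```

With a lookup like this, renaming the file to carry the right extension is enough to get it parsed with the right syntax.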
There
are many such libraries available (search for "jquery sparql" or
"javascript sparql") but I can't recommend any specific one since I
haven't really used them myself.
-Osma
<#dataset> ;
.
<#dataset> rdf:type tdb:DatasetTDB ;
tdb:location "/home/myself/fusekidb" ;
# Query timeout on this dataset (1s, 1000 milliseconds)
ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1" ] ;
# Make the de
?ysoc (COUNT(DISTINCT ?w) AS ?count) WHERE {
?w schema:about ?ysoc .
FILTER(STRSTARTS(STR(?ysoc), 'http://www.yso.fi/onto/yso/'))
}
GROUP BY ?ysoc
ORDER BY DESC(?count)
LIMIT 20
--example query--
04.04.2017, 13:29, Osma Suominen wrote:
04.04.2017, 13:10, Dave Reynolds wrote:
of the
resulting TDB is 16GB.
-Osma
, I'll play around with HDT and Jena today to get some
more insights. )
>> Jena HDT is in-memory, right?
> Is it? I thought it was an on-disk, compressed, and query-able list of
quads...
>
--
Lorenz Bühmann
AKSW group, University of Leipzig
Group: h
to a
Lucene index.
However, a new Elasticsearch implementation for jena-text is currently
being developed (see https://issues.apache.org/jira/browse/JENA-1305).
That may become an alternative for jena-text/Solr users as well.
-Osma
iple solutions
for each pattern.
-Osma
[1] http://api.finto.fi/download/yso/yso-skos.ttl
[2] https://github.com/NatLibFi/Skosmos/blob/master/model/sparql/GenericSparql.php#L404
ng for
myself, I certainly want to get it included into Jena. It's just a question
of fitting it in correctly, which might take a bit of time.
---
A. Soroka
The University of Virginia Library
On Mar 1, 2017, at 1:27 PM, Osma Suominen
wrote:
Hi Anuj!
I have nothing against modularity in
t specify values for properties that P does not
have
Claude
On Thu, Mar 2, 2017 at 7:30 AM, Osma Suominen
wrote:
Hi Claude,
Do you mean something like this?
SELECT ?r
WHERE {
{ # must have at least one of A, B, C
{ ?r :p :A }
UNION
{ ?r :p :B }
UNION
{ ?r :p :C }
}
# mus
.
Any help would be appreciated,
Claude
you get into shading and the like. We have to do that for
Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a
thing we want to do any more of than needed, I don't think.
---
A. Soroka
The University of Virginia Library
[1] http://openjdk.java.net/projects/jigsaw/
O
dependency issues by including the Lucene
libraries that we included in our ES-specific pom. Have a look at the pom of
the jena-text-es module here to see how it can be done:
https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
Thanks,
Anuj Kumar
On Wed, Mar 1, 2017 at 7:27 AM, Osma Suom
t with having a separate Module for Jena Text ES and see how
things go. If they go well, we could extract Solr and Lucene out of
Jena Text.
Again this is just a suggestion based on my limited industry experience.
Thanks,
Anuj Kumar
On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen
wrote:
28.02
liar with how to set it up, and the
jena-text instructions are pretty vague unfortunately.
-Osma
2017 at 2:22 PM, Osma Suominen
wrote:
14.02.2017, 15:15, anuj kumar wrote:
I will do it. But I need to first get the simple test working in order to
move forward. I hope someone here can help me.
Maybe you need to add an implementWith declaration to TextAssembler.java?
-Osma
Sorry I meant SPARQL 1.1 Federated Query spec:
http://www.w3.org/TR/sparql11-federated-query/
-Osma
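As a rough sketch of what a federated query looks like, the helper below wraps a triple pattern in a SERVICE clause. The remote endpoint URL is a placeholder invented for this example, not the one from the thread.

```python
def federated_select(service_endpoint: str, pattern: str = "?s ?p ?o") -> str:
    """Build a SELECT query whose pattern is evaluated at a remote endpoint.

    The SERVICE keyword comes from SPARQL 1.1 Federated Query; the local
    server forwards the enclosed pattern to `service_endpoint`.
    """
    return (
        "SELECT *\n"
        "WHERE {\n"
        f"  SERVICE <{service_endpoint}> {{\n"
        f"    {pattern} .\n"
        "  }\n"
        "}"
    )


# Hypothetical remote endpoint.
query = federated_select("http://example.org/sparql")
print(query)
```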
16.02.2017, 14:12, Osma Suominen wrote:
Hi Sandor,
You need to do a federated query. See the SPARQL 1.1 Query spec.
Something like this:
SELECT *
WHERE {
SERVICE <http://other-endpo
FROM
WHERE {GRAPH
{?s ?p ?o . }
}
but it returns an empty set:
-------------
| s | p | o |
=============
-------------
What is wrong?
Thanks in advance,
Sandor
"DRY" comment in
the code showing that somebody else has thought about it too.
Also it might be helpful to try to reuse all the Lucene unit tests for
ES as well, if you can figure out a way to do that.
-Osma
Lorenz Bühmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center
-30 files
with .dat and .idn prefix.
Could you please help me out? How can I load them into Fuseki now?
have a .nt file of size 10G and I want to upload it into the Fuseki server as
a TDB structure, not in-memory.
After running the server, if I upload it in one go I get a SessionTimesOut
error. How can I address this problem?
What is your recommendation?
Regards,
Reihan
68 -117.12865 32.56668 -117.13865) .
}
On 9 January 2017 at 09:25, Osma Suominen wrote:
Hi Samur,
Can you report this to JIRA with a reproducible full description? I.e.
link to the data set you used, your jena-spatial index configuration,
Jena/Fuseki versions you used, and of course the que
all have the same issue.
I think there is some merging happening after accessing the index that
makes it slow.
I did not look into the code to see where the issue is.
Best,
On 9 January 2017 at 09:12, Osma Suominen wrote:
Hi Samur,
I agree, that's really slow. I wonder if that's so
eki INFO [6] exec/select
[2017-01-09 09:07:34] Fuseki INFO [6] 200 OK (5.334 s)
I wonder why it is so slow if the Lucene index is so fast and the result set
contains only 17 resources.
On 9 January 2017 at 08:55, Osma Suominen wrote:
Hi Samur!
Does it help if you drop the DISTINCT?
-O
<http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?place{
?place spatial:intersectBox (32.55668 -117.12865 32.56668 -117.13865) .
}
The same query in solr/lucene takes only 20ms.
I wonder why fuseki or jena spatial is so slow.
Any clue about it?
!
java.lang.Exception: Unexpected exception,
expected but
was
at org.apache.jena.riot.tokens.TestTokenizer.tokenFirst(TestTokenizer.java:45)
at org.apache.jena.riot.tokens.TestTokenizer.tokenUnit_iri18(TestTokenizer.java:205)
The code you are compiling is behind the Apache codebase.
It looks like it could
-Osma
3.12.2016, 17:35, Samur Araujo wrote:
Is there any plan to migrate Jena/Fuseki to Lucene 5 or 6?
Is there any fork available that has already done the migration?
Sorry I misinterpreted the StackOverflow link, so please ignore the part
about the version. I'm assuming you are using a recent Fuseki version.
-Osma
29.11.2016, 10:55, Osma Suominen wrote:
Hi Abhishek,
What are the contents of the Lucene index directory (called "Lucene"
sue a year ago as in this thread
http://thread.gmane.org/gmane.comp.apache.jena.user/7892 but there are no
solutions there as well.
Can someone please help here?
Thanks & Regards
Abhishek Kumar
Hi Andy,
Sure thing! https://issues.apache.org/jira/browse/JENA-1261
-Osma
14.11.2016, 19:28, Andy Seaborne wrote:
Osma,
Could you raise a JIRA so this does not get overlooked?
Thanks
Andy
On 14/11/16 11:12, Osma Suominen wrote:
Hi,
I noticed a behavior change in the Jena 3.1.1
he file specified by --data does end up in the default graph.
I can easily work around this by not using this combination of options
(after all, --graph is the more explicit way of loading data into the
default graph), I was just surprised when a script broke because of this
change.
-Os
least get the
update to version 5 merged into Jena. At the very minimum, making a PR
against Jena would indicate (from a legal perspective) that you wish to
contribute the work to Apache Jena, so that others can make use of it.
w triples are added for the same subject, but its label is
unchanged, then the text index won't see the update and thus the count
of references/triples won't be updated either.
I may be wrong here, I'm not sure how the update tracking works.
-Osma
I'll have to implement also the callback for updates
like class TextDocProducerTriples in Jena-text.
2016-11-01 13:59 GMT+01:00 Osma Suominen :
Hi Jean-Marc,
The wildcard queries etc. are basic Lucene features, part of Lucene query
syntax, so probably that's why they are not documented
02.11.2016, 13:08, Osma Suominen wrote:
It's possible that a MINUS expression could be used instead of FILTER
NOT EXISTS and perform better. I will have to test this. But other than
switching to MINUS, I can't think of any way to express this constraint
on collections without using so
(A,C))
now if A is a complex expression, that is a bad idea (probably).
If A is a small VALUES block then it makes sense. It isn't done though.
Ok. So a potential future optimization perhaps.
-Osma
both weightings.
So, in the short term I have to figure out how to add weights to the Lucene
- Jena index.
Then I have to read what dbPedia lookup does, and other background material.
2016-10-31 16:42 GMT+01:00 Osma Suominen :
Hi Jean-Marc,
Depending on what exactly you want from such a se
thing wrong?
-Osma
On 01/11/16 11:03, Osma Suominen wrote:
Hi,
I'm investigating a performance regression we're seeing with the current
Jena 3.1.1-SNAPSHOT compared to 3.1.0.
The data in graph <http://www.yso.fi/onto/yso/> is the YSO ontology,
available from http://api.finto.
to affect query evaluation
order in this way? It appears to me that in the slow version, ?uri is
not bound inside the inner FILTER NOT EXISTS, which causes an explosion
of results internally.
-Osma
bib/reshaperdf/blob/master/src/main/java/org/gesis/reshaperdf/cmd/correct/CorrectCommand.java
On 27/10/16 13:24, james anderson wrote:
good afternoon;
On 2016-10-27, at 11:46, Osma Suominen wrote:
Hi Andy!
On 27/10/16 12:21, Andy Seaborne wrote:
Shouldn't the conversion to triples check t
here is a code snippet here
http://stackoverflow.com/questions/120180/how-to-do-query-auto-completion-suggestions-in-lucene
but a regular Lucene API may exist.
[1] https://github.com/dbpedia/lookup
[2]
https://github.com/jmvanel/semantic_forms/blob/master/doc/en/administration.md#populating-with-dbpedi
e) which can do recovery, reporting and
splitting the output between good and bad. The current parser can't
output to different places. It should be easy to register it as a
replacement for the standard one.
Okay. I will think about this. But most likely I'll just use a separate
rocess.
JSON-LD is 3rd party system : jsonld-java.
Looks to me like Jena is not checking the output from that as it creates
the Jena objects because "ParserProfileChecker" is checking for triple
problems (literals as subjects etc) and assumes its input terms are valid.