Github user osma commented on the issue:

    https://github.com/apache/jena/pull/227
  
    I tested the ES backend with some non-toy SKOS data, namely 
[YSO](http://finto.fi/en/yso/). I configured the entity definition to index the 
predicates `skos:prefLabel`, `skos:altLabel` and `skos:hiddenLabel`. The 
dataset has 520k triples and 29k entities. In total, 150k triples use these 
label properties.
    
    I'm using a rather old laptop (i3-2330M with SSD) for the test, running 
Ubuntu 16.04 and ES 5.2.1.
    
    Using the ES backend, indexing this dataset took about 25 minutes:
    ```
    16:42:45 INFO  [1] PUT http://localhost:3030/ds/data?default
    17:08:06 INFO  [1] 204 No Content (1 521,465 s) 
    ```
    
    Looking at process stats, most of the time was spent by ES, which used 
about 38 minutes of CPU time.
    
    I also indexed the same dataset using the Lucene backend. It took less than 
30 seconds:
    ```
    17:11:26 INFO  [1] PUT http://localhost:3030/ds/data?default
    17:11:55 INFO  [1] 204 No Content (28,237 s) 
    ```
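    
    As a rough sanity check on the two log excerpts, the sketch below turns 
the timings into throughput and a slowdown ratio. The names are made up for 
illustration, and the timings are the wall-clock durations of the whole PUT 
request, so they include loading the triples as well as indexing the labels. 
Under those assumptions the ES run comes out on the order of 50 times slower:

```python
# Back-of-the-envelope comparison of the two upload timings above.
# All names here are hypothetical; the numbers come from the log lines
# and the dataset description in this comment.

LABEL_TRIPLES = 150_000    # approx. triples with the indexed label properties
ES_SECONDS = 1521.465      # "1 521,465 s" in the ES log line
LUCENE_SECONDS = 28.237    # "28,237 s" in the Lucene log line


def throughput(triples: int, seconds: float) -> float:
    """Return the number of triples handled per second of wall-clock time."""
    return triples / seconds


def slowdown(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times longer the slow run took than the fast run."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    es_rate = throughput(LABEL_TRIPLES, ES_SECONDS)
    lucene_rate = throughput(LABEL_TRIPLES, LUCENE_SECONDS)
    ratio = slowdown(ES_SECONDS, LUCENE_SECONDS)
    print(f"ES:     {es_rate:.0f} label triples/s")
    print(f"Lucene: {lucene_rate:.0f} label triples/s")
    print(f"ES took {ratio:.1f}x as long as Lucene")
```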
    
    Query performance seems to be pretty much the same. In fact, the ES 
backend seems slightly faster than the Lucene backend, but there was a lot of 
variance, so I can't tell for sure.
    
    I have my doubts about whether the indexing performance is acceptable for 
real-world use cases like the one @anujgandharv is targeting, but I don't think 
this should stop us from merging this contribution. Since there have been no 
objections, I will proceed with the merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
