Hello Jo, thanks!


I'm really happy to see such vibrant support and community for Spotlight :)



"


For concepts like "leadership", the model will likely not spot it with the 
default confidence (0.5), but if you want more concepts like this (at the 
expense of possibly getting more irrelevant spots), you can spot with a 
lower confidence value:




http://spotlight.sztaki.hu:2222/rest/annotate?text=Leadership+is+a+skill&confidence=0.1


"



I know, I tried that. As I wrote in my last message, trouble is I cannot 
replicate this result locally, on localhost, using the same "2+2" model (I 
think). As my CURL there shows, I'm actually setting confidence to 0.
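
For anyone who wants to reproduce the comparison, here is a minimal Python 
sketch that builds the same annotate request for the public server and for a 
local one. The local base URL (http://localhost:2222/rest) and the helper 
names are assumptions for illustration; adjust them to wherever your server 
listens. Asking for JSON via the Accept header is also an assumption about 
what the server will honor.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

REMOTE = "http://spotlight.sztaki.hu:2222/rest"  # server from the quoted URL
LOCAL = "http://localhost:2222/rest"             # assumption: local host/port


def annotate_url(base, text, confidence):
    """Build an /annotate URL with an explicit text and confidence."""
    query = urlencode({"text": text, "confidence": confidence})
    return f"{base}/annotate?{query}"


def fetch(url, timeout=10):
    """GET the URL and return the response body as text."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    for base in (REMOTE, LOCAL):
        url = annotate_url(base, "Leadership is a skill", 0.1)
        print(url)
        # Uncomment to query the server and compare the two responses:
        # print(fetch(url))
```

Sending the identical text and confidence to both servers means any 
difference in the output comes from the model or server version rather than 
from the request.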




Any idea what I'm doing wrong?




Cheers,

Radim







 
"



 




As for Lucene, I do not recommend using it. For specific data sets, Lucene 
still seems better than the statistical approach, but the latter is much 
faster, simpler and performs well on a lot of test sets (and there are 
models for languages other than English).




Jo








On Sat, Jun 7, 2014 at 12:49 PM, David Przybilla <[email protected]> wrote:
"
1. Someone should confirm this, but I think those are different model 
versions in which some things have been pruned. For example, there are 
surface forms which are not spottable because of their low probabilities. 
They might have been pruned to save memory. I use en_2+2.

2. I've never used the Lucene version. But here are a few reasons to use the
statistical version: (i) the latest version uses around 9G to 10G of heap; 
(ii) tools like idio's model editor work with the statistical version; 
(iii) there is active development on it.

3. As far as I know, there are no tools for that. I guess the Spotlight 
team provides a vanilla model which you have to tweak for your specific 
needs. Idio's editor will do half of the job in this case.

4. The topic and the surface forms for "Leadership" are probably in the 
model. Most likely it gets filtered because its surface form does not have 
enough probability. 
I've been using Spotlight 0.6 and 0.7 (both statistical models), and I can 
tell Spotlight is good at getting specific topics, e.g. Barack Obama or 
Europe, as opposed to more general ones such as President, Food or Table.
One way to find out at which level the problem with "Leadership" occurs is 
to use the "Candidates" and "Spot" endpoints:



e.g.:

http://spotlight.sztaki.hu:2222/rest/spot?text=Leadership%20is%20a%20skill

- spot will tell you the candidate surface forms found in the text
- candidates will show you the list of candidate topics for the spotted 
surface forms.
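
A small Python sketch of that stage-by-stage check: it builds one URL per 
endpoint for the same text, so you can open them in turn and see where 
"Leadership" drops out. The helper name is hypothetical, and the lowercase 
/rest/candidates path is an assumption modelled on the /rest/spot URL above.

```python
from urllib.parse import quote

BASE = "http://spotlight.sztaki.hu:2222/rest"  # server used in this thread
ENDPOINTS = ("spot", "candidates", "annotate")  # assumed endpoint paths


def diagnostic_urls(text, base=BASE):
    """Return one URL per pipeline stage for the same input text."""
    encoded = quote(text)  # percent-encodes spaces as %20
    return {name: f"{base}/{name}?text={encoded}" for name in ENDPOINTS}


if __name__ == "__main__":
    for name, url in diagnostic_urls("Leadership is a skill").items():
        print(f"{name:10s} {url}")
```

If the surface form is missing already in the spot output, the problem is 
in spotting; if it shows up there but has no candidates, or is absent only 
from annotate, the filtering happens later.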









On Sat, Jun 7, 2014 at 11:31 AM, Radim Rehurek <[email protected]> wrote:
" 
And one more concrete question:



Why are some DBPedia concepts missing from spotlight results?




For example, "Leadership" is under http://dbpedia.org/page/Leadership , but 
is detected with neither the en2+2 nor the en4+8 statistical model. It looks 
like it doesn't even spot the surface form, which is strange.




Using the (Lucene?) demo at http://dbpedia-spotlight.github.io/demo/ , 
"Leadership" is detected without problems.





Cheers,

Radim



---------- Original message ----------
From: Radim Rehurek <[email protected]>
To: [email protected]
Date: 7. 6. 2014 11:59:40
Subject: [Dbp-spotlight-users] spotlight concepts

"


Hello all,


I am a new spotlight user -- thanks for the awesome project!





I'd appreciate help with some basic concepts:




1. I read the "statistical" paper [0], and ran the "statistical" backend, 
but I'm not clear on which parameters it uses. What is the difference 
between en2+2 or en4+8, does it use the language-dependent or independent 
version? How do I switch to the language-dependent one?




2. I tried Lucene spotlight 0.5, and it runs out of heap space even with the
smallest spotter dict (threshold 75), with Xmx 22GB, debian with openjdk. Is
that normal or am I doing it wrong? I followed the installation instructions
at [1]. Is there any advantage to the Lucene implementation anyway? When 
should I use Lucene vs. statistical?




3. Is there a list of tools like idio's "spotlight editor" [2]? In 
particular I'm interested in ways to manually add new "things" to the DB, or
manually tweak detections that are wrong. Is there any support out of the 
box? Are there other ways, apart from that idio's editor?




Thank you again for the great tool,

Radim




[0] http://jodaiber.de/doc/entity.pdf

[1] https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Installation

[2] https://github.com/idio/spotlight-model-editor






------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their 
applications. Written by three acclaimed leaders in the field, 
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users"



"






"



