[
https://issues.apache.org/jira/browse/SOLR-9480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475025#comment-16475025
]
Hoss Man commented on SOLR-9480:
--------------------------------
Updated patch...
This includes cleanup of some test & javadoc nocommits, but the biggest change
is renaming {{skg(...)}} to {{relatedness(...)}} -- that's the best name I
could come up with.
It occurred to me I never really posted a full example of what generating an SKG
looks like with this approach of implementing relatedness as an Aggregate
function, so here's a complete request/response example using stackexchange
"scifi" data...
{noformat}
curl -sS -X POST http://localhost:8983/solr/scifi/query -d 'rows=0&q=type:QUESTION&fore=body:%22harry+potter%22&back=*:*&json.facet={
  tags : {
    type : terms,
    field : tags,
    limit : 5,
    sort : { skg: desc },
    facet : {
      skg : "relatedness($fore,$back)",
      body : {
        type : terms,
        field : body,
        limit : 5,
        sort : { skg: desc },
        facet : {
          skg : "relatedness($fore,$back)"
        }
      }
    }
  }
}'
{noformat}
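For anyone who wants to script this rather than hand-write the nested facet, here is a minimal Python sketch that builds the same parameters. The helper names ({{build_relatedness_facet}}, {{build_params}}) are hypothetical and not part of the patch; it emits strict JSON instead of the relaxed syntax used in the curl example, and it only builds the form body without sending anything.

```python
import json
from urllib.parse import urlencode


def build_relatedness_facet(fields, limit=5, key="skg"):
    """Nest one terms facet per field, each sorted by relatedness($fore,$back).

    Hypothetical helper: the facet for fields[0] is outermost, and each
    later field is nested inside the previous one.
    """
    facet = None
    for field in reversed(fields):
        level = {
            "type": "terms",
            "field": field,
            "limit": limit,
            "sort": {key: "desc"},
            "facet": {key: "relatedness($fore,$back)"},
        }
        if facet is not None:
            # Attach the already-built inner facet next to the skg stat.
            level["facet"].update(facet)
        facet = {field: level}
    return facet


def build_params(q, fore, back, fields):
    """Return the request parameters used in the curl example above."""
    return {
        "rows": 0,
        "q": q,
        "fore": fore,
        "back": back,
        "json.facet": json.dumps(build_relatedness_facet(fields)),
    }


params = build_params("type:QUESTION", 'body:"harry potter"', "*:*", ["tags", "body"])
form_body = urlencode(params)  # suitable as the POST body for /solr/scifi/query
```

Passing a longer field list would nest additional levels in the same way, with each level sorted by its own {{skg}} stat.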
{noformat}
{
"responseHeader":{
"status":0,
"QTime":4402,
"params":{
"q":"type:QUESTION",
"json.facet":"{\n tags : {\n type : terms,\n field : tags,\n limit : 5,\n sort : { skg: desc },\n facet : {\n skg : \"relatedness($fore,$back)\",\n body : {\n type : terms,\n field : body,\n limit : 5,\n sort : { skg: desc },\n facet : {\n skg : \"relatedness($fore,$back)\"\n }\n }\n }\n }\n}",
"back":"*:*",
"rows":"0",
"fore":"body:\"harry potter\""}},
"response":{"numFound":46598,"start":0,"docs":[]
},
"facets":{
"count":46598,
"tags":{
"buckets":[{
"val":"harry-potter",
"count":5141,
"skg":{
"relatedness":0.70795,
"foreground_popularity":0.01113,
"background_popularity":0.03627},
"body":{
"buckets":[{
"val":"potter",
"count":1715,
"skg":{
"relatedness":0.83699,
"foreground_popularity":0.01113,
"background_popularity":0.03555}},
{
"val":"harry",
"count":2944,
"skg":{
"relatedness":0.76488,
"foreground_popularity":0.01113,
"background_popularity":0.07392}},
{
"val":"deathly",
"count":516,
"skg":{
"relatedness":0.41314,
"foreground_popularity":0.0017,
"background_popularity":0.01308}},
{
"val":"hallows",
"count":525,
"skg":{
"relatedness":0.4125,
"foreground_popularity":0.00171,
"background_popularity":0.01333}},
{
"val":"hogwarts",
"count":1061,
"skg":{
"relatedness":0.39054,
"foreground_popularity":0.00229,
"background_popularity":0.02585}}]}},
{
"val":"jk-rowling",
"count":107,
"skg":{
"relatedness":0.23501,
"foreground_popularity":3.7E-4,
"background_popularity":7.5E-4},
"body":{
"buckets":[{
"val":"attender",
"count":1,
"skg":{
"relatedness":0.4322,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"escapers",
"count":1,
"skg":{
"relatedness":0.4322,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"l'etat",
"count":1,
"skg":{
"relatedness":0.4322,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"mugglenet's",
"count":1,
"skg":{
"relatedness":0.4322,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"pocketeded",
"count":1,
"skg":{
"relatedness":0.4322,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}}]}},
{
"val":"the-cursed-child",
"count":60,
"skg":{
"relatedness":0.23294,
"foreground_popularity":2.7E-4,
"background_popularity":4.2E-4},
"body":{
"buckets":[{
"val":"cursed",
"count":45,
"skg":{
"relatedness":0.6238,
"foreground_popularity":2.6E-4,
"background_popularity":0.00459}},
{
"val":"delphi",
"count":10,
"skg":{
"relatedness":0.50766,
"foreground_popularity":5.0E-5,
"background_popularity":2.9E-4}},
{
"val":"scorpius",
"count":14,
"skg":{
"relatedness":0.48154,
"foreground_popularity":7.0E-5,
"background_popularity":6.9E-4}},
{
"val":"neutralising",
"count":1,
"skg":{
"relatedness":0.479,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"noselessness",
"count":1,
"skg":{
"relatedness":0.479,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}}]}},
{
"val":"voldemort",
"count":460,
"skg":{
"relatedness":0.21765,
"foreground_popularity":7.6E-4,
"background_popularity":0.00324},
"body":{
"buckets":[{
"val":"potter",
"count":118,
"skg":{
"relatedness":0.44277,
"foreground_popularity":7.6E-4,
"background_popularity":0.03555}},
{
"val":"voldemort",
"count":384,
"skg":{
"relatedness":0.42619,
"foreground_popularity":6.7E-4,
"background_popularity":0.03074}},
{
"val":"harry",
"count":278,
"skg":{
"relatedness":0.33236,
"foreground_popularity":7.6E-4,
"background_popularity":0.07392}},
{
"val":"948",
"count":1,
"skg":{
"relatedness":0.32771,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"chernyshov",
"count":1,
"skg":{
"relatedness":0.32771,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}}]}},
{
"val":"spells",
"count":175,
"skg":{
"relatedness":0.19104,
"foreground_popularity":4.0E-4,
"background_popularity":0.00123},
"body":{
"buckets":[{
"val":"bitingly",
"count":1,
"skg":{
"relatedness":0.42157,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"centrari",
"count":1,
"skg":{
"relatedness":0.42157,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"counterspelling",
"count":1,
"skg":{
"relatedness":0.42157,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"effectivly",
"count":1,
"skg":{
"relatedness":0.42157,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}},
{
"val":"expelliarus",
"count":1,
"skg":{
"relatedness":0.42157,
"foreground_popularity":1.0E-5,
"background_popularity":1.0E-5}}]}}]}}}
{noformat}
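To read the response as a graph, the nested buckets can be flattened into (parent, term, relatedness) edges. The following is a rough Python sketch, assuming the stat key is {{skg}} as in the request above; {{iter_edges}} is a hypothetical name, and the sample data is a trimmed copy of the first bucket from the response.

```python
def iter_edges(facets, parent=None, key="skg"):
    """Yield (parent_val, val, relatedness) for every bucket, depth-first."""
    for sub in facets.values():
        # Only nested terms facets carry "buckets"; this skips "count",
        # "val", and the skg stats object itself.
        if not isinstance(sub, dict) or "buckets" not in sub:
            continue
        for bucket in sub["buckets"]:
            yield parent, bucket["val"], bucket[key]["relatedness"]
            # Recurse into any facets nested under this bucket.
            yield from iter_edges(bucket, parent=bucket["val"], key=key)


# Trimmed copy of the "facets" section of the response above.
sample_facets = {
    "count": 46598,
    "tags": {
        "buckets": [{
            "val": "harry-potter",
            "count": 5141,
            "skg": {"relatedness": 0.70795},
            "body": {
                "buckets": [
                    {"val": "potter", "count": 1715, "skg": {"relatedness": 0.83699}},
                    {"val": "harry", "count": 2944, "skg": {"relatedness": 0.76488}},
                ]
            },
        }]
    },
}

edges = list(iter_edges(sample_facets))
for parent, val, score in edges:
    print(parent, "->", val, score)
```

Top-level buckets come out with a parent of {{None}}, so they can be treated as the roots of the traversal.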
Now that the randomized tests seem really reliable, I'll work on refactoring
the Slot collection vs distributed Merging to reduce code duplication ... but
in general I think this is getting really close to being committable.
> Graph Traversal for Significantly Related Terms (Semantic Knowledge Graph)
> --------------------------------------------------------------------------
>
> Key: SOLR-9480
> URL: https://issues.apache.org/jira/browse/SOLR-9480
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Trey Grainger
> Priority: Major
> Attachments: SOLR-9480.patch, SOLR-9480.patch, SOLR-9480.patch,
> SOLR-9480.patch
>
>
> This issue is to track the contribution of the Semantic Knowledge Graph Solr
> Plugin (request handler), which exposes a graph-like interface for
> discovering and traversing significant relationships between entities within
> an inverted index.
> This data model has been described in the following research paper: [The
> Semantic Knowledge Graph: A compact, auto-generated model for real-time
> traversal and ranking of any relationship within a
> domain|https://arxiv.org/abs/1609.00464], as well as in presentations I gave
> in October 2015 at [Lucene/Solr
> Revolution|http://www.slideshare.net/treygrainger/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine]
> and November 2015 at the [Bay Area Search
> Meetup|http://www.treygrainger.com/posts/presentations/searching-on-intent-knowledge-graphs-personalization-and-contextual-disambiguation/].
> The source code for this project is currently available at
> [https://github.com/careerbuilder/semantic-knowledge-graph], and the folks at
> CareerBuilder (where this was built) have given me the go-ahead to now
> contribute this back to the Apache Solr Project, as well.
> Check out the Github repository, research paper, or presentations for a more
> detailed description of this contribution. Initial patch coming soon.