ES-Hadoop uses a scan & scroll search to retrieve large result sets
efficiently. A scan does not track scores and does not support sorting,
which is why every score comes back as 0.

http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
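
To make the difference concrete, here is a rough sketch of the two request shapes involved. It is not ES-Hadoop's actual code: the object and method names are made up for illustration, and the "5m" keep-alive is just an assumed value. It only builds the search paths, assuming the Elasticsearch 1.x scan search type described at the link above.

```scala
// Hypothetical helper, not part of ES-Hadoop: builds the search path for an
// index/type resource, either as a scan & scroll request or a regular search.
object ScanVsScoredSearch {

  // Scan & scroll (Elasticsearch 1.x): hits come back unsorted and unscored,
  // so _score is reported as 0.0. The keep-alive default is an assumption.
  def scanPath(resource: String, keepAlive: String = "5m"): String =
    s"/$resource/_search?search_type=scan&scroll=$keepAlive"

  // Regular search: hits are scored and sorted by relevance.
  def scoredPath(resource: String): String =
    s"/$resource/_search"

  def main(args: Array[String]): Unit = {
    println(scanPath("dbpedia/docs"))
    println(scoredPath("dbpedia/docs"))
  }
}
```

If you need the relevance scores themselves, sending the same query body to the second kind of path through the REST interface (as you already do) is one way to get them, at the cost of not streaming the full result set the way a scan does.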


On Thu, Apr 16, 2015 at 10:46 PM, Andrejs Abele
<andrejs.ab...@insight-centre.org> wrote:

> Hi,
> I have data in my Elasticsearch server. When I query it using the REST
> interface, I get results and a score for each result, but when I run the
> same query in Spark using the Elasticsearch API, I get results and
> metadata, but the score is shown as 0 for each record.
> My configuration is
> ...
> val conf = new SparkConf()
>   .setMaster("local[6]")
>   .setAppName("DBpedia to ElasticSearch")
>   .set("es.index.auto.create", "true")
>   .set("es.field.read.empty.as.null","true")
>   .set("es.read.metadata","true")
> ...
> val sc = new SparkContext(conf) 
> val test = Map("query" -> "{\n\"query\":{\n \"fuzzy_like_this\" : {\n \"fields\" : [\"label\"],\n \"like_text\" : \"102nd Ohio Infantry\" }\n  } \n}")
> val mYRDD = sc.esRDD("dbpedia/docs", test("query"))
> Sample output:
> Map(id -> "http://dbpedia.org/resource/Alert,_Ohio", label -> "Alert, Ohio", 
> category -> "Unincorporated communities in Ohio", abstract -> "Alert is an 
> unincorporated community in southern Morgan Township, Butler County, Ohio, in 
> the United States. It is located about ten miles southwest of Hamilton on 
> Howards Creek, a tributary of the Great Miami River in section 28 of R1ET3N 
> of the Congress Lands. It is three miles west of Shandon and two miles south 
> of Okeana.", _metadata -> Map(_index -> dbpedia, _type -> docs, _id -> 
> AUy5aQs7895C6HE5GmG4, _score -> 0.0))
> As you can see _score is 0.
> Would appreciate any help,
> Cheers,
> Andrejs 
