Re: When querying ElasticSearch, score is 0

2015-04-18 Thread Andrejs Abele
Thank you for the information.
Cheers,
Andrejs



Re: When querying ElasticSearch, score is 0

2015-04-18 Thread Nick Pentreath
ES-hadoop uses a scan & scroll search to retrieve large result sets
efficiently. A scan does not track scores and does not support sorting,
hence the 0 scores.

http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
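
If scores are needed for the top few hits, one option is to run the query as a regular (non-scan) search through the REST interface, which the original post reports does return scores. Below is a minimal sketch, assuming Elasticsearch is reachable over HTTP at a host such as localhost:9200. The names ScoredSearch, searchUrl, and search are hypothetical and not part of ES-hadoop.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets
import scala.io.Source

object ScoredSearch {
  // Builds the _search URL for an index/type resource such as "dbpedia/docs".
  // The host value (e.g. "localhost:9200") is an assumption for illustration.
  def searchUrl(host: String, resource: String, size: Int): String =
    s"http://$host/$resource/_search?size=$size"

  // Sends the query body as a regular (non-scan) search and returns the raw
  // JSON response as a string; parsing the hits is left to the caller.
  def search(host: String, resource: String, queryJson: String, size: Int = 10): String = {
    val conn = new URL(searchUrl(host, resource, size))
      .openConnection()
      .asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setRequestProperty("Content-Type", "application/json")
      val out = conn.getOutputStream
      try out.write(queryJson.getBytes(StandardCharsets.UTF_8)) finally out.close()
      val src = Source.fromInputStream(conn.getInputStream, "UTF-8")
      try src.mkString finally src.close()
    } finally conn.disconnect()
  }
}
```

Calling ScoredSearch.search("localhost:9200", "dbpedia/docs", queryJson) with the same query body should return hits that carry their _score. This only suits small result sets. For bulk reads, the scan-based esRDD remains the efficient path.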

When querying ElasticSearch, score is 0

2015-04-16 Thread Andrejs Abele
Hi,
I have data in my ElasticSearch server. When I query it through the REST
interface, I get results with a score for each result. When I run the
same query in Spark using the ElasticSearch API, I get results and
metadata, but the score is shown as 0 for every record.
My configuration is:

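...
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._  // adds esRDD to SparkContext

val conf = new SparkConf()
  .setMaster("local[6]")
  .setAppName("DBpedia to ElasticSearch")
  .set("es.index.auto.create", "true")
  .set("es.field.read.empty.as.null", "true")
  .set("es.read.metadata", "true")  // include _index, _type, _id, _score under _metadata

...
val sc = new SparkContext(conf)
// Query body kept on one line; the archive had wrapped the string literal
val test = Map("query" -> "{\n\"query\":{\n \"fuzzy_like_this\" : {\n \"fields\" : [\"label\"],\n \"like_text\" : \"102nd Ohio Infantry\" }\n  } \n}")
val mYRDD = sc.esRDD("dbpedia/docs", test("query"))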

Sample output:
Map(id -> "http://dbpedia.org/resource/Alert,_Ohio", label -> "Alert, Ohio", 
category -> "Unincorporated communities in Ohio", abstract -> "Alert is an 
unincorporated community in southern Morgan Township, Butler County, Ohio, in 
the United States. It is located about ten miles southwest of Hamilton on 
Howards Creek, a tributary of the Great Miami River in section 28 of R1ET3N of 
the Congress Lands. It is three miles west of Shandon and two miles south of 
Okeana.", _metadata -> Map(_index -> dbpedia, _type -> docs, _id -> 
AUy5aQs7895C6HE5GmG4, _score -> 0.0))

As you can see, _score is 0.
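
To check this across records rather than by eye, the score can be pulled out of the nested _metadata map. Below is a minimal sketch, assuming each record is a plain Scala Map shaped like the sample output above. ScoreCheck, Doc, and score are hypothetical names used only for illustration.

```scala
object ScoreCheck {
  // One record shaped like the sample output: document fields plus a
  // nested _metadata map (present when es.read.metadata is "true").
  type Doc = scala.collection.Map[String, Any]

  // Reads _score from the nested _metadata map, if present and numeric.
  def score(doc: Doc): Option[Double] =
    doc.get("_metadata")
      .collect { case m: scala.collection.Map[_, _] => m.asInstanceOf[Doc] }
      .flatMap(_.get("_score"))
      .collect { case n: Number => n.doubleValue }

  def main(args: Array[String]): Unit = {
    // Trimmed copy of the sample record from this post
    val sample: Doc = Map(
      "label" -> "Alert, Ohio",
      "_metadata" -> Map(
        "_index" -> "dbpedia",
        "_type" -> "docs",
        "_id" -> "AUy5aQs7895C6HE5GmG4",
        "_score" -> 0.0
      )
    )
    println(score(sample))
  }
}
```

Mapping a function like this over the records and counting the non-zero results would show whether any score at all comes back from the Spark read.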

Would appreciate any help,

Cheers,
Andrejs