Re: Java Client Aggregation question

2015-02-20 Thread David Pilato
Yes. Change your mapping and define the field as not_analyzed.
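
A minimal sketch of such a mapping, assuming Elasticsearch 1.x string-field
syntax. The type name "my_type" is a placeholder and does not come from the
original post:

```json
{
  "mappings": {
    "my_type": {
      "properties": {
        "industry": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

With the field stored as a single untokenized term, the existing
AggregationBuilders.terms("industry").field("industry") call should return
"Consulting & Recruitment" as one key. The index setting of an existing field
generally cannot be changed in place, so the documents would most likely need
to be reindexed into an index created with this mapping.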

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs




> On 20 Feb 2015, at 12:32, Matt Williams wrote:
> 
> Hi all,
> 
> I am currently indexing tags (industries) for an entity with a data structure 
> like this:
> 
> industry: ["Consulting & Recruitment","Professional Services","Education & Training"]
> 
> I am applying a termsAggregation to the query as:
> 
> AggregationBuilders.terms("industry").field("industry");
> 
> What I expect to come out:
> 
> Key: "Consulting & Recruitment"
> 
> docCount: 100
> 
> What I actually get:
> 
> Key: "Consulting"
> 
> docCount: 100
> 
> Key: "Recruitment"
> 
> docCount: 100.
> 
> Is there a way to correct this?
> 
> Thanks
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/e86c02ab-9699-41ea-88b3-871ed761dad6%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



Java Client Aggregation question

2015-02-20 Thread Matt Williams
Hi all,

I am currently indexing tags (industries) for an entity with a data 
structure like this:

industry: ["Consulting & Recruitment","Professional Services","Education & Training"]

I am applying a termsAggregation to the query as:

AggregationBuilders.terms("industry").field("industry");

What I expect to come out:

Key: "Consulting & Recruitment"

docCount: 100

What I actually get:

Key: "Consulting"

docCount: 100

Key: "Recruitment"

docCount: 100.

Is there a way to correct this?

Thanks



Re: Aggregation question

2014-02-21 Thread mooky
Excellent. Thanks!

On Tuesday, 18 February 2014 15:28:32 UTC, Binh Ly wrote:
>
> Yes, the correct way would be to index intentLocationDescription as a 
> multi-field. You don't have to introduce it as multiple fields in your 
> source document. All you need to do is set that field to a multi-field in 
> the ES mapping: one sub-field analyzed however you want, and the other 
> not_analyzed. You can see an example here:
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_multi_fields_3
>
> There you have 2 fields in the index derived from a single field in your 
> JSON source. The "name" field is analyzed, and the "name.raw" field is 
> not_analyzed, which is the one you want to aggregate on.
>



Re: Aggregation question

2014-02-18 Thread Binh Ly
Yes, the correct way would be to index intentLocationDescription as a 
multi-field. You don't have to introduce it as multiple fields in your 
source document. All you need to do is set that field to a multi-field in 
the ES mapping: one sub-field analyzed however you want, and the other 
not_analyzed. You can see an example here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_multi_fields_3

There you have 2 fields in the index derived from a single field in your 
JSON source. The "name" field is analyzed, and the "name.raw" field is 
not_analyzed, which is the one you want to aggregate on.
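
Applied to the field from this thread, the mapping might look like the sketch
below. It assumes Elasticsearch 1.x multi-field syntax; the type name
"my_type" is a placeholder, and the sub-field name "raw" is borrowed from the
linked documentation example rather than from the original post:

```json
{
  "mappings": {
    "my_type": {
      "properties": {
        "intentLocationDescription": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
```

Under those assumptions, searches would keep using the analyzed
intentLocationDescription field, while the terms aggregation would point at
the not_analyzed sub-field:

```json
{
  "aggregations": {
    "intentLocations": {
      "terms": { "field": "intentLocationDescription.raw" }
    }
  }
}
```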



Aggregation question

2014-02-17 Thread mooky

I am using aggregations to produce reports of item counts grouped by 
various properties (location, market, or both).

I am using a terms aggregation.

A simplified example of my data looks like this:

{
   "intentLocationCode": "SHANG",
   "intentLocationDescription": "Shanghai area",
   "intentMarketDescription": "China Stainless Steel Exchange",
   "intentMarketCode": "CSSX"
}

Let's say I want to aggregate by Intent Location.

My aggregation looks like this:
{
   "aggregations" : {
      "intentLocations" : {
         "terms" : { "field" : "intentLocationCode" }
      }
   }
}

And the result looks something like this:
{
   "aggregations": {
      "intentLocations": {
         "buckets": [
            {
               "key": "shang",
               "doc_count": 12
            },
            {
               "key": "anotherlocation",
               "doc_count": 8760
            },
            {
               "key": "loc42",
               "doc_count": 4773
            },
            {
               "key": "area51",
               "doc_count": 821
            }
         ]
      }
   }
}

However, in the results I would like something like:
{
   "aggregations": {
      "intentLocations": {
         "buckets": [
            {
               "key": "Shanghai area",
               "doc_count": 12
            },
            {
               "key": "Another Location Where Copper Is Stored",
               "doc_count": 8760
            },
            {
               "key": "The 42nd Stainless Steel Storage Company",
               "doc_count": 4773
            },
            {
               "key": "Area 51",
               "doc_count": 821
            }
         ]
      }
   }
}

i.e. I really want the value of the intentLocationDescription field as the 
key rather than the code. But obviously, a terms aggregation on the 
description is going to give me very different results (unless I index the 
description as not_analyzed). However, I do want intentLocationDescription 
analysed for decent search behaviour.

Is there a trick to achieve this with aggregations?
Or do I have to index intentLocationDescription twice (analysed and not 
analysed)?
(I don't really want to do any post-processing to match code and 
description, because that would involve a reference data lookup that would 
kill performance.)

Cheers.
