Discrete value aggregations on a URL field

2014-09-12 Thread Ali Kheyrollahi
Hi,

I am trying to count documents per distinct URL for a given day, and the 
result is not what I expect.
So let's say I have an index that contains documents like this:

{
   date: ...,
   url: ,
   other...
}

And basically I am trying to group by url for a particular date:

{
  "query": {
    "range": {"date": {"gte": "2014-09-08", "lte": "2014-09-09"}}
  },
  "aggregations": {
    "mt_agg": {
      "terms": {"field": "url"}
    }
  }
}

The result is bizarre: it breaks each URL into its segments and aggregates 
on those. Do I need to use a hash of the URL (I would prefer not to)? Here 
is the result:

"aggregations": {
  "shabash": {
    "buckets": [
      {
        "key": "http",
        "doc_count": 903
      },
      {
        "key": "rss",
        "doc_count": 638
      },
      {
        "key": "service",
        "doc_count": 381
      },
      {
        "key": "zzz.fff",
        "doc_count": 337
      },
      {
        "key": "e",
        "doc_count": 153
      },
      {
        "key": "xxx.com",
        "doc_count": 153
      },
      {
        "key": "www.yyy",
        "doc_count": 153
      },
      {
        "key": "fa",
        "doc_count": 127
      },
      {
        "key": "feed",
        "doc_count": 119
      },
      {
        "key": "www.nnn.com",
        "doc_count": 71
      }
    ]
  }
}


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ac784f35-d8ee-4fe5-979f-de1ca7446da0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Discrete value aggregations on a URL field

2014-09-12 Thread Ali Kheyrollahi
OK, it seems that I need to use not_analyzed on the field. Is that correct?
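
To illustrate why the buckets in the result are URL fragments, here is a minimal Python sketch. It assumes the url field is analyzed, so each value is split into lowercase tokens before indexing, and the terms aggregation counts documents per token. The tokenizer below is only a rough stand-in for the real analyzer, and the sample URLs are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample values; the thread does not show real URLs.
URLS = [
    "http://www.xxx.com/rss",
    "http://www.xxx.com/rss",
    "http://www.nnn.com/feed",
]


def analyzed_terms(value):
    """Rough approximation of an analyzed string field: lowercase the value,
    then split on anything that is not a letter, digit, or dot."""
    pieces = re.split(r"[^a-z0-9.]+", value.lower())
    return [p.strip(".") for p in pieces if p.strip(".")]


def terms_buckets(values, analyzed):
    """Count documents per indexed term, as a terms aggregation would."""
    counts = Counter()
    for value in values:
        # A document counts once per distinct term it contains.
        terms = set(analyzed_terms(value)) if analyzed else {value}
        counts.update(terms)
    return counts


analyzed = terms_buckets(URLS, analyzed=True)
not_analyzed = terms_buckets(URLS, analyzed=False)

print(analyzed.most_common())
print(not_analyzed.most_common())
```

With the analyzed variant the buckets are fragments such as http and rss; with the whole value kept as a single term, each bucket is a complete URL.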



Re: Discrete value aggregations on a URL field

2014-09-12 Thread Magnus Bäck
On Friday, September 12, 2014 at 09:23 CEST,
 Ali Kheyrollahi alios...@gmail.com wrote:

On Friday, 12 September 2014 08:18:19 UTC+1, Ali Kheyrollahi wrote:

  I am trying to find numbers of discrete value per URL in a day and
  the result is not what I expect.

[...]

  Result is bizarre, I mean it breaks my URL into its segments
  and aggregates on that. Do I need to use Hash of the URL (I prefer
  not to)?

 OK, it seems that I need to use not_analyzed on the field. Is that
 correct?

Yes.
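
For reference, a minimal sketch of an index-creation body that marks the field as not_analyzed, assuming the string mapping syntax of the Elasticsearch 1.x releases current at the time of this thread. The type name page_view is hypothetical; date and url are the fields from the example document. The body is built as a Python dict and printed as JSON, so nothing here talks to a cluster.

```python
import json

# Hypothetical mapping type name; not taken from the thread.
TYPE_NAME = "page_view"

# Body for creating the index with an explicit mapping.
# "index": "not_analyzed" keeps each URL as one single term,
# so a terms aggregation on "url" buckets by the full URL.
index_body = {
    "mappings": {
        TYPE_NAME: {
            "properties": {
                "date": {"type": "date"},
                "url": {"type": "string", "index": "not_analyzed"},
            }
        }
    }
}

print(json.dumps(index_body, indent=2))
```

The mapping of an existing field cannot be changed in place, so the documents would need to be reindexed into an index created with this mapping. Later Elasticsearch versions replace this setting with the keyword field type.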

-- 
Magnus Bäck| Software Engineer, Development Tools
magnus.b...@sonymobile.com | Sony Mobile Communications
