Need help/suggestion for the massive queries use case

2014-09-01 Thread Yuheng Du
Hi guys,

I have some streaming sensor data as input to ES. For each incoming data 
message, I need to do a query on the historic data in ES according to the 
'timestamp' and 'messageId' in that message.  I need to get the aggregated 
query results in real-time.

My problem is that each data message may spawn about 28 queries at approximately the 
same time, and I frequently get a [No nodes available] error from ES. 

Can anyone suggest what is the cause of this error and how can I fix it? Or 
should I use some other system to handle this problem like apache storm?

Thanks!

Yuheng
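
One way to reduce the pressure from those 28 near-simultaneous queries is to batch them
into a single Multi Search (_msearch) request. A minimal sketch, assuming hypothetical
index and field names (sensor-data, messageId, timestamp, value):

POST /sensor-data/_msearch
{}
{"size": 0, "query": {"bool": {"must": [{"term": {"messageId": "m-001"}}, {"range": {"timestamp": {"lt": "2014-09-01T00:00:00"}}}]}}, "aggs": {"avg_value": {"avg": {"field": "value"}}}}
{}
{"size": 0, "query": {"term": {"messageId": "m-002"}}, "aggs": {"avg_value": {"avg": {"field": "value"}}}}

Each pair of lines is one search (a header line, possibly empty, followed by the request
body on a single line), so all the lookups for one incoming message travel in a single
round trip instead of 28 separate connections.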



Re: Get distinct data

2014-09-01 Thread vineeth mohan
Hello Alex ,

Term aggregation is here to save your day -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation
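
For example, a minimal sketch of a terms aggregation on the sourceId field (the index
name posts and the use of search_type=count are assumptions here):

POST /posts/_search?search_type=count
{
  "aggs": {
    "distinct_source_ids": {
      "terms": { "field": "sourceId" }
    }
  }
}

Each bucket in the response corresponds to one distinct sourceId, with a doc_count of
how many documents share it.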

Thanks
  Vineeth


On Tue, Sep 2, 2014 at 12:07 PM, Alex T  wrote:

> Hi all!
>
> I have problem with getting unique data from elasticsearch. I have the
> following documents:
>
> [
> {
>  "message": "Message 1",
>  "author": {
>   "id": 4,
>   "name": "Author Name"
>   },
>   "sourceId": "123456789",
>   "userId": "123456"
> },
> {
>  "message": "Message 1",
>  "author": {
>   "id": 4,
>   "name": "Author Name"
>   },
>   "sourceId": "123456789",
>   "userId": "654321"
> }
> ]
>
> Different between this documents in userId. When I send query by "
> author.id", I get response with 2 documents.
>
> Can I get distinct data by sourceId field?
>
>


Re: How to Speed Up Indexing

2014-09-01 Thread xiehaiwei
Hi, 

  "If ’Word Segmentation‘ is the problem" - means, word 
segmentation analyzer speed is not good, 
about 1MB/s when runs independently.  In our case, many fields of a 
document need  to be segment.

"more machines with a shard" - Will a shard be running in multi 
nodes?  Do you mean  with a cluster?

Thanks.
Haiwei



Get distinct data

2014-09-01 Thread Alex T
Hi all!

I have a problem getting unique data from Elasticsearch. I have the 
following documents:

[
{
 "message": "Message 1",
 "author": {
  "id": 4,
  "name": "Author Name"
  },
  "sourceId": "123456789",
  "userId": "123456"
},
{
 "message": "Message 1",
 "author": {
  "id": 4,
  "name": "Author Name"
  },
  "sourceId": "123456789",
  "userId": "654321"
}
]

The difference between these documents is the userId. When I send a query by 
"author.id", I get a response with 2 documents.

Can I get distinct data by sourceId field?




Re: Transport Client connectedNodes() duplicates

2014-09-01 Thread David Pilato
Interesting. May be you could open an issue for that?

Something like "Transport Client with sniff duplicates nodes"?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Le 2 sept. 2014 à 04:50, Stefan Will  a écrit :

Hi,

for testing purposes, I've started up a stand-alone Elasticsearch node on my 
laptop, and am using the transport client to connect to it. When I initialize 
the client using "sniff=true", and then print out the list of connected nodes, 
as follows:

TransportClient client = new TransportClient(
  ImmutableSettings.builder()
.put("client.transport.ignore_cluster_name",true)
.put("client.transport.sniff",true)
);

client.addTransportAddress(new 
InetSocketTransportAddress("192.168.1.5",9300));

for(DiscoveryNode node: client.connectedNodes()) {
  System.out.println(node);
}

I get two nodes in the output:

[#transport#-1][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]
[Diablo][15AyTluTS4Wj26tKgkyQDA][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]

These are the same node listed twice, presumably once as the node that I added 
via "addTransportAddress()", and once as sniffed out by the sniffer. You can 
tell that the latter one is more detailed, and includes the actual node id and 
name, while the former is just the basic network information. Notice also that 
they both refer to the same IP and port combination.

When I run the same test, but with sniff=false, I get one node, but somewhat 
surprisingly, the generic "transport" node has been dropped from the list:

[Diablo][15AyTluTS4Wj26tKgkyQDA][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]

I this the expected behavior ? Why does the sniffer not simply *replace* the 
node info like the client does with sniff=false ? Why does it essentially leave 
the node in there twice, apparently ignoring the fact that there are two 
entries with the exact same connection info ?

I looked at the code, and apparently it dedupes nodes based on ID, which in 
this case are obviously different (#transport#-1 vs 15AyTluTS4Wj26tKgkyQDA), 
but ignores the fact that the IP and port are identical.

My expectation was that with sniff=true, it would replace the initial with the 
one sniffed from the cluster, and then add any additional nodes it has 
discovered, but in reality, I end up with 2x the nodes when I add my entire 
cluster via addTransportAddress().

Thanks,

Stefan





Re: How to Speed Up Indexing

2014-09-01 Thread vineeth mohan
Hello ,

One tip from my experience -


   1. Disable refresh before bulk indexing and enable it again once the load is done.
   By default ES waits for 1 second and then makes all documents indexed during that
   time searchable (see the settings sketch below). -
   
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html#bulk
   2. Reduce the number of replicas to 0 while bulk indexing.
   3. Increase the number of machines and the number of shards. Indexing happens in
   parallel, so more machines, each holding a shard, will help.
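
A minimal sketch of points 1 and 2, assuming a hypothetical index name my_index; the
first request goes out before the bulk load, the second one restores the defaults
afterwards:

PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}

PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": 1
  }
}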

"If ’Word Segmentation‘ is the problem" - Please elaborate.

Thanks
Vineeth


On Tue, Sep 2, 2014 at 10:16 AM,  wrote:

> Hi all,
>  In our ES system,  one line of a Mysql table  will be indexing  as a
> document, but indexing speed is slow.
>
> My Questions:
> 1) how fast of using BulkAPI indexing compared with single indexing?
> 2) If ’Word Segmentation‘ is the problem, how to deal it?
> 3) Can I use multi nodes of ES cluster to parallelly indexing in one
> Index?
>
> Thanks.
>


How to Speed Up Indexing

2014-09-01 Thread xiehaiwei
Hi all,
 In our ES system, each row of a MySQL table is indexed as a 
document, but the indexing speed is slow.

My questions:
1) How much faster is indexing with the Bulk API compared with single-document indexing?
2) If ’Word Segmentation‘ is the problem, how should we deal with it?
3) Can I use multiple nodes of the ES cluster to index into one index in parallel? 

Thanks. 
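
Regarding question 1, the Bulk API sends many index operations in one request, which
usually helps a lot compared with one HTTP call per document. A minimal sketch, with
made-up index, type, field names, and values:

POST /_bulk
{ "index": { "_index": "mytable", "_type": "row", "_id": "1" } }
{ "col_a": "value one", "col_b": 42 }
{ "index": { "_index": "mytable", "_type": "row", "_id": "2" } }
{ "col_a": "value two", "col_b": 7 }

Each action line is followed by its document source on the next line, and the whole
body must end with a newline.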



Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
Though this is not the solution, thanks for your advice.


2014-09-02 11:31 GMT+08:00 vineeth mohan :

> Hello ,
>
> Couple of options here.
> Add the count aggregation -
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-valuecount-aggregation.html
>
> {
>   "aggregations": {
> "aggs": {
>   "value_count": {
> "field": "names"
>   }
> }
>   }
> }
>
> Or , set the size as 0 for the aggregation , it will return all the
> buckets -
> http://stackoverflow.com/questions/22927098/list-all-buckets-on-an-elasticsearch-aggregation
> This might not be a good idea in case the number of buckets are too large.
>
> But again , the number of buckets seems to be a basic requirement and i
> feel there should be some other better way to do it.
> I will file an issue , if i dont find one.
>
> Thanks
>   Vineeth
>
>
>
> On Tue, Sep 2, 2014 at 8:13 AM, panfei  wrote:
>
>> Hi vineeth, thanks for the tips, but I really want to know the number of
>> the buckets generated by the aggs. a  bucket_count's value (like the
>> doc_count's value)  in the response.
>>
>>
>> 2014-09-02 10:00 GMT+08:00 vineeth mohan :
>>
>>>  Hello ,
>>>
>>> Setting search_type to count avoids executing the fetch phase of the
>>> search making the request more efficient. See Search Type
>>> 
>>> for more information on the search_type parameter.
>>>
>>> Thanks
>>>Vineeth
>>>
>>>
>>> On Tue, Sep 2, 2014 at 7:26 AM, panfei  wrote:
>>>
 To some degree, I can archive this goal by setting size: 0 in the terms
 aggs to list all the generated buckets, but it really consumes much memory,
 so is there any internal API to do this job ?

 Thanks.


 2014-09-02 9:38 GMT+08:00 panfei :

 we do a aggregation, and we only want to get the number of buckets it
> generated, is there any way to archive this goal ?
>
> Thanks
>
> --
> 不学习,不知道
>



 --
 不学习,不知道

>>
>>
>>
>> --
>> 不学习,不知道
>>



-- 
不学习,不知道


Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread vineeth mohan
Hello ,

Couple of options here.
Add the count aggregation -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-valuecount-aggregation.html

{
  "aggregations": {
"aggs": {
  "value_count": {
"field": "names"
  }
}
  }
}

Or set the size to 0 for the aggregation; it will return all the buckets
-
http://stackoverflow.com/questions/22927098/list-all-buckets-on-an-elasticsearch-aggregation
This might not be a good idea if the number of buckets is very large.

But again, the number of buckets seems like a basic requirement, and I
feel there should be a better way to do it.
I will file an issue if I don't find one.
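
In the meantime, the cardinality aggregation can give an approximate count of distinct
terms, which for a terms aggregation is effectively the number of buckets. A minimal
sketch, reusing the "names" field from the example above (the index name my_index is an
assumption):

POST /my_index/_search?search_type=count
{
  "aggs": {
    "approx_bucket_count": {
      "cardinality": { "field": "names" }
    }
  }
}

Note that the returned value is an estimate, not an exact bucket count.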

Thanks
  Vineeth



On Tue, Sep 2, 2014 at 8:13 AM, panfei  wrote:

> Hi vineeth, thanks for the tips, but I really want to know the number of
> the buckets generated by the aggs. a  bucket_count's value (like the
> doc_count's value)  in the response.
>
>
> 2014-09-02 10:00 GMT+08:00 vineeth mohan :
>
>> Hello ,
>>
>> Setting search_type to count avoids executing the fetch phase of the
>> search making the request more efficient. See Search Type
>> 
>> for more information on the search_type parameter.
>>
>> Thanks
>>Vineeth
>>
>>
>> On Tue, Sep 2, 2014 at 7:26 AM, panfei  wrote:
>>
>>> To some degree, I can archive this goal by setting size: 0 in the terms
>>> aggs to list all the generated buckets, but it really consumes much memory,
>>> so is there any internal API to do this job ?
>>>
>>> Thanks.
>>>
>>>
>>> 2014-09-02 9:38 GMT+08:00 panfei :
>>>
>>> we do a aggregation, and we only want to get the number of buckets it
 generated, is there any way to archive this goal ?

 Thanks

 --
 不学习,不知道

>>>
>>>
>>>
>>> --
>>> 不学习,不知道
>>>
>
>
>
> --
> 不学习,不知道
>


Transport Client connectedNodes() duplicates

2014-09-01 Thread Stefan Will
Hi,

for testing purposes, I've started up a stand-alone Elasticsearch node on 
my laptop, and am using the transport client to connect to it. When I 
initialize the client using "sniff=true", and then print out the list of 
connected nodes, as follows:

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Build the client with sniffing enabled so it discovers the other cluster nodes.
TransportClient client = new TransportClient(
  ImmutableSettings.builder()
    .put("client.transport.ignore_cluster_name", true)
    .put("client.transport.sniff", true)
);

client.addTransportAddress(new InetSocketTransportAddress("192.168.1.5", 9300));

// Print every node the client is currently connected to.
for (DiscoveryNode node : client.connectedNodes()) {
  System.out.println(node);
}

I get two nodes in the output:

[#transport#-1][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]
[Diablo][15AyTluTS4Wj26tKgkyQDA][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]

These are the same node listed twice, presumably once as the node that I 
added via "addTransportAddress()", and once as sniffed out by the sniffer. 
You can tell that the latter one is more detailed, and includes the actual 
node id and name, while the former is just the basic network information. 
Notice also that they both refer to the same IP and port combination.

When I run the same test, but with sniff=false, I get one node, but 
somewhat surprisingly, the generic "transport" node has been dropped from 
the list:

[Diablo][15AyTluTS4Wj26tKgkyQDA][Stefans-MacBook-Pro.local][inet[/192.168.1.5:9300]]

I this the expected behavior ? Why does the sniffer not simply *replace* 
the node info like the client does with sniff=false ? Why does it 
essentially leave the node in there twice, apparently ignoring the fact 
that there are two entries with the exact same connection info ?

I looked at the code, and apparently it dedupes nodes based on ID, which in 
this case are obviously different (#transport#-1 
vs 15AyTluTS4Wj26tKgkyQDA), but ignores the fact that the IP and port are 
identical.

My expectation was that with sniff=true, it would replace the initial with 
the one sniffed from the cluster, and then add any additional nodes it has 
discovered, but in reality, I end up with 2x the nodes when I add my entire 
cluster via addTransportAddress().

Thanks,

Stefan




Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
Hi vineeth, thanks for the tips, but I really want to know the number of
buckets generated by the aggs, i.e. a bucket_count value (like the
doc_count value) in the response.


2014-09-02 10:00 GMT+08:00 vineeth mohan :

> Hello ,
>
> Setting search_type to count avoids executing the fetch phase of the
> search making the request more efficient. See Search Type
> 
> for more information on the search_type parameter.
>
> Thanks
>Vineeth
>
>
> On Tue, Sep 2, 2014 at 7:26 AM, panfei  wrote:
>
>> To some degree, I can archive this goal by setting size: 0 in the terms
>> aggs to list all the generated buckets, but it really consumes much memory,
>> so is there any internal API to do this job ?
>>
>> Thanks.
>>
>>
>> 2014-09-02 9:38 GMT+08:00 panfei :
>>
>> we do a aggregation, and we only want to get the number of buckets it
>>> generated, is there any way to archive this goal ?
>>>
>>> Thanks
>>>
>>> --
>>> 不学习,不知道
>>>
>>
>>
>>
>> --
>> 不学习,不知道
>>
>



-- 
不学习,不知道



Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread vineeth mohan
Hello ,

Setting search_type to count avoids executing the fetch phase of the search,
making the request more efficient. See the Search Type documentation
for more information on the search_type parameter.

Thanks
   Vineeth


On Tue, Sep 2, 2014 at 7:26 AM, panfei  wrote:

> To some degree, I can archive this goal by setting size: 0 in the terms
> aggs to list all the generated buckets, but it really consumes much memory,
> so is there any internal API to do this job ?
>
> Thanks.
>
>
> 2014-09-02 9:38 GMT+08:00 panfei :
>
> we do a aggregation, and we only want to get the number of buckets it
>> generated, is there any way to archive this goal ?
>>
>> Thanks
>>
>> --
>> 不学习,不知道
>>
>
>
>
> --
> 不学习,不知道
>


Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
To some degree, I can achieve this goal by setting size: 0 in the terms
aggs to list all the generated buckets, but that consumes a lot of memory,
so is there any internal API to do this job?

Thanks.


2014-09-02 9:38 GMT+08:00 panfei :

> we do a aggregation, and we only want to get the number of buckets it
> generated, is there any way to archive this goal ?
>
> Thanks
>
> --
> 不学习,不知道
>



-- 
不学习,不知道



refresh and flush time reference in the cluster or in the node or in the index or in the other?

2014-09-01 Thread fiefdx yang
I know ES executes a refresh every second with the default configuration.
I know ES executes a flush every 30 minutes, or when triggered by the translog,
with the default configuration.
But I do not know what provides that time reference, and at what level of the
ES framework it sits.
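
As far as I understand, both are per-index settings applied by each node that holds a
shard of the index, and they can be changed dynamically. A minimal sketch, assuming a
hypothetical index name and the 1.x setting names as I recall them (treat both as
assumptions to verify against the reference docs):

PUT /my_index/_settings
{
  "index": {
    "refresh_interval": "5s",
    "translog.flush_threshold_period": "30m",
    "translog.flush_threshold_size": "200mb"
  }
}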



Is there any way to get the total number of buckets a aggs generated ?

2014-09-01 Thread panfei
We do an aggregation and we only want to get the number of buckets it
generated; is there any way to achieve this goal?

Thanks

-- 
不学习,不知道



Re: phrase suggester's sort mode

2014-09-01 Thread Nikolas Everett
Sounds like a useful feature then!  I'll certainly review it if you send a
pull request though I can't merge stuff.
On Sep 1, 2014 1:57 PM, "Heval Azizoğlu"  wrote:

> Yeah i know that actually currently i am manipulating term suggester to
> get what i want but this feature is quite basic, i think, that I thought
> maybe i am missing something and there is an easy way to do that. Thanks
> for the answer.
>
>
> On Sun, Aug 31, 2014 at 2:18 AM, Nikolas Everett 
> wrote:
>
>> You'd have to write a plugin or patch to elasticsearch. Plugin would be
>> easier in the short run but a patch in elasticsearch is more likely to be
>> of higher quality because of the code review process.
>> On Aug 30, 2014 4:13 PM, "Heval Azizoğlu" 
>> wrote:
>>
>>>  Hi,
>>>
>>> Is there any way to change the way direct generators outputs
>>> combination? It uses most frequent terms to produce top results, but i want
>>> to use calculated scores to sort the generator results and combine the top
>>> ones? Any body know how?
>>>
>>> Thanks.
>>>


Re: Need Help: Upgrade of ES + Large queries = new CPU overload

2014-09-01 Thread Scott Decker
Well, in case anyone wants to know, it was because we had
_cache:true
and
_cache_key:

items in our filter sets, basically because they are known filters that do not change.

For some reason, having this set caused huge amounts of CPU usage. Not sure 
what was happening behind the scenes, but this was our culprit. We will have 
to look into the code and see what makes this cause such an issue.
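
For reference, this is roughly the shape of such a cached filter; the field name, terms,
and cache key below are made up:

{
  "filter": {
    "terms": {
      "content": ["smyrna", "atlanta"],
      "_cache": true,
      "_cache_key": "content_city_filter"
    }
  }
}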


On Monday, September 1, 2014 7:50:35 AM UTC-7, Scott Decker wrote:
>
> Hey all,
>   We have been testing the new 1.3.1 release on our current load and 
> queries, and have found that under same conditions, same queries, the es 
> cluster we have just starts to max out CPU, the thread pools fill up, and 
> the query times just keep going up until eventually we have to restart 
> nodes just to clear things.
> on our older (.20.6) version, we do have big queries. Think 100+ terms, 
> but they were all wrapped in a filter and cached. We almost never did any 
> scoring, and if we did, it was only on a few terms.
> so, a query may look like the following:
>
> "query": {
> "filtered": {
>   "query": {
> "constant_score": {
>   "query": {
> "bool": {
>   "must": [
> {
>   "bool": {
> "should": {
>   "bool": {
> "should": [
>   {
> "term": {
>   "content": "smyrna"
> }
>   },
>   {
> "term": {
>   "title": "smyrna"
> }
>   }
> ]
>   }
> }
>   }
> }
>   ]
> }
>   },
>   "boost": 1
> }
>   },
>   "filter": {
> "bool": {
>   "must": [
> {
>   "bool": {
> "must": [
>   {
> "bool": {
>   "must": {
> "bool": {
>   "should": [
> {
>   "bool": {
> "must": [
>   {
> "terms": { here}
>
> and this filter is broken up into multiple sections, each filter is 
> given a cache name and cache key
>
>
> so, what could have changed between .20.6 and 1.3.1 that would cause this 
> sort of non-scored filter query to suddenly spend so much cpu time running?
> i did a thread dump and it is setting multiple threads in the .scorer 
> state of the filteredquery.  
> not sure if that matters.
>
>
> any help in trying to figure out where the es is spending its time on all 
> of this would be helpful. we at least have marvel up and running now and 
> that tells us that cpu gets pegged and the avg query times, but not sure 
> how to start debugging the query side to see could be changed under the 
> hoods to cause such a drastic change.
>
> Thanks,
> Scott
>
>



aggregation of hierarchical elements possible?

2014-09-01 Thread skippi1
The index has a field named "path" which contains the canonical file name,
e.g.:

/a/file1
/a/file2
/a/b/file3

Is it possible to create a bucket aggregation that summarizes all files per
path, including subfolders?

Something like that:

/a => 3 files
/a/b => 1 file

regards,
markus
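
One approach that may do this is to index the path with the path_hierarchy tokenizer, so
that /a/b/file3 also produces the tokens /a and /a/b, and then run a terms aggregation on
that field. A minimal sketch, assuming hypothetical index and type names:

PUT /files
{
  "settings": {
    "analysis": {
      "analyzer": {
        "path_analyzer": { "type": "custom", "tokenizer": "path_hierarchy" }
      }
    }
  },
  "mappings": {
    "file": {
      "properties": {
        "path": { "type": "string", "index_analyzer": "path_analyzer" }
      }
    }
  }
}

POST /files/_search?search_type=count
{
  "aggs": {
    "files_per_folder": {
      "terms": { "field": "path" }
    }
  }
}

Every document is counted in the bucket of each of its ancestor folders, so /a would
report 3 files and /a/b would report 1 in the example above; the full file paths show up
as additional buckets of their own.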







Re: phrase suggester's sort mode

2014-09-01 Thread Heval Azizoğlu
Yeah, I know. Actually, I am currently manipulating the term suggester to get
what I want, but this feature seems quite basic, so I thought maybe I was
missing something and there was an easy way to do it. Thanks for the
answer.


On Sun, Aug 31, 2014 at 2:18 AM, Nikolas Everett  wrote:

> You'd have to write a plugin or patch to elasticsearch. Plugin would be
> easier in the short run but a patch in elasticsearch is more likely to be
> of higher quality because of the code review process.
> On Aug 30, 2014 4:13 PM, "Heval Azizoğlu"  wrote:
>
>> Hi,
>>
>> Is there any way to change the way direct generators outputs combination?
>> It uses most frequent terms to produce top results, but i want to use
>> calculated scores to sort the generator results and combine the top ones?
>> Any body know how?
>>
>> Thanks.
>>


Re: how to use aggregations with filter?

2014-09-01 Thread Markus Breuer
Can someone see this post on the mailing list? I see "currently not accepted by
mailing list", but I am subscribed to it.





Re: [Hadoop] Hadoop plugin indices with ':' colon - not able to snapshot (?)

2014-09-01 Thread Costin Leau

Hi,

':' has a special meaning in a URI, which is what HDFS uses. You basically have to either escape the character (%3A) or 
use a different character.

Potentially you can rename the file to the desired name by running a separate 
command after the job has completed.

However, no URI/URL can be constructed from your file name in the format you desire, so you'll have to encode/decode 
the location every time.


On 9/1/14 8:10 PM, Mateusz Kaczynski wrote:

Whenever I try to take a snapshot of an index which contains a ':' colon in its 
name,I end up with the following trace:

{
 "error":"IllegalArgumentException[java.net.URISyntaxException: Relative 
path in absolute URI: crawl:1]; nested:
URISyntaxException[Relative path in absolute URI: crawl:1]; ",
 "status":500
}

It does not matter if the 'indices' argument is provided or not, the snapshot 
name is set to 'snapshot'. I have tried to
specify the index escaping the sign with '%3a' but in this case the name does 
not fit any available indices.
I assume this is related to https://issues.apache.org/jira/browse/HDFS-13 
filenames with ':' colon throws
java.lang.IllegalArgumentException?

The question is, is there a way to somehow escape the character (if not within 
the request then perhaps code itself?)
and if so, would creating a feature request make sense?

Many thanks,
Mateusz




--
Costin



[Hadoop] Hadoop plugin indices with ':' colon - not able to snapshot (?)

2014-09-01 Thread Mateusz Kaczynski
Whenever I try to take a snapshot of an index which contains a ':' colon in 
its name, I end up with the following trace: 

{
"error":"IllegalArgumentException[java.net.URISyntaxException: Relative 
path in absolute URI: crawl:1]; nested: URISyntaxException[Relative path in 
absolute URI: crawl:1]; ",
"status":500
}

It does not matter whether the 'indices' argument is provided or not; the 
snapshot name is set to 'snapshot'. I have tried to specify the index, 
escaping the sign with '%3a', but in that case the name does not match any 
available indices.
 
I assume this is related to https://issues.apache.org/jira/browse/HDFS-13 
(filenames with a ':' colon throw java.lang.IllegalArgumentException)?

The question is, is there a way to somehow escape the character (if not 
within the request then perhaps code itself?) and if so, would creating a 
feature request make sense?

Many thanks,
Mateusz




[ANNOUNCEMENT] - spring-elasticsearch 1.3.0 released

2014-09-01 Thread David Pilato
I am pleased to announce the spring-elasticsearch-1.3.0 release!

Spring factories for Elasticsearch

Changes in this version include:

o Update to Elasticsearch 1.3.0.  Issue: 
https://github.com/dadoonet/spring-elasticsearch/issues/47. 
o Update to Spring 4.0.6.RELEASE.  Issue: 
https://github.com/dadoonet/spring-elasticsearch/issues/49. 
o Refactor integration tests.  Issue: 
https://github.com/dadoonet/spring-elasticsearch/issues/50. 
o Merging settings don't require anymore closing index.  Issue: 
https://github.com/dadoonet/spring-elasticsearch/issues/51. 



See documentation at: https://github.com/dadoonet/spring-elasticsearch/

Have fun!

--
David



Re: Design advice for ES side-by-side with hadoop cluster?

2014-09-01 Thread Costin Leau

On 9/1/14 4:51 PM, bob.web...@gmail.com wrote:

Hi Guys,

I have a 16 node hadoop cluster, running Cloudera's community edition. All 16 
nodes are big powerful boxes with lots of
disk.



Can you provide some actual numbers? How much RAM per machine - how much is allocated to Hadoop, how much to ES, and how 
much is actually free (and thus usable by the OS)?


I would like to add ES to this cluster, but would like advice on how to 
configure/design the ES cluster.


There's a lot of useful information out there - I'll point to two great webinars, namely the pre-flight checklist [1] 
and getting started with Elasticsearch [2].

Especially in I/O-intensive environments, make sure the OS has enough RAM and that the file-system cache is not thrashed, 
since that has a big impact (not just on ES but on everything that accesses the disk).



I bulk load my data using PIG, which means Map-Reduce. What are the thoughts on 
reducers against ES master nodes? Should
I restrict me reducers to match ES master nodes?



Are you using Elasticsearch Hadoop? I'm asking since it's not the master nodes that matter but rather the data nodes; 
es-hadoop automatically writes only to those nodes. Depending on how big your bulk size is and the number of reducers 
versus your cluster size, you might be forced to limit the number of tasks to avoid overloading the ES cluster.
of tasks to avoid overloading the ES cluster.


Any thoughts of advice? At the moment my standard MR parameters kill the ES 
nodes.



See above - how are you writing the data to ES? How many shards do you have in the target index and what's the number of 
reducers writing to it at a certain point?
Marvel by the way (or any monitoring tool) helps a lot here since it eliminates guesses and actually offers insight into 
what's going on.


From the looks of it, it sounds like you are throwing too much data at once at the ES cluster and not retrying or 
adjusting the bulk size.


Otherwise, consider using es-hadoop. Run a job, take a look at the metrics [3] and tune it accordingly. See also the 
troubleshooting page [4].


One last thing - make sure you use the latest ES - it has a _lot_ of 
improvements.

[1] http://www.elasticsearch.org/webinars/elasticsearch-pre-flight-checklist/
[2] http://www.elasticsearch.org/webinars/getting-started-with-elasticsearch/
[3] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/2.1.Beta/metrics.html
[4] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/2.1.Beta/troubleshooting.html


Thanks




--
Costin



Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Thanks for all your help Vineeth

On Monday, September 1, 2014 5:03:38 PM UTC+1, vineeth mohan wrote:
>
> Hello Simon ,
>
> I hope you are familiar with Elasticsearch API.
> I believe by dashboard you are meaning Kibana.
> If that is the case , you wont be able to see this result there as Kibana 
> does not currently offers this level of flexibility.
>
> You will need to write JSON agg query based on - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#search-aggregations-bucket-datehistogram-aggregation
>
> There need not be any work on schma or index to use this.
> Hope that helps.
>
> Thanks
>   Vineeth
>
>
> On Mon, Sep 1, 2014 at 8:59 PM, Simon Edwards  > wrote:
>
>> Hello Vineeth,
>>
>> My question was regarding where to set up the date histograms. Do i 
>> simply add a  section to the dashboard JSON or do I need to 
>> update the index field mappings? If you could provide a small example 
>> that'd be great.
>>
>> Thanks for all your help!
>>
>>
>> On Monday, September 1, 2014 4:12:14 PM UTC+1, vineeth mohan wrote:
>>
>>> Hello Simon , 
>>>
>>> I didn't quite understand your question.
>>> Kindly elaborate. 
>>>
>>> Thanks
>>>   Vineeth
>>>
>>>
>>> On Mon, Sep 1, 2014 at 7:59 PM, Simon Edwards  
>>> wrote:
>>>
  Hello Vineeth,

 Many thanks for your reply, sounds like your method should work a treat!

 One quick noob question, where abouts are the date histograms 
 aggregations created? Are they created on the index or on the dashboard 
 itself? Any help is appreciated.

 Cheers,
 Si.


 On Monday, September 1, 2014 2:11:02 PM UTC+1, vineeth mohan wrote:

> Hello Simon , 
>
> I believe this can be done in this manner.
> Do 2 separate date histogram on the date_submitted field and 
> date_closed field.
> The sum of count of date_submitted minus the sum of  count on 
> date_closed on all the previous date should give you the number of open 
> issues for that week.
>
> For eg:
>
> Week1 - Open - 10 , closed - 5
> Week2 - Open 20 m ,closed 6
> Week3 - Open 30 , closed 10
>
> Number of open issues on 
> Week1 - 10
> Week2 - (20 + 10 ) - 5 = 25
> Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49
>
>
> Thanks
>   Vineeth
>
>
>
> On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards  
> wrote:
>
>>  Hi,
>>
>> I was wondering if somebody familiar with aggregations, particularly 
>> date histogram aggregations, can point me in the right direction.
>>
>> I'm currently looking to get a total count of records over a specific 
>> time period. Each record contains a "date_submitted" field and if 
>> they're 
>> closed, contain a "date_closed" field.
>>
>> Is it possible to aggregate the records based off these values (i.e. 
>> only showing open issues for a weekly period, even if the record was 
>> submitted a year ago)? If so where abouts are the aggregations 
>> specified? 
>> In the dashboard JSON or within the index mapping?
>>
>> Many thanks in advance.
>>
>> -- 
>> You received this message because you are subscribed to the Google 

Re: How can we integrate Hive (Hadoop) and Elasticsearch

2014-09-01 Thread Costin Leau

Hi,

Elasticsearch Hadoop might be of use - it's a connector that allows Hadoop jobs (Map/Reduce, Pig, Hive, Spark, 
Cascading) to communicate with Elasticsearch.
The official home page is here [1], with links to the downloads and the docs. There was also a recent webinar, 
which you can find here [2], and a short demo video from some time ago here [3].

here [2] and a short demo video from some time ago here [3].

Cheers,

[1] http://www.elasticsearch.org/hadoop
[2] http://www.elasticsearch.org/webinars/elasticsearch-and-apache-hadoop/
[3] 
http://www.elasticsearch.org/videos/search-and-analytics-with-hadoop-and-elasticsearch/

On 9/1/14 4:37 PM, Mohit Kumar Yadav wrote:

Hi folks,
I am trying to use Hive together with Elasticsearch. I have successfully installed 
and run both Hadoop (running on a Windows machine using VMware Player) and 
Elasticsearch, but I have no clue how to integrate the two.
Can you please provide a link with the steps to integrate them? I have gone through 
the Elasticsearch site but I am puzzled. Please advise.

Thanks in advance...!!!

Regrads
Mohit Kumar Yadav
(MCA/BBA)



--
Costin



Re: [Hadoop] Setting Document ID in Map Reduce Mapper

2014-09-01 Thread Costin Leau

Glad to hear the issue has been fixed - not sure how I've missed this email 
before.
It was probably addressed in the way the parameters are handled [1]

In the future, when encountering an unexpected behaviour/bug please file an issue on github as well to make sure it's 
not getting lost.


Thanks!

[1] https://github.com/elasticsearch/elasticsearch-hadoop/issues/223

On 9/1/14 4:15 PM, Juan Carlos Fernández wrote:

I had the same issue and it was solved by using es-hadoop 2.0.1 instead of 2.0.0. 
It looks like a fixed bug, but I couldn't find it reported anywhere as either an 
open or a closed bug.
Regards

El martes, 3 de junio de 2014 15:52:21 UTC+2, Daniel Tardón escribió:

Hi all,

I'm newbie with ES and i'm trying to set manually each document ID. I've 
seen in the documentation the
the es.mapping.id property, and I'm trying to set it 
in the conf part of the driver class the
same way i set the index and type of documents:

conf.set("es.resource", "logs/{event}");
conf.set("es.mapping.id ", "id");


In the Mapper class I put in the MapWritable object a new key value pair 
for each map:

MapWritable doc = new MapWritable();
String id = node+"|"+timestamp; //node and timestamp are two String 
values that I have.
doc.put(new Text("id"), new Text(id));


And as a result I can't write in ES and get exceptions with this message: 
JsonParseException[Unexpected character
('"' (code 34))

If I comment out the es.mapping.id line and allow ES to 
set the document IDs, everything works fine.

What could I do?

Thanks in advance



--
Costin



Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon ,

I hope you are familiar with the Elasticsearch API.
I believe that by dashboard you mean Kibana.
If that is the case, you won't be able to see this result there, as Kibana
does not currently offer this level of flexibility.

You will need to write a JSON agg query based on -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#search-aggregations-bucket-datehistogram-aggregation

No work on the schema or the index is needed to use this.
Hope that helps.
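
A minimal sketch of the two histograms, assuming a hypothetical index name and the field
names from your earlier mail:

POST /issues/_search?search_type=count
{
  "aggs": {
    "submitted_per_week": {
      "date_histogram": { "field": "date_submitted", "interval": "week" }
    },
    "closed_per_week": {
      "date_histogram": { "field": "date_closed", "interval": "week" }
    }
  }
}

The running difference between the two bucket counts gives the number of issues still
open in each week, as in the example from my earlier reply.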

Thanks
  Vineeth


On Mon, Sep 1, 2014 at 8:59 PM, Simon Edwards 
wrote:

> Hello Vineeth,
>
> My question was regarding where to set up the date histograms. Do i simply
> add a  section to the dashboard JSON or do I need to update the
> index field mappings? If you could provide a small example that'd be great.
>
> Thanks for all your help!
>
>
> On Monday, September 1, 2014 4:12:14 PM UTC+1, vineeth mohan wrote:
>
>> Hello Simon ,
>>
>> I didn't quite understand your question.
>> Kindly elaborate.
>>
>> Thanks
>>   Vineeth
>>
>>
>> On Mon, Sep 1, 2014 at 7:59 PM, Simon Edwards  wrote:
>>
>>> Hello Vineeth,
>>>
>>> Many thanks for your reply, sounds like your method should work a treat!
>>>
>>> One quick noob question, where abouts are the date histograms
>>> aggregations created? Are they created on the index or on the dashboard
>>> itself? Any help is appreciated.
>>>
>>> Cheers,
>>> Si.
>>>
>>>
>>> On Monday, September 1, 2014 2:11:02 PM UTC+1, vineeth mohan wrote:
>>>
 Hello Simon ,

 I believe this can be done in this manner.
 Do 2 separate date histogram on the date_submitted field and
 date_closed field.
 The sum of count of date_submitted minus the sum of  count on
 date_closed on all the previous date should give you the number of open
 issues for that week.

 For eg:

 Week1 - Open - 10 , closed - 5
 Week2 - Open 20 m ,closed 6
 Week3 - Open 30 , closed 10

 Number of open issues on
 Week1 - 10
 Week2 - (20 + 10 ) - 5 = 25
 Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49


 Thanks
   Vineeth



 On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards 
 wrote:

>  Hi,
>
> I was wondering if somebody familiar with aggregations, particularly
> date histogram aggregations, can point me in the right direction.
>
> I'm currently looking to get a total count of records over a specific
> time period. Each record contains a "date_submitted" field and if they're
> closed, contain a "date_closed" field.
>
> Is it possible to aggregate the records based off these values (i.e.
> only showing open issues for a weekly period, even if the record was
> submitted a year ago)? If so where abouts are the aggregations specified?
> In the dashboard JSON or within the index mapping?
>
> Many thanks in advance.
>
> --
> You received this message because you are subscribed to the Google
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to elasticsearc...@googlegroups.com.
>
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%40goo
> glegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/b9df0286-fac1-41be-89a5-be062e8a9f55%
>>> 40googlegroups.com
>>> 
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/aadc5927-f268-4016-a8b9-da3d6512e0b4%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving

Re: Elasticsearch and Hive work together

2014-09-01 Thread Costin Leau

Actually they are completely different.
Hive is a library built on top of Hadoop that uses a SQL-like query language to 
transform (mainly read) data.
Elasticsearch is a real-time search and analytics engine.

You can read the docs of each library/product to see the differences or better yet, take a look at the various demos out 
there.
As for why folks use Hive and Elasticsearch together: as a Hadoop user (using Hive), by using Elasticsearch one can 
leverage its powerful search capabilities and easily slice and dice data.
Arguably one could do the same with Hive; however, it's not at all trivial - a 
simple example is doing geo-search.
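
For instance, a geo query that is only a few lines in Elasticsearch (the index, type and field
names below are made up for illustration, and "location" is assumed to be mapped as geo_point):

curl -XPOST 'localhost:9200/places/place/_search?pretty' -d '{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10km",
          "location": { "lat": 40.71, "lon": -74.0 }
        }
      }
    }
  }
}'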

Hope this clarifies this a bit.

P.S. Elasticsearch does not depend on Hadoop; however, it is integrated with Hadoop (Map/Reduce, Hive, Pig, Spark, 
Cascading) through the Elasticsearch-Hadoop project.

On 8/31/14 2:01 PM, vak5d6 wrote:

I see that Hive and Elasticsearch are almost equivalent, except that 
Elasticsearch supports near real time queries.
Moreover, Elasticsearch can run independently to store and analyze data. So why do 
people use both Hive and Elasticsearch
on Hadoop?

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to
elasticsearch+unsubscr...@googlegroups.com 
.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/d69148e8-603a-4dc8-9e67-c89d37a43dca%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.


--
Costin

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/540498A4.5010202%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Issues when a node comes back after reboot on Azure

2014-09-01 Thread satishmallik
Hi,
We are trying to deploy elasticsearch for our production on Azure.

I am using the following settings:

discovery.zen.ping.unicast.hosts: ["saku-es-m01","saku-es-m02","saku-es-m03"]
discovery.zen.minimum_master_nodes: 2
node.tag: masternode
http.enabled: false

# Fixed settings
path.data: f:/data
path.logs: f:/logs
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.timeout: 15s
discovery.zen.fd.ping_retries: 10
discovery.zen.fd.ping_timeout: 45s
cluster.routing.allocation.cluster_concurrent_rebalance: 1

We are hitting an issue when a node shuts down and comes back after a reboot:

[2014-08-27 06:47:36,172][WARN ][transport.netty  ] [CS-NODE-M03] exception caught on transport layer [[id: 0x98a6ba59]], closing connection
java.nio.channels.UnresolvedAddressException
        at sun.nio.ch.Net.checkAddress(Net.java:127)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:613)
        at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.connect(NioClientSocketPipelineSink.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
        at org.elasticsearch.common.netty.channel.Channels.connect(Channels.java:634)
        at org.elasticsearch.common.netty.channel.AbstractChannel.connect(AbstractChannel.java:207)
        at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:229)
        at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:182)
        at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:681)
        at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:644)
        at org.elasticsearch.transport.netty.NettyTransport.connectToNodeLight(NettyTransport.java:611)
        at org.elasticsearch.transport.TransportService.connectToNodeLight(TransportService.java:133)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$3.run(UnicastZenPing.java:279)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

We are using the unicast mechanism for node discovery, as you can see from our 
config. If I restart the ES service on all nodes, this issue goes away. So using 
names inside unicast.hosts should not be an issue. 

Please do let me know your suggestions,

 Regards

Satish

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/071a3648-e803-42e1-83c3-da9a13d499e5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch-Hadoop repository plugin Cloudera Hadoop 2.0.0-cdh4.6.0

2014-09-01 Thread Mateusz Kaczynski
(Much delayed) thank you Costin.

Indeed, on Ubuntu, changing ES_CLASSPATH to include the hadoop and hadoop/lib 
directories in /etc/default/elasticsearch (and exporting it in 
/etc/init.d/elasticsearch) and installing the light plugin version did work.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f3dc7f6a-3dc0-4793-af8e-ea4390540204%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch-Hadoop repository plugin Cloudera Hadoop 2.0.0-cdh4.6.0

2014-09-01 Thread Mateusz Kaczynski
(Much delayed) thank you Costin.

Indeed, on Ubuntu, changing ES_CLASSPATH to include the hadoop and hadoop/lib 
directories in /etc/default/elasticsearch (and exporting it in 
/etc/init.d/elasticsearch) and installing the light plugin version did work. 
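
Roughly, the change looks like this (the hadoop paths below are just an example of a typical
CDH layout and will differ per install):

# /etc/default/elasticsearch
ES_CLASSPATH="/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*"

# /etc/init.d/elasticsearch, before the daemon is started
export ES_CLASSPATH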

On Thursday, 14 August 2014 20:59:39 UTC, Costin Leau wrote:
>
> Hi, 
>
> The hdfs repository relies on vanilla Hadoop 2.2 since that's the official 
> stable version of Yarn. Since you are using a 
> different 
> Hadoop version, use the 'light' version as explained in the docs - this 
> contains only the repository-hdfs, without the 
> Hadoop dependency 
> (since you already have it installed). 
>
> In other words, both the error that you see as well as the 2.0.1 
> (regarding JobLocalizer) seems to be related to the 
> differences between 
> vanilla Hadoop 2.2 and the one you are using. 
>
> Hope this helps, 
>
> On 8/14/14 7:36 PM, Mateusz Kaczynski wrote: 
> > I'm trying to get es-hadoop repository plugin working on our hadoop 
> 2.0.0-cdh4.6.0 distribution and it seems like I'm 
> > quite lost. 
> > 
> > I installed plugin's -hadoop2 version on the machines on our hadoop 
> cluster (which also run our stage elasticsearch nodes). 
> > 
> > When attempting to create a repository on one of the datanodes with: 
> > 
> > curl -XPUT 1.0.0.1:9200/_snapshot/hdfs -d '{"type":"hdfs", 
> "settings": {"uri": "hdfs://1.0.0.10:54310", 
> > "path":"/es_backup"}}' 
> > 
> > 
> > I end up with the logs being filled with the following error: 
> > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol 
> message contained an invalid tag (zero). 
> > at 
> com.google.protobuf.InvalidProtocolBufferException.invalidTag(InvalidProtocolBufferException.java:89)
>  
>
> > at 
> com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:108) 
> > at 
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.(RpcHeaderProtos.java:1398)
>  
>
> > at 
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.(RpcHeaderProtos.java:1362)
>  
>
> > at 
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:1492)
>  
>
> > at 
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:1487)
>  
>
> > at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) 
>
> > at 
> com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>  
>
> > at 
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>  
>
> > at 
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>  
>
> > at 
> com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49) 
>
> > at 
> org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
>  
>
> > at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:996) 
> > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891) 
> > 
> > Is it possible that this is caused by the incompatible hadoop versions 
> (2.2 used by plugin with 2.0 being installed) and 
> > it is necessary to get it build with downgraded version? 
> > 
> > Also, to build the jar, is it just 
> > 
> > gradle build -Pdistro=hadoopYarn ? 
> > 
> > Because release 2.0.1 does not quite work for me as it fails to find 
> JobLocalizer.class. 
> > 
> > Regards, 
> > Mateusz 
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups "elasticsearch" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to 
> > elasticsearc...@googlegroups.com   elasticsearch+unsubscr...@googlegroups.com >. 
> > To view this discussion on the web visit 
> > 
> https://groups.google.com/d/msgid/elasticsearch/5aff7e2a-eb3e-4bb8-8698-05fec6a67e87%40googlegroups.com
>  
> > <
> https://groups.google.com/d/msgid/elasticsearch/5aff7e2a-eb3e-4bb8-8698-05fec6a67e87%40googlegroups.com?utm_medium=email&utm_source=footer>.
>  
>
> > For more options, visit https://groups.google.com/d/optout. 
>
> -- 
> Costin 
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4ae0b343-72aa-459e-930e-559852c5d310%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hello Vineeth,

My question was regarding where to set up the date histograms. Do I simply 
add a section to the dashboard JSON, or do I need to update the 
index field mappings? If you could provide a small example that'd be great.

Thanks for all your help!

On Monday, September 1, 2014 4:12:14 PM UTC+1, vineeth mohan wrote:
>
> Hello Simon , 
>
> I didn't quite understand your question.
> Kindly elaborate. 
>
> Thanks
>   Vineeth
>
>
> On Mon, Sep 1, 2014 at 7:59 PM, Simon Edwards  > wrote:
>
>> Hello Vineeth,
>>
>> Many thanks for your reply, sounds like your method should work a treat!
>>
>> One quick noob question, where abouts are the date histograms 
>> aggregations created? Are they created on the index or on the dashboard 
>> itself? Any help is appreciated.
>>
>> Cheers,
>> Si.
>>
>>
>> On Monday, September 1, 2014 2:11:02 PM UTC+1, vineeth mohan wrote:
>>
>>> Hello Simon , 
>>>
>>> I believe this can be done in this manner.
>>> Do 2 separate date histogram on the date_submitted field and date_closed 
>>> field.
>>> The sum of count of date_submitted minus the sum of  count on 
>>> date_closed on all the previous date should give you the number of open 
>>> issues for that week.
>>>
>>> For eg:
>>>
>>> Week1 - Open - 10 , closed - 5
>>> Week2 - Open 20 m ,closed 6
>>> Week3 - Open 30 , closed 10
>>>
>>> Number of open issues on 
>>> Week1 - 10
>>> Week2 - (20 + 10 ) - 5 = 25
>>> Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49
>>>
>>>
>>> Thanks
>>>   Vineeth
>>>
>>>
>>>
>>> On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards  
>>> wrote:
>>>
  Hi,

 I was wondering if somebody familiar with aggregations, particularly 
 date histogram aggregations, can point me in the right direction.

 I'm currently looking to get a total count of records over a specific 
 time period. Each record contains a "date_submitted" field and if they're 
 closed, contain a "date_closed" field.

 Is it possible to aggregate the records based off these values (i.e. 
 only showing open issues for a weekly period, even if the record was 
 submitted a year ago)? If so where abouts are the aggregations specified? 
 In the dashboard JSON or within the index mapping?

 Many thanks in advance.

 -- 
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%
 40googlegroups.com 
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>
>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/b9df0286-fac1-41be-89a5-be062e8a9f55%40googlegroups.com
>>  
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/aadc5927-f268-4016-a8b9-da3d6512e0b4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: _ttl inherited from index settings

2014-09-01 Thread David Pilato
You can do that I guess using index templates: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html#indices-templates
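
A minimal sketch (template name, index pattern, and the 30d value are placeholders; the
_default_ mapping applies the _ttl setting to every type in matching indices):

curl -XPUT 'localhost:9200/_template/ttl_defaults' -d '{
  "template": "myindex-*",
  "mappings": {
    "_default_": {
      "_ttl": { "enabled": true, "default": "30d" }
    }
  }
}'

Newly created indices matching the pattern then pick this up automatically; it does not change
indices that already exist.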

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 1 September 2014 at 16:58:32, Octavian (octavian.rinc...@gmail.com) wrote:

Hello,

Is there any setting that specifies the _ttl for all documents from a index?

I saw that there is a setting per type, that works[1], but when I try to put 
the same setting per index, it doesn't work.

[1]http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html

Thank you,
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c12e1cb8-a0c8-4842-a955-4b6d15be0977%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/etPan.54048e38.7a6d8d3c.6ef9%40MacBook-Air-de-David.local.
For more options, visit https://groups.google.com/d/optout.


Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris ,

I am using ES 1.3.1.
Can you give it a try on that version?

Thanks
 Vineeth


On Mon, Sep 1, 2014 at 6:59 PM, Chris Lees  wrote:

>
> Same result I'm afraid...
>
> {
> "took": 62,
> "timed_out": false,
> "_shards": {
> "total": 147,
> "successful": 144,
> "failed": 0
> },
> "hits": {
> "total": 9975671,
> "max_score": 1.0,
> "hits": [
> ... results removed as data is sensitive, but I see correct
> documents returned in here ...
> ]
> },
> "aggregations": {
> "aggs": {
> "buckets": []
> }
> }
> }
>
> I'm still running Elasticsearch 1.1.1 -- is this something that changed
> after that perhaps?
>
> Thanks for your help.
>
>
> On Monday, September 1, 2014 2:23:17 PM UTC+1, vineeth mohan wrote:
>
>> Hello Chris ,
>>
>> That is strange , its working fine on my side.
>>
>> Can you run the below and paste the result -
>>
>> curl -XPOST 'http://localhost:9200/_search' -d '{
>>   "aggregations": {
>> "aggs": {
>>   "terms": {
>> "field": "_index"
>>   }
>> }
>>   }
>> }'
>> Thanks
>>   Vineeth
>>
>>
>> On Mon, Sep 1, 2014 at 6:17 PM, Chris Lees  wrote:
>>
>>> Thanks Vineeth.
>>>
>>> Unfortunately it doesn't return any results in the aggregations result.
>>>
>>> Input query:
>>> GET _search
>>>
>>> {
>>>   "aggregations": {
>>> "aggs": {
>>>   "terms": {
>>> "field": "_index"
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Result JSON showing 26K hits (correct), but no index aggregations:
>>> {
>>>"took": 4,
>>>"timed_out": false,
>>>"_shards": {
>>>   "total": 57,
>>>   "successful": 57,
>>>   "failed": 0
>>>},
>>>"hits": {
>>>   "total": 26622,
>>>   "max_score": 1,
>>>   "hits": [...]
>>>},
>>>"aggregations": {
>>>   "aggs": {
>>>  "buckets": []
>>>
>>>   }
>>>}
>>> }
>>>
>>>
>>>
>>> On Monday, September 1, 2014 1:40:00 PM UTC+1, vineeth mohan wrote:
>>>
 Hello Chris ,

 This should work -

 {
 "query" : {
 // GIVE QUERY HERE
 },
   "aggregations": {
  "aggs": {
   "terms": {
 "field": "_index"
   }
 }
   }
 }

 Thanks
Vineeth


 On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees  wrote:

>
> I'm building a simple app which presents the user with two drop-downs
> to easily filter data: one for day (mapping to my daily indices), and one
> for client (a term within documents).
>
> I'm currently finding indices using curl -XGET
> localhost:9200/_aliases, and a simple aggregation query to get a list of
> known clients over all indices. It works, but since not every client is
> present on every date it feels clunky when the client is known but the 
> list
> of dates still contains all indices, many of which are irrelevant for the
> selected client.
>
> Can anyone recommend a good way of finding a list of indices in which
> there is at least one document containing a specified term please? Thank
> you very much.
>
> --
> You received this message because you are subscribed to the Google
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to elasticsearc...@googlegroups.com.
>
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%40goo
> glegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/a485bf59-4ab6-43a3-b3df-172b8d09e7ba%
>>> 40googlegroups.com
>>> 
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e7c82301-e7b7-4ea9-91e9-cbea714f1712%40googlegroups.com
> 

Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon ,

I didn't quite understand your question.
Kindly elaborate.

Thanks
  Vineeth


On Mon, Sep 1, 2014 at 7:59 PM, Simon Edwards 
wrote:

> Hello Vineeth,
>
> Many thanks for your reply, sounds like your method should work a treat!
>
> One quick noob question, where abouts are the date histograms aggregations
> created? Are they created on the index or on the dashboard itself? Any help
> is appreciated.
>
> Cheers,
> Si.
>
>
> On Monday, September 1, 2014 2:11:02 PM UTC+1, vineeth mohan wrote:
>
>> Hello Simon ,
>>
>> I believe this can be done in this manner.
>> Do 2 separate date histogram on the date_submitted field and date_closed
>> field.
>> The sum of count of date_submitted minus the sum of  count on date_closed
>> on all the previous date should give you the number of open issues for that
>> week.
>>
>> For eg:
>>
>> Week1 - Open - 10 , closed - 5
>> Week2 - Open 20 m ,closed 6
>> Week3 - Open 30 , closed 10
>>
>> Number of open issues on
>> Week1 - 10
>> Week2 - (20 + 10 ) - 5 = 25
>> Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49
>>
>>
>> Thanks
>>   Vineeth
>>
>>
>>
>> On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards  wrote:
>>
>>> Hi,
>>>
>>> I was wondering if somebody familiar with aggregations, particularly
>>> date histogram aggregations, can point me in the right direction.
>>>
>>> I'm currently looking to get a total count of records over a specific
>>> time period. Each record contains a "date_submitted" field and if they're
>>> closed, contain a "date_closed" field.
>>>
>>> Is it possible to aggregate the records based off these values (i.e.
>>> only showing open issues for a weekly period, even if the record was
>>> submitted a year ago)? If so where abouts are the aggregations specified?
>>> In the dashboard JSON or within the index mapping?
>>>
>>> Many thanks in advance.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>>
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/b9df0286-fac1-41be-89a5-be062e8a9f55%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5no_tFxcNbh9huv1Bos2hjE-B-mWq5%3DL83vKg-ygROT3Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


_ttl inherited from index settings

2014-09-01 Thread Octavian
Hello,

Is there any setting that specifies the _ttl for all documents in an index?

I saw that there is a per-type setting that works [1], but when I try to 
put the same setting per index, it doesn't work.

[1]http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html

Thank you,

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c12e1cb8-a0c8-4842-a955-4b6d15be0977%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Need Help: Upgrade of ES + Large queries = new CPU overload

2014-09-01 Thread Scott Decker
Hey all,
  We have been testing the new 1.3.1 release on our current load and 
queries, and have found that under the same conditions, same queries, the ES 
cluster we have just starts to max out CPU, the thread pools fill up, and 
the query times just keep going up until eventually we have to restart 
nodes just to clear things.
On our older (0.20.6) version, we do have big queries. Think 100+ terms, but 
they were all wrapped in a filter and cached. We almost never did any 
scoring, and if we did, it was only on a few terms.
So, a query may look like the following:

"query": {
"filtered": {
  "query": {
"constant_score": {
  "query": {
"bool": {
  "must": [
{
  "bool": {
"should": {
  "bool": {
"should": [
  {
"term": {
  "content": "smyrna"
}
  },
  {
"term": {
  "title": "smyrna"
}
  }
]
  }
}
  }
}
  ]
}
  },
  "boost": 1
}
  },
  "filter": {
"bool": {
  "must": [
{
  "bool": {
"must": [
  {
"bool": {
  "must": {
"bool": {
  "should": [
{
  "bool": {
"must": [
  {
"terms": {https://groups.google.com/d/msgid/elasticsearch/be45cf36-3b4a-4452-b3bc-461a879dec02%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hello Vineeth,

Many thanks for your reply, sounds like your method should work a treat!

One quick noob question: whereabouts are the date histogram aggregations 
created? Are they created on the index or on the dashboard itself? Any help 
is appreciated.

Cheers,
Si.

On Monday, September 1, 2014 2:11:02 PM UTC+1, vineeth mohan wrote:
>
> Hello Simon , 
>
> I believe this can be done in this manner.
> Do 2 separate date histogram on the date_submitted field and date_closed 
> field.
> The sum of count of date_submitted minus the sum of  count on date_closed 
> on all the previous date should give you the number of open issues for that 
> week.
>
> For eg:
>
> Week1 - Open - 10 , closed - 5
> Week2 - Open 20 m ,closed 6
> Week3 - Open 30 , closed 10
>
> Number of open issues on 
> Week1 - 10
> Week2 - (20 + 10 ) - 5 = 25
> Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49
>
>
> Thanks
>   Vineeth
>
>
>
> On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards  > wrote:
>
>> Hi,
>>
>> I was wondering if somebody familiar with aggregations, particularly date 
>> histogram aggregations, can point me in the right direction.
>>
>> I'm currently looking to get a total count of records over a specific 
>> time period. Each record contains a "date_submitted" field and if they're 
>> closed, contain a "date_closed" field.
>>
>> Is it possible to aggregate the records based off these values (i.e. only 
>> showing open issues for a weekly period, even if the record was submitted a 
>> year ago)? If so where abouts are the aggregations specified? In the 
>> dashboard JSON or within the index mapping?
>>
>> Many thanks in advance.
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b9df0286-fac1-41be-89a5-be062e8a9f55%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Design advice for ES side-by-side with hadoop cluster?

2014-09-01 Thread bob . webman
Hi Guys,

I have a 16 node hadoop cluster, running Cloudera's community edition. All 
16 nodes are big powerful boxes with lots of disk.

I would like to add ES to this cluster, but would like advice on how to 
configure/design the ES cluster.

I bulk load my data using PIG, which means Map-Reduce. What are the 
thoughts on reducers against ES master nodes? Should I restrict my reducers 
to match ES master nodes?

Any thoughts or advice? At the moment my standard MR parameters kill the ES 
nodes.

Thanks


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2d541d83-5a79-4bc1-b5da-11065b9b568a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


How can we integration of hive(hadoop) and elasticsearch

2014-09-01 Thread Mohit Kumar Yadav
Hi folks,
I am trying to use Hive together with elasticsearch. I have successfully
installed and run both hadoop (running on a Windows machine using VMware Player)
and elasticsearch,
but I have no clue how to integrate the two.
Can you please provide any link with the steps to integrate both?
I have gone through the elasticsearch URL but I am puzzled. Please
advise.

Thanks in advance...!!!

Regrads
Mohit Kumar Yadav
(MCA/BBA)

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAK6eDKcd_tzjn83yvRdBq6L6HREtCVoNdp%2Btc3rHfh9bUXNPfg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees

Same result I'm afraid... 

{
"took": 62,
"timed_out": false,
"_shards": {
"total": 147,
"successful": 144,
"failed": 0
},
"hits": {
"total": 9975671,
"max_score": 1.0,
"hits": [
... results removed as data is sensitive, but I see correct 
documents returned in here ...
]
},
"aggregations": {
"aggs": {
"buckets": []
}
}
}

I'm still running Elasticsearch 1.1.1 -- is this something that changed 
after that perhaps?

Thanks for your help.

On Monday, September 1, 2014 2:23:17 PM UTC+1, vineeth mohan wrote:
>
> Hello Chris , 
>
> That is strange , its working fine on my side.
>
> Can you run the below and paste the result - 
>
> curl -XPOST 'http://localhost:9200/_search' -d '{
>   "aggregations": {
> "aggs": {
>   "terms": {
> "field": "_index"
>   }
> }
>   }
> }'
> Thanks
>   Vineeth
>
>
> On Mon, Sep 1, 2014 at 6:17 PM, Chris Lees  > wrote:
>
>> Thanks Vineeth.
>>
>> Unfortunately it doesn't return any results in the aggregations result.
>>
>> Input query:
>> GET _search
>>
>> {
>>   "aggregations": {
>> "aggs": {
>>   "terms": {
>> "field": "_index"
>>   }
>> }
>>   }
>> }
>>
>> Result JSON showing 26K hits (correct), but no index aggregations:
>> {
>>"took": 4,
>>"timed_out": false,
>>"_shards": {
>>   "total": 57,
>>   "successful": 57,
>>   "failed": 0
>>},
>>"hits": {
>>   "total": 26622,
>>   "max_score": 1,
>>   "hits": [...]
>>},
>>"aggregations": {
>>   "aggs": {
>>  "buckets": []
>>
>>   }
>>}
>> }
>>
>>
>>
>> On Monday, September 1, 2014 1:40:00 PM UTC+1, vineeth mohan wrote:
>>
>>> Hello Chris , 
>>>
>>> This should work - 
>>>
>>> {
>>> "query" : {
>>> // GIVE QUERY HERE
>>> },
>>>   "aggregations": {
>>>  "aggs": {
>>>   "terms": {
>>> "field": "_index"
>>>   }
>>> }
>>>   }
>>> }
>>>
>>> Thanks
>>>Vineeth
>>>
>>>
>>> On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees  wrote:
>>>

 I'm building a simple app which presents the user with two drop-downs 
 to easily filter data: one for day (mapping to my daily indices), and one 
 for client (a term within documents).

 I'm currently finding indices using curl -XGET localhost:9200/_aliases, 
 and a simple aggregation query to get a list of known clients over all 
 indices. It works, but since not every client is present on every date it 
 feels clunky when the client is known but the list of dates still contains 
 all indices, many of which are irrelevant for the selected client.

 Can anyone recommend a good way of finding a list of indices in which 
 there is at least one document containing a specified term please? Thank 
 you very much.
  
 -- 
 You received this message because you are subscribed to the Google 
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%
 40googlegroups.com 
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>
>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/a485bf59-4ab6-43a3-b3df-172b8d09e7ba%40googlegroups.com
>>  
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e7c82301-e7b7-4ea9-91e9-cbea714f1712%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris ,

That is strange , its working fine on my side.

Can you run the below and paste the result -

curl -XPOST 'http://localhost:9200/_search' -d '{
  "aggregations": {
"aggs": {
  "terms": {
"field": "_index"
  }
}
  }
}'
Thanks
  Vineeth


On Mon, Sep 1, 2014 at 6:17 PM, Chris Lees  wrote:

> Thanks Vineeth.
>
> Unfortunately it doesn't return any results in the aggregations result.
>
> Input query:
> GET _search
>
> {
>   "aggregations": {
> "aggs": {
>   "terms": {
> "field": "_index"
>   }
> }
>   }
> }
>
> Result JSON showing 26K hits (correct), but no index aggregations:
> {
>"took": 4,
>"timed_out": false,
>"_shards": {
>   "total": 57,
>   "successful": 57,
>   "failed": 0
>},
>"hits": {
>   "total": 26622,
>   "max_score": 1,
>   "hits": [...]
>},
>"aggregations": {
>   "aggs": {
>  "buckets": []
>
>   }
>}
> }
>
>
>
> On Monday, September 1, 2014 1:40:00 PM UTC+1, vineeth mohan wrote:
>
>> Hello Chris ,
>>
>> This should work -
>>
>> {
>> "query" : {
>> // GIVE QUERY HERE
>> },
>>   "aggregations": {
>> "aggs": {
>>   "terms": {
>> "field": "_index"
>>   }
>> }
>>   }
>> }
>>
>> Thanks
>>   Vineeth
>>
>>
>> On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees  wrote:
>>
>>>
>>> I'm building a simple app which presents the user with two drop-downs to
>>> easily filter data: one for day (mapping to my daily indices), and one for
>>> client (a term within documents).
>>>
>>> I'm currently finding indices using curl -XGET localhost:9200/_aliases,
>>> and a simple aggregation query to get a list of known clients over all
>>> indices. It works, but since not every client is present on every date it
>>> feels clunky when the client is known but the list of dates still contains
>>> all indices, many of which are irrelevant for the selected client.
>>>
>>> Can anyone recommend a good way of finding a list of indices in which
>>> there is at least one document containing a specified term please? Thank
>>> you very much.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>>
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a485bf59-4ab6-43a3-b3df-172b8d09e7ba%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5%3DNtGZhQRDkX90p39KxpRNeekbWSq5gasyaVK%3DevUHe_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


how to get trip length out of a series of gps coordinates

2014-09-01 Thread M R
hi all,

what would be the smartest approach to calculating the distance traveled for a 
(filtered) series of coordinates? I'm wondering what tools could help out 
there... 

Somewhat like a sum aggregation over a geo_point field respecting the result 
order? I hope somebody can help me out. 

Or is it better to iteratively increase a trip distance field with 
scan & scroll?

thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3e4ef50a-0301-4b38-ba2c-9429f92485d5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Hadoop] Setting Document ID in Map Reduce Mapper

2014-09-01 Thread Juan Carlos Fernández
I had the same issue and it was solved by using es-hadoop 2.0.1 instead of 2.0.0. 
Looks like a fixed bug, but I couldn't find anyone reporting it as an open 
or closed bug.
Regards

On Tuesday, 3 June 2014 15:52:21 UTC+2, Daniel Tardón wrote:
>
> Hi all,
>
> I'm newbie with ES and i'm trying to set manually each document ID. I've 
> seen in the documentation the *es.mapping.id * 
> propperty and I'm trying to set it in the conf part of the driver class the 
> same way i set the index and type of documents:
>
> conf.set("es.resource", "logs/{event}");
>> conf.set("es.mapping.id", "id"); 
>
>
> In the Mapper class I put in the MapWritable object a new key value pair 
> for each map:
>
> MapWritable doc = new MapWritable();
>> String id = node+"|"+timestamp; //node and timestamp are two String 
>> values that I have.
>> doc.put(new Text("id"), new Text(id));
>
>
> And as a result I can't write in ES and get exceptions with this message: 
> JsonParseException[Unexpected character ('"' (code 34))
>
> If I comment the es.mapping.id line and allow ES to set the documents ID 
> everything works fine. 
>
> What could I do?
>
> Thanks in advance  
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9c31846f-b141-4d7d-971b-d8a2b2c43843%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Date Aggregation Help

2014-09-01 Thread vineeth mohan
Hello Simon ,

I believe this can be done in this manner.
Do 2 separate date histogram on the date_submitted field and date_closed
field.
The cumulative count of date_submitted minus the cumulative count of date_closed
over all the previous dates should give you the number of open issues for that
week.

For eg:

Week1 - Open 10, closed 5
Week2 - Open 20, closed 6
Week3 - Open 30, closed 10

Number of open issues on
Week1 - 10
Week2 - (20 + 10 ) - 5 = 25
Week3 -  ( 30 + 20 + 10 ) - (6 + 5) = 49


Thanks
  Vineeth



On Mon, Sep 1, 2014 at 5:25 PM, Simon Edwards 
wrote:

> Hi,
>
> I was wondering if somebody familiar with aggregations, particularly date
> histogram aggregations, can point me in the right direction.
>
> I'm currently looking to get a total count of records over a specific time
> period. Each record contains a "date_submitted" field and if they're
> closed, contain a "date_closed" field.
>
> Is it possible to aggregate the records based off these values (i.e. only
> showing open issues for a weekly period, even if the record was submitted a
> year ago)? If so where abouts are the aggregations specified? In the
> dashboard JSON or within the index mapping?
>
> Many thanks in advance.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5m3oacS75hV6nSeXR0ZasC7RyL2SqDh1vk%2B1VXUKFNd_w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees
Thanks Vineeth.

Unfortunately it doesn't return any results in the aggregations result.

Input query:
GET _search
{
  "aggregations": {
"aggs": {
  "terms": {
"field": "_index"
  }
}
  }
}

Result JSON showing 26K hits (correct), but no index aggregations:
{
   "took": 4,
   "timed_out": false,
   "_shards": {
  "total": 57,
  "successful": 57,
  "failed": 0
   },
   "hits": {
  "total": 26622,
  "max_score": 1,
  "hits": [...]
   },
   "aggregations": {
  "aggs": {
 "buckets": []
  }
   }
}



On Monday, September 1, 2014 1:40:00 PM UTC+1, vineeth mohan wrote:
>
> Hello Chris , 
>
> This should work - 
>
> {
> "query" : {
> // GIVE QUERY HERE
> },
>   "aggregations": {
> "aggs": {
>   "terms": {
> "field": "_index"
>   }
> }
>   }
> }
>
> Thanks
>   Vineeth
>
>
> On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees  > wrote:
>
>>
>> I'm building a simple app which presents the user with two drop-downs to 
>> easily filter data: one for day (mapping to my daily indices), and one for 
>> client (a term within documents).
>>
>> I'm currently finding indices using curl -XGET localhost:9200/_aliases, 
>> and a simple aggregation query to get a list of known clients over all 
>> indices. It works, but since not every client is present on every date it 
>> feels clunky when the client is known but the list of dates still contains 
>> all indices, many of which are irrelevant for the selected client.
>>
>> Can anyone recommend a good way of finding a list of indices in which 
>> there is at least one document containing a specified term please? Thank 
>> you very much.
>>  
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a485bf59-4ab6-43a3-b3df-172b8d09e7ba%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread vineeth mohan
Hello Chris ,

This should work -

{
"query" : {
// GIVE QUERY HERE
},
  "aggregations": {
"aggs": {
  "terms": {
"field": "_index"
  }
}
  }
}
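
For example, with a hypothetical client field and value filled in (search_type=count drops the
hits, and "size": 0 on the terms agg returns all matching index names):

curl -XPOST 'localhost:9200/_search?search_type=count&pretty' -d '{
  "query": {
    "term": { "client": "acme" }
  },
  "aggregations": {
    "indices_with_client": {
      "terms": { "field": "_index", "size": 0 }
    }
  }
}'

Each bucket key in the response is then an index that holds at least one document matching the query.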

Thanks
  Vineeth


On Mon, Sep 1, 2014 at 3:10 PM, Chris Lees  wrote:

>
> I'm building a simple app which presents the user with two drop-downs to
> easily filter data: one for day (mapping to my daily indices), and one for
> client (a term within documents).
>
> I'm currently finding indices using curl -XGET localhost:9200/_aliases,
> and a simple aggregation query to get a list of known clients over all
> indices. It works, but since not every client is present on every date it
> feels clunky when the client is known but the list of dates still contains
> all indices, many of which are irrelevant for the selected client.
>
> Can anyone recommend a good way of finding a list of indices in which
> there is at least one document containing a specified term please? Thank
> you very much.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5kdfq2kuroxFVx2590k03SQf7T_H0PhrCGeA1_OoKMDxQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Date Aggregation Help

2014-09-01 Thread Simon Edwards
Hi,

I was wondering if somebody familiar with aggregations, particularly date 
histogram aggregations, can point me in the right direction.

I'm currently looking to get a total count of records over a specific time 
period. Each record contains a "date_submitted" field and if they're 
closed, contain a "date_closed" field.

Is it possible to aggregate the records based off these values (i.e. only 
showing open issues for a weekly period, even if the record was submitted a 
year ago)? If so, whereabouts are the aggregations specified? In the 
dashboard JSON or within the index mapping?

Many thanks in advance.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5e448bc2-007b-4c7b-a073-fcb1a8017eed%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Determine Shard Id based on routing key

2014-09-01 Thread Adrien Grand
On Mon, Sep 1, 2014 at 1:18 PM, 'Sandeep Ramesh Khanzode' via elasticsearch
 wrote:

> However, I am a little concerned with your comment on the equivalence of 1
> index with 20 shards and 20 indices with one shard each. You mentioned that
> you would discourage the latter.
>
> Can you please explain why? Is it for management reasons or performance
> overhead reasons? I can deal with the former but not the latter unless if
> you have some pointers. Thanks,
>

Sorry for the confusion, what I would like to discourage is not having 20
indices with one shard but trying to manage sharding manually instead of
relying on elasticsearch's routing mechanism that abstracts the number of
shards.
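
As a minimal illustration (index, type, id and routing values are made up), the routing
parameter is all you need; you never have to compute the shard id yourself:

# index a document with an explicit routing value
curl -XPUT 'localhost:9200/myindex/mytype/1?routing=key0' -d '{ "field": "value" }'

# supply the same routing value to fetch it back from the right shard
curl -XGET 'localhost:9200/myindex/mytype/1?routing=key0'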

-- 
Adrien Grand

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j5R4-BOtaMkFTDB1PfpVqVrh4BQb%3D4TsAfseOiCFP79Fg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: complexe query ( for me ;) ) with many match_all and range

2014-09-01 Thread alain ibrahim
EDIT 1:

This seems to work as a starting point:

curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["E_EVENTID", "O_CATALOGNUMBER", "O_RECORDNUMBER"],
"query" : {
"bool" : {
"must" : [
{
"query_string":{"query" : "O_CATALOGNUMBER:P23*"}
},
{
"query_string":{"query" : 
"O_RECORDNUMBER:recordnumber236*"}
},
{
"range" : {
"E_EVENTID" : {
"gte" : 36,
"lte" : 60
}
}
}
]
}
},
"highlight": {"pre_tags": [""], "post_tags": [""], 
"fields": {"*": {}}},
"facets": {
   "lifestage": {"terms": {  "field": "O_LIFESTAGE", "size": 20}},
   "sex": {"terms": {"field": "O_SEX", "size": 20}},
   "continent": {"terms": {"field": "L_CONTINENT", "size": 20}},
   "institutioncode": {"terms": {"field": "I_INSTITUTIONCODE", "size": 
20  }}
}
}'

On Monday, 1 September 2014 11:56:42 UTC+2, alain ibrahim wrote:
>
> Hello
>
> I'm quite new to elasticsearch.
>
> I made a form that queries an elasticsearch document to get the data.
>
> In the form there are multiple inputs: 
> image_yes : checkbox
> NAME : string
> COLLECTION : string
> CATALOGNUMBER:string
> RECORDNUMBER: string
> LOCALISATION: string
> EVENTID : integer
> event_date_start:  year  integer
> event_date_end:year integer
>
>
> The search is general: on all the fields of the document.
> I made many simple queries that work, but I can't make a complex one 
> work with AND and RANGE. All the examples I see use "match_all": {}. 
> I also need to use "*" and "?" in the query.
>
> *this work ok:*
>
> curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
> "fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
> "query" : {
> "query_string" : {
> "query" : "CATALOGNUMBER:P23*"
>  
> } 
> }
> }'
>
>
> curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
> "fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
> "query" : {
> "filtered" : {
> "filter" : {
> "range":{"EVENTID": {"from" : 36,
> "to" : 50,
> "include_lower" : true,
> "include_upper" : false}
> }
> } 
> }
> }
> }'
>
> *this working ok too:*
>
> curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
> "fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
> "query" : {
> "bool" : {
> "must" : [
> {
> "query_string":{"query" : "CATALOGNUMBER:P23*"}
> },
> {
> "query_string":{"query" : 
> "RECORDNUMBER:recordnumber236*"}
> }
> ]
> }
> },
> "highlight": {"pre_tags": [""], "post_tags": [""], 
> "fields": {"*": {}}},
> "facets": {
>"lifestage": {"terms": {  "field": "LIFESTAGE", "size": 20}},
>"sex": {"terms": {"field": "SEX", "size": 20}},
>"continent": {"terms": {"field": "CONTINENT", "size": 20}},
>"institutioncode": {"terms": {"field": "INSTITUTIONCODE", "size": 
> 20  }}
> }
> }'
>
>
> *Trying to merge, but it fails:*
> I need the facets to be computed from the results, so I can't use 
> "filtering", if I understand correctly.
> I tried many syntaxes, but nothing worked.
>
>
> (error)
> curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
> "fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
> "query" : {
> "filtered" : {
> "filter" : {
> "range":{"EVENTID": {"from" : 36,
> "to" : 50,
> "include_lower" : true,
> "include_upper" : false}
> }
> },
> "query_string" : {
> "query" : "CATALOGNUMBER:P23*"   
> }
> }
> },
> "highlight": {"pre_tags": [""], "post_tags": [""], 
> "fields": {"*": {}}},
> "facets": {
>"lifestage": {"terms": {  "field": "LIFESTAGE", "size": 20}},
>"sex": {"terms": {"field": "SEX", "size": 20}},
>"continent": {"terms": {"field": "CONTINENT", "size": 20}},
>"institutioncode": {"terms": {"field": "INSTITUTIONCODE", "size": 
> 20  }}
> }
> }'
> error : 
> curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
> "fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
> "query" : {
> "filtered" : {
> "filter" : {
> "range":{"EVENTID"

Re: Determine Shard Id based on routing key

2014-09-01 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi Adrien,

Thanks for the reply. That was important for me to understand.

However, I am a little concerned with your comment on the equivalence of 1 
index with 20 shards and 20 indices with one shard each. You mentioned that 
you would discourage the latter.

Can you please explain why? Is it for management reasons or performance 
overhead reasons? I can deal with the former but not the latter, unless 
you have some pointers.
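
For context, a minimal illustration of the explicit routing parameter I mean (a sketch only; the index name, type name, and routing values are made up):

curl -XPUT 'localhost:9200/myindex/mytype/1?routing=3' -d '{ "field" : "value" }'

curl -XGET 'localhost:9200/myindex/mytype/1?routing=3'

curl -XGET 'localhost:9200/myindex/mytype/_search?routing=3&pretty' -d '{ "query" : { "match_all" : {} } }'

The get and search must pass the same routing value that was used at index time, otherwise they may hit a shard that does not hold the document.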

Thanks,
Sandeep


On Thursday, 28 August 2014 13:28:32 UTC+5:30, Adrien Grand wrote:
>
> Hi Sandeep,
>
> Routing is deterministic, otherwise we couldn't  know where data is 
> located when using the get API (this API goes to a single shard, not all of 
> them). However, you should not rely on the distribution of the hash values 
> as this is an implementation detail that we could indeed change at some 
> point.
>
> I don't know what your use-case is, but if you really need to manage the 
> sharding yourself, the easiest way to do it would be to create 20 indices 
> with 1 shard each instead of 1 index with 20 shards. I would discourage 
> doing it, though.
>
>
>
> On Thu, Aug 28, 2014 at 8:31 AM, 'Sandeep Ramesh Khanzode' via 
> elasticsearch > wrote:
>
>> Hi
>>
>> Say I create an index with 20 shards.
>>
>> During indexing, if I specify a routing_key as 0, will it be indexed in 
>> shardId 0? Will routing_key 3 correspond to shard Id 3? Similarly for all 
>> other keys if I have 20 unique routing values since 0 % 20 will be 0 and 3 
>> % 20 will be 3, etc.
>>
>> There is no hash but a specific set of routing keys [0..19] = number of 
>> shards [0..19] that I have.
>>
>> Is this deterministic and documented by ES regarding this routing key to 
>> shard Id behavior? Or is it internal to ES and changeable anytime?
>>
>> Please let me know soon!! Thanks in advance...
>>
>> Thanks,
>> Sandeep
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/32b16cf4-517e-4b33-9eb8-129cf5bd8cf0%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Adrien Grand
>  

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/fc1cc1c3-9987-4f51-98f9-0878145e6c66%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: EL setup for fulltext search

2014-09-01 Thread Marc
Hi Ivan,

Using a test index and the analyze API, I was now able to create a config 
which is fine for me... theoretically.
{
"template": "logstash-*",
"settings": {
"analysis": {
"filter": {

"my_word_delimiter": {
"type": "word_delimiter",
"preserve_original": "true"
}
},

"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["standard",
"lowercase",
"stop",
"my_word_delimiter",
"asciifolding"]
}
}
}
},
"mappings": {
"_default_": {
"properties": {

"excp": {
"type": "string",
"index": "analyzed",
"analyzer": "my_analyzer"
},
"msg": {
"type": "string",
"index": "analyzed",
"analyzer": "my_analyzer"
}
}
}
}
}
The problem now is that as soon as I activate this for the two fields and a 
new logstash index is created, I cannot use a simpleQueryString query to 
retrieve any results.
It won't find anything via the REST API. Using the standard logstash 
template and mapping, it works fine.
Have you observed anything similar?
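
One way to narrow it down is to check what actually got applied to the newly created index, i.e. look at the concrete mapping and run the analyze API against one of the mapped fields (a sketch; the index name is made up):

curl -XGET 'localhost:9200/logstash-2014.09.01/_mapping?pretty'

curl -XGET 'localhost:9200/logstash-2014.09.01/_analyze?field=msg&pretty' -d 'Service=MyMDB.onMessage appId=cs'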

Thx
Marc

On Friday, August 29, 2014 6:49:41 PM UTC+2, Ivan Brusic wrote:
>
> That output does not look like something generated by the standard 
> analyzer, since it contains uppercase letters and various non-word 
> characters such as '='.
>
> Your two analysis requests will differ since the second one contains the 
> default word_delimiter filter instead of your custom my_word_delimiter. 
> What you are trying to achieve is somewhat difficult, but you can get there 
> if you keep on tweaking. :) Try using a pattern tokenizer instead of the 
> whitespace tokenizer if you want more control over word boundaries.
>
> -- 
> Ivan
>
>
> On Fri, Aug 29, 2014 at 1:48 AM, Marc  > wrote:
>
> Hi Ivan,
>
> thanks again. I have tried so and found a reasonable combination.
> Nevertheless, when I now try to use the analyze api with an index that has 
> the said analyzer defined via template it doesn't seem to apply:
>
> This is the complete template:
> {
> "template": "bogstash-*",
> "settings": {
> "index.number_of_replicas": 0,
> "analysis": {
> "analyzer": {
> "msg_excp_analyzer": {
> "type": "custom",
> "tokenizer": "whitespace",
> "filters": ["word_delimiter",
> "lowercase",
> "asciifolding",
> "shingle",
> "standard"]
> }
> },
> "filters": {
> "my_word_delimiter": {
> "type": "word_delimiter",
> "preserve_original": "true"
> },
> "my_asciifolding": {
> "type": "asciifolding",
> "preserve_original": true
> }
> }
> }
> },
> "mappings": {
> "_default_": {
> "properties": {
> "@excp": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> },
> "@msg": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> }
> }
> }
> }
> }
> I create the index bogstash-1.
> Now I test the following:
> curl -XGET 
> 'localhost:9200/bogstash-1/_analyze?analyzer=msg_excp_analyzer&pretty=1' -d 
> 'Service=MyMDB.onMessage appId=cs Times=Me:22/Total:22 (updated 
> attributes=gps_lng: 183731222/ gps_lat: 289309222/ )'
> and it returns:
> {
>   "tokens" : [ {
> "token" : "Service=MyMDB.onMessage",
> "start_offset" : 0,
> "end_offset" : 23,
> "type" : "word",
> "position" : 1
>   }, {
> "token" : "appId=cs",
> "start_offset" : 24,
> "end_offset" : 32,
> "type" : "word",
> "position" : 2
>   }, {
> "token" : "Times=Me:22/Total:22",
> "start_offset" : 33,
> "end_offset" : 53,
> "type" : "word",
> "position" : 3
>   }, {
> "token" : "(updated",
> "start_offset" : 54,
> "end_offset" : 62,
> "type" : "word",
> "position" : 4
>   }, {
> "token" : "attributes=gps_lng:",
> "start_offset" : 63,
> "end_offset" : 82,
> "type" : "word",
> "position" : 5
>   }, {
> "token" : "183731222/",
> "start_offset" : 83,
> "end_offset" : 93,
> "type" : "word",
> "position" : 6
>   }, {

Re: EL setup for fulltext search

2014-09-01 Thread Marc
Hi Ivan,

Using a test index and the analyze API, I was now able to create a config 
which is fine for me... theoretically.
{
"template": "logstash-*",
"settings": {
"analysis": {
"filter": {

"my_word_delimiter": {
"type": "word_delimiter",
"preserve_original": "true"
}
},

"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["standard",
"lowercase",
"stop",
"my_word_delimiter",
"asciifolding"]
}
}
}
},
"mappings": {
"_default_": {
"properties": {

"excp": {
"type": "string",
"index": "analyzed",

"analyzer": "my_analyzer"
},
"msg": {
"type": "string",
"index": "not_analyzed",
"analyzer": "my_analyzer"
}
}
}
}
}
The problem now is that as soon as I activate this for the two fields and a 
new logstash index is created, I cannot use a simpleQueryString query to 
retrieve any results.
It won't find anything via the REST API. Using the standard logstash 
template and mapping, it works fine.
Have you observed anything similar?

Thx
Marc

On Friday, August 29, 2014 6:49:41 PM UTC+2, Ivan Brusic wrote:
>
> That output does not look like something generated by the standard 
> analyzer, since it contains uppercase letters and various non-word 
> characters such as '='.
>
> Your two analysis requests will differ since the second one contains the 
> default word_delimiter filter instead of your custom my_word_delimiter. 
> What you are trying to achieve is somewhat difficult, but you can get there 
> if you keep on tweaking. :) Try using a pattern tokenizer instead of the 
> whitespace tokenizer if you want more control over word boundaries.
>
> -- 
> Ivan
>
>
> On Fri, Aug 29, 2014 at 1:48 AM, Marc  > wrote:
>
> Hi Ivan,
>
> thanks again. I have tried so and found a reasonable combination.
> Nevertheless, when I now try to use the analyze api with an index that has 
> the said analyzer defined via template it doesn't seem to apply:
>
> This is the complete template:
> {
> "template": "bogstash-*",
> "settings": {
> "index.number_of_replicas": 0,
> "analysis": {
> "analyzer": {
> "msg_excp_analyzer": {
> "type": "custom",
> "tokenizer": "whitespace",
> "filters": ["word_delimiter",
> "lowercase",
> "asciifolding",
> "shingle",
> "standard"]
> }
> },
> "filters": {
> "my_word_delimiter": {
> "type": "word_delimiter",
> "preserve_original": "true"
> },
> "my_asciifolding": {
> "type": "asciifolding",
> "preserve_original": true
> }
> }
> }
> },
> "mappings": {
> "_default_": {
> "properties": {
> "@excp": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> },
> "@msg": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> }
> }
> }
> }
> }
> I create the index bogstash-1.
> Now I test the following:
> curl -XGET 
> 'localhost:9200/bogstash-1/_analyze?analyzer=msg_excp_analyzer&pretty=1' -d 
> 'Service=MyMDB.onMessage appId=cs Times=Me:22/Total:22 (updated 
> attributes=gps_lng: 183731222/ gps_lat: 289309222/ )'
> and it returns:
> {
>   "tokens" : [ {
> "token" : "Service=MyMDB.onMessage",
> "start_offset" : 0,
> "end_offset" : 23,
> "type" : "word",
> "position" : 1
>   }, {
> "token" : "appId=cs",
> "start_offset" : 24,
> "end_offset" : 32,
> "type" : "word",
> "position" : 2
>   }, {
> "token" : "Times=Me:22/Total:22",
> "start_offset" : 33,
> "end_offset" : 53,
> "type" : "word",
> "position" : 3
>   }, {
> "token" : "(updated",
> "start_offset" : 54,
> "end_offset" : 62,
> "type" : "word",
> "position" : 4
>   }, {
> "token" : "attributes=gps_lng:",
> "start_offset" : 63,
> "end_offset" : 82,
> "type" : "word",
> "position" : 5
>   }, {
> "token" : "183731222/",
> "start_offset" : 83,
> "end_offset" : 93,
> "type" : "word",
> "position" : 6
>   

Re: EL setup for fulltext search

2014-09-01 Thread Marc
Hi Ivan,

Using a test index and the analyze API, I was now able to create a config 
which is fine for me... theoretically.
{
"template": "logstash-*",
"settings": {
"analysis": {
"filter": {
"my_word_delimiter": {
"type": "word_delimiter",
"preserve_original": "true"
}
},
"analyzer": {
"b2v_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["standard",
"lowercase",
"stop",
"my_word_delimiter",
"asciifolding"]
}
}
}
},
"mappings": {
"_default_": {
"properties": {
"excp": {
"type": "string",
"index": "analyzed",
"analyzer": "b2v_analyzer"
},
"msg": {
"type": "string",
"index": "not_analyzed",
"analyzer": "b2v_analyzer"
}
}
}
}
}
The problem now is that as soon as I activate this for the two fields and a 
new logstash index is created, I cannot use a simpleQueryString query to 
retrieve any results.
It won't find anything via the REST API. Using the standard logstash 
template and mapping, it works fine.
Have you observed anything similar?

Thx
Marc
 

On Friday, August 29, 2014 6:49:41 PM UTC+2, Ivan Brusic wrote:
>
> That output does not look like something generated by the standard 
> analyzer, since it contains uppercase letters and various non-word 
> characters such as '='.
>
> Your two analysis requests will differ since the second one contains the 
> default word_delimiter filter instead of your custom my_word_delimiter. 
> What you are trying to achieve is somewhat difficult, but you can get there 
> if you keep on tweaking. :) Try using a pattern tokenizer instead of the 
> whitespace tokenizer if you want more control over word boundaries.
>
> -- 
> Ivan
>
>
> On Fri, Aug 29, 2014 at 1:48 AM, Marc  > wrote:
>
> Hi Ivan,
>
> thanks again. I have tried so and found a reasonable combination.
> Nevertheless, when I now try to use the analyze api with an index that has 
> the said analyzer defined via template it doesn't seem to apply:
>
> This is the complete template:
> {
> "template": "bogstash-*",
> "settings": {
> "index.number_of_replicas": 0,
> "analysis": {
> "analyzer": {
> "msg_excp_analyzer": {
> "type": "custom",
> "tokenizer": "whitespace",
> "filters": ["word_delimiter",
> "lowercase",
> "asciifolding",
> "shingle",
> "standard"]
> }
> },
> "filters": {
> "my_word_delimiter": {
> "type": "word_delimiter",
> "preserve_original": "true"
> },
> "my_asciifolding": {
> "type": "asciifolding",
> "preserve_original": true
> }
> }
> }
> },
> "mappings": {
> "_default_": {
> "properties": {
> "@excp": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> },
> "@msg": {
> "type": "string",
> "index": "analyzed",
> "analyzer": "msg_excp_analyzer"
> }
> }
> }
> }
> }
> I create the index bogstash-1.
> Now I test the following:
> curl -XGET 
> 'localhost:9200/bogstash-1/_analyze?analyzer=msg_excp_analyzer&pretty=1' -d 
> 'Service=MyMDB.onMessage appId=cs Times=Me:22/Total:22 (updated 
> attributes=gps_lng: 183731222/ gps_lat: 289309222/ )'
> and it returns:
> {
>   "tokens" : [ {
> "token" : "Service=MyMDB.onMessage",
> "start_offset" : 0,
> "end_offset" : 23,
> "type" : "word",
> "position" : 1
>   }, {
> "token" : "appId=cs",
> "start_offset" : 24,
> "end_offset" : 32,
> "type" : "word",
> "position" : 2
>   }, {
> "token" : "Times=Me:22/Total:22",
> "start_offset" : 33,
> "end_offset" : 53,
> "type" : "word",
> "position" : 3
>   }, {
> "token" : "(updated",
> "start_offset" : 54,
> "end_offset" : 62,
> "type" : "word",
> "position" : 4
>   }, {
> "token" : "attributes=gps_lng:",
> "start_offset" : 63,
> "end_offset" : 82,
> "type" : "word",
> "position" : 5
>   }, {
> "token" : "183731222/",
> "start_offset" : 83,
> "end_offset" : 93,
> "type" : "word",
> "position" : 6
>  

The optimal way to aggregate in Kibana information from multiple Elasticsearch indexes

2014-09-01 Thread Vagif Abilov
I originally posted this question on StackOverflow, but I see that this 
group might be a more suitable place for it.

We are setting up logs from several related applications so the log events 
are imported into Elasticsearch (via Logstash). It was straightforward to 
create Kibana dashboards to visualize log indexes for each application, but 
since the applications are related and their activities belong to the same 
pipeline, it would be great to build a dashboard that would show aggregated 
information collected from different applications. Such a dashboard would be 
especially useful for tracking failures and performance problems.

Right now I can see three main ways to implement aggregated dashboard:

   1. Keep separate application logs and configure a Kibana dashboard that 
   would consume information from different applications. I am afraid this can 
   be a challenging task; I am not even sure Kibana fully supports it.
   2. Revise application logging so that all applications log to the same index. 
   What I dislike about this is that the log event structure must then be unified 
   across applications, and the applications are built by different people in 
   different languages. I've lost my faith in centralized control over such 
   low-level details as logging.
   3. Keep application logs and the corresponding Elasticsearch indexes as 
   they are now, but set up a new index which will contain aggregate 
   information. This article describes how to configure Elasticsearch to dump 
   its logs to Logstash, which would then insert them back into Elasticsearch 
   for searching. At first glance this approach may look surprising: why would 
   you need to re-insert log data once again into the same database? It's 
   another index, it adds overhead, uses more space etc. But it gives the 
   opportunity to set up the index in a way that will be suitable for an 
   aggregated Kibana dashboard.

I wonder if someone has gone through a similar dilemma and can share their 
experience.

Thanks in advance

Vagif Abilov

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ac164ad2-ea7f-4b00-a9af-fc6e819949e8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


How to use copy_to on a nested field

2014-09-01 Thread Yann Moisan
 

Is it possible to use copy_to on a nested field?

Here is an extract of my mapping :

"day": {
    "type": "nested",
    "properties": {
        "weight": {
            "index_name": "bzixtz2fng.day.weight",
            "type": "double"
        },
        "value": {
            "index_name": "bzixtz2fng.day.value",
            "type": "string",
            "copy_to": [
                "raw_words",
                "back"
            ]
        }
    }
}

Why don't I find my doc when I search on back?

{
  "query": {
"term": {
  "back": "one"
}
  }
}

PS : Version 1.0.1

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b265191b-aa34-4074-9b62-a8f8a205e65d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


complexe query ( for me ;) ) with many match_all and range

2014-09-01 Thread alain ibrahim
Hello

I'm quite new to elasticsearch.

I made a form that queries an elasticsearch document to get the data.

In the form there are multiple inputs:
image_yes : checkbox
NAME : string
COLLECTION : string
CATALOGNUMBER:string
RECORDNUMBER: string
LOCALISATION: string
EVENTID : integer
event_date_start:  year  integer
event_date_end:year integer


The search is general: it covers all the fields of the document.
I made many simple queries that work, but I can't get a complex one 
working with AND and RANGE. All the examples I see use "match_all: {}". 
I also need to use "*" and "?" in the query.

*this works ok:*

curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
"query" : {
"query_string" : {
"query" : "CATALOGNUMBER:P23*"
 
} 
}
}'


curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
"query" : {
"filtered" : {
"filter" : {
"range":{"EVENTID": {"from" : 36,
"to" : 50,
"include_lower" : true,
"include_upper" : false}
}
} 
}
}
}'

*this works ok too:*

curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
"query" : {
"bool" : {
"must" : [
{
"query_string":{"query" : "CATALOGNUMBER:P23*"}
},
{
"query_string":{"query" : 
"RECORDNUMBER:recordnumber236*"}
}
]
}
},
"highlight": {"pre_tags": [""], "post_tags": [""], 
"fields": {"*": {}}},
"facets": {
   "lifestage": {"terms": {  "field": "LIFESTAGE", "size": 20}},
   "sex": {"terms": {"field": "SEX", "size": 20}},
   "continent": {"terms": {"field": "CONTINENT", "size": 20}},
   "institutioncode": {"terms": {"field": "INSTITUTIONCODE", "size": 
20  }}
}
}'


*trying to merge, but it fails:*
I need the facets to be computed over the results, so if I understand 
correctly I can't use "filtering".
I tried many syntaxes but nothing works.


(error)
curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
"query" : {
"filtered" : {
"filter" : {
"range":{"EVENTID": {"from" : 36,
"to" : 50,
"include_lower" : true,
"include_upper" : false}
}
},
"query_string" : {
"query" : "CATALOGNUMBER:P23*"   
}
}
},
"highlight": {"pre_tags": [""], "post_tags": [""], 
"fields": {"*": {}}},
"facets": {
   "lifestage": {"terms": {  "field": "LIFESTAGE", "size": 20}},
   "sex": {"terms": {"field": "SEX", "size": 20}},
   "continent": {"terms": {"field": "CONTINENT", "size": 20}},
   "institutioncode": {"terms": {"field": "INSTITUTIONCODE", "size": 
20  }}
}
}'
error : 
curl -XGET 'localhost:9200/donnees/specimens/_search?pretty=true' -d '{
"fields" : ["EVENTID", "CATALOGNUMBER", "RECORDNUMBER"],
"query" : {
"filtered" : {
"filter" : {
"range":{"EVENTID": {"from" : 36,
"to" : 50,
"include_lower" : true,
"include_upper" : false}
}
}
}, 
"query_string" : { "query" : "CATALOGNUMBER:P23*" }
},
"highlight": {"pre_tags": [""], "post_tags": [""], 
"fields": {"*": {}}},
"facets": {
   "lifestage": {"terms": {  "field": "LIFESTAGE", "size": 20}},
   "sex": {"terms": {"field": "SEX", "size": 20}},
   "continent": {"terms": {"field": "CONTINENT", "size": 20}},
   "institutioncode": {"terms": {"field": "INSTITUTIONCODE", "size": 
20  }}
}
}'



How can I combine an AND query with a range, please?

I wish for something like this in pseudo-SQL:

select * from document where : 
image_yes is true
AND NAME : "name"
AND COLLECTION : "collection*"
AND  CATALOGNUMBER:"catalognumber*" 
AND  RECORDNUMBER:"recordnumber*"
AND LOCALISATION:  "localisation"
AND
  "range":{"EVENTID": {"from" : 36,   "include_lower" : true  
}
   AND  
"range":{" event_date": {"from" :  event_date_start,
"to" : event_data_end,
"include_lower" : true,
"include_upper" : false}
}


Many thanks
Alain

-- 
You received this message because you are subscribed to the 

Can I find a list of indices containing documents which contain a specified term?

2014-09-01 Thread Chris Lees

I'm building a simple app which presents the user with two drop-downs to 
easily filter data: one for day (mapping to my daily indices), and one for 
client (a term within documents).

I'm currently finding indices using curl -XGET localhost:9200/_aliases, and 
a simple aggregation query to get a list of known clients over all indices. 
It works, but since not every client is present on every date it feels 
clunky when the client is known but the list of dates still contains all 
indices, many of which are irrelevant for the selected client.
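
For reference, the current approach looks roughly like this (a sketch; the field is assumed to be called "client"):

curl -XGET 'localhost:9200/_aliases?pretty'

curl -XGET 'localhost:9200/_all/_search?search_type=count&pretty' -d '{
  "aggs" : { "clients" : { "terms" : { "field" : "client", "size" : 0 } } }
}'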

Can anyone recommend a good way of finding a list of indices in which there 
is at least one document containing a specified term please? Thank you very 
much.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3dcf46da-3eeb-4503-a348-365e3f0fd7a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Kibana Line charts Without timestamp field

2014-09-01 Thread srinu konda
Hi ,

I am trying to display some data in Kibana; my requirement is to build a line 
chart without a timestamp field.
Is that option available in Kibana? I need to plot numerical values 
on the x-axis and y-axis to display the line chart.
Please help me.


Thanks,
Srinivas.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7b082b20-d87e-4d97-b676-3791ee673b7d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Function_Score Filter on _ttl

2014-09-01 Thread Fabian Köstring
Hey there!
I have a function_score query and I am trying to get only documents with a _ttl 
greater than or equal to a specific value.
But it seems that the query doesn't work; I don't get the expected results.

GET index1/_search
{
   "query": {
  "function_score": {
  "filter":{
  "bool": {
  "must": [
 {
 "range": {
"_ttl": {
"gte" : 87105050
}
 }
 }
  ]
  }
  }
  }
   }
}

In my results there are documents which have a _ttl lower than 87105050. How 
can this happen?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6cf09a91-7b0e-48cc-8d82-cf8726c1f63f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch Mapper Attachment plugin 2.3.2 released

2014-09-01 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Mapper Attachment 
plugin, version 2.3.2.

The mapper attachments plugin adds the attachment type to Elasticsearch using 
Apache Tika.

https://github.com/elasticsearch/elasticsearch-mapper-attachments/

Release Notes - elasticsearch-mapper-attachments - Version 2.3.2


Fix:
 * [82] - Unable to extract text from Word documents 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/82)





Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-mapper-attachments project repository: 
https://github.com/elasticsearch/elasticsearch-mapper-attachments/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54042835.87b9b40a.032a.73dfSMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread joergpra...@gmail.com
There is no limitation on fields in ES. Each field requires a bit of memory
so the limit is only dependent on your hardware resources (RAM, CPU power).

I run ~1000 fields, if it is of any interest, without a significant
performance hit.

Jörg



On Mon, Sep 1, 2014 at 9:34 AM, ravi kumar  wrote:

> Hi Dixit ,
>
> So will it be OK to put 1000s of fields inside the same _type? This is what I
> am worried about. Are there any docs that describe this kind of limitation?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/71e1e3f3-7a3b-4c1b-932a-8b752f52882a%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH0EUPE4Ex%2BvoU-u5mxYbcsQJ0Z-eEA5Ghprv1Todta9A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread vineeth mohan
Hello Ravi ,

It's not good practice to add an arbitrary number of fields; this will take a toll
on performance.

Instead of storing it as laptopModelId:"Vostro" ,

Store it as

"attributes" : [{
  "key" : laptopModelId",
  "value" : "Vostro"
},
{
"key" ...
"value" ...
}]

And then declare attributes as nested type.
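
A minimal sketch of what that could look like, assuming an index/type called myindex/laptop and the default analyzer (so the term values below are lowercased):

curl -XPUT 'localhost:9200/myindex/laptop/_mapping' -d '{
  "laptop" : {
    "properties" : {
      "attributes" : {
        "type" : "nested",
        "properties" : {
          "key" :   { "type" : "string" },
          "value" : { "type" : "string" }
        }
      }
    }
  }
}'

curl -XGET 'localhost:9200/myindex/laptop/_search?pretty' -d '{
  "query" : {
    "bool" : {
      "must" : [
        { "nested" : {
            "path" : "attributes",
            "query" : { "bool" : { "must" : [
              { "term" : { "attributes.key" :   "laptopmodelid" } },
              { "term" : { "attributes.value" : "vostro" } }
            ] } }
        } },
        { "nested" : {
            "path" : "attributes",
            "query" : { "bool" : { "must" : [
              { "term" : { "attributes.key" :   "selladstate" } },
              { "term" : { "attributes.value" : "karnataka" } }
            ] } }
        } }
      ]
    }
  }
}'

Each nested clause matches one key/value pair inside the same attributes object, and the outer bool requires both pairs to be present on the document.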

Thanks
   Vineeth




On Mon, Sep 1, 2014 at 1:04 PM, ravi kumar  wrote:

> Hi Dixit ,
>
> So will it be OK to put 1000s of fields inside the same _type? This is what I
> am worried about. Are there any docs that describe this kind of limitation?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/71e1e3f3-7a3b-4c1b-932a-8b752f52882a%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5nPi_B9KOaO3trrjgUDZE_J%2BMnZ2B%3DsJUQvyT0jAuug5g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread ravi kumar
Hi Dixit ,

So will it be OK to put 1000s of fields inside the same _type? This is what I 
am worried about. Are there any docs that describe this kind of limitation?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/71e1e3f3-7a3b-4c1b-932a-8b752f52882a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread ravi kumar
Hi Dixit ,

So will it be OK to put 1000s of fields inside the same _type? This is what I 
am worried about.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f284fc0d-b516-4ee9-832c-2484ea229513%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread Bharvi Dixit
Hi Ravi,

You need to change your schema and put all these fields under the same type to 
get the output. And as per my experience with elasticsearch, there won't be 
any problem with a large number of fields.

Regards
Bharvi

On Monday, 1 September 2014 12:25:36 UTC+5:30, ravi kumar wrote:
>
> Hi david ,
>
> Your query is working but that's not what I want. What I want is to return 
> the laptop records where laptopModelId is "Vostro" AND sellAdState is 
> "Karnataka".
>
> laptopModelId and sellAdState will always be under different _types.
>
> Shall I change my data schema and put everything under the same _type?
>
> I have around 70-80 fields. will it be efficient to put all these fields 
> inside same _type so that 
>
> laptopModelId:"Vostro" AND sellAdState:"Karnataka"
>
> can be executed?
>
> or Is there any other solution to my previous given data format?
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/757959af-81c1-423b-8526-e06037113120%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic search : Query not executing

2014-09-01 Thread David Pilato
Yes.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> Le 1 sept. 2014 à 08:53, ravi kumar  a écrit :
> 
> Hi david ,
> 
> Your query is working but that's not what I want. What I want is to return 
> the laptop records where laptopModelId is "Vostro" AND sellAdState is 
> "Karnataka".
> 
> Shall I change my data schema and put everything under the same _type?
> 
> I have around 70-80 fields. will it be efficient to put all these fields 
> inside same _type so that 
> 
> laptopModelId:"Vostro" AND sellAdState:"Karnataka"
> 
> can be executed?
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/3cee85cf-8350-4baa-9d38-298f33a9982f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/02F124E1-5473-4E92-9281-895E85158E86%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.