Re: Filtered query returning unwanted results

2014-05-20 Thread Yarin Miran
One side note: if I change the query_string fields to either
'collected.etag' or 'collected.action_type', the document is filtered out
properly.
If I include 'collected.is_success', I still get the document.
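
A minimal sketch of the workaround this observation suggests: list the string fields explicitly instead of using the 'collected.*' wildcard, so that the boolean 'is_success' field is never searched. The helper name build_query and the TEXT_FIELDS list are illustrative assumptions, not part of any Elasticsearch client, and why the boolean field matches at all is left open here.

```python
import json

# Hypothetical helper: builds the query_string clause with an explicit
# field list instead of the "collected.*" wildcard.
def build_query(text, fields):
    return {
        "query_string": {
            # json.dumps wraps the text in double quotes, giving a phrase query
            "query": json.dumps(text),
            "lenient": True,
            "fields": list(fields),
        }
    }

# Only the string fields; "collected.is_success" (a boolean) is left out on purpose.
TEXT_FIELDS = ["collected.etag", "collected.action_type"]

query = build_query("sfoun", TEXT_FIELDS)
print(json.dumps({"query": query}, indent=2))
```

The printed body can be sent in place of the original "query" section; the "filter" section is unchanged.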

On Tuesday, May 20, 2014 6:43:05 PM UTC+3, Yarin Miran wrote:
>
> Hello,
>
> We've been using elasticsearch for a while now in my company and we 
> recently discovered that certain queries are returning unwanted results.
>
> I've created a gist <https://gist.github.com/yairnm/14bca137879212cf64f2> that
> recreates the problem on a newly created index without a mapping.
>
> The use case is quite simple, I have the following document:
> {
> "collected" : {
> "etag" : "\"KOlSfwGpwASvJB5lLGkjWDc36DY/R9KZ-VQJQP18W3NNaJcSpQlPgaY\"",
> "is_success" : true,
> "action_type" : "password"
> },
> "timestamp" : 1399820164000,
> "instanceId" : 0,
> "collected_event" : true,
> "tenantId" : 2,
> "eventType" : 589825
> }
>
> And I try to run a filtered query, that has terms and query_string:
> {
> "filter" : {
> "and" : [{
> "term" : {
> "tenantId" : 2
> }
> }, {
> "not" : {
> "term" : {
> "eventType" : 589844
> }
> }
> }, {
> "not" : {
> "term" : {
> "collected_event" : false
> }
> }
> }
> ]
> },
> "query" : {
> "query_string" : {
> "query" : "\"sfoun\"",
> "lenient" : true,
> "fields" : ["collected.*"]
> }
> }
> }
>
> I expect the filtered query to yield zero results since the document
> doesn't match the query_string.
>
> You can replace the text in query_string with anything you like and
> still get the results.
> My guess is that I'm missing something or I don't fully understand how
> queries work.
>
> Thanks in advance,
> Yarin.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/72ca01a8-2a49-4588-ad7b-65f58b68504f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Filtered query returning unwanted results

2014-05-20 Thread Yarin Miran
Hello,

We've been using elasticsearch for a while now in my company and we 
recently discovered that certain queries are returning unwanted results.

I've created a gist <https://gist.github.com/yairnm/14bca137879212cf64f2> that
recreates the problem on a newly created index without a mapping.

The use case is quite simple, I have the following document:
{
  "collected" : {
    "etag" : "\"KOlSfwGpwASvJB5lLGkjWDc36DY/R9KZ-VQJQP18W3NNaJcSpQlPgaY\"",
    "is_success" : true,
    "action_type" : "password"
  },
  "timestamp" : 1399820164000,
  "instanceId" : 0,
  "collected_event" : true,
  "tenantId" : 2,
  "eventType" : 589825
}

And I try to run a filtered query, that has terms and query_string:
{
  "filter" : {
    "and" : [{
        "term" : {
          "tenantId" : 2
        }
      }, {
        "not" : {
          "term" : {
            "eventType" : 589844
          }
        }
      }, {
        "not" : {
          "term" : {
            "collected_event" : false
          }
        }
      }
    ]
  },
  "query" : {
    "query_string" : {
      "query" : "\"sfoun\"",
      "lenient" : true,
      "fields" : ["collected.*"]
    }
  }
}

I expect the filtered query to yield zero results since the document
doesn't match the query_string.

You can replace the text in query_string with anything you like and
still get the results.
My guess is that I'm missing something or I don't fully understand how
queries work.
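
For reference, a hedged sketch of how the same clauses could be wrapped in an explicit "filtered" query, assuming the Elasticsearch 0.90/1.x query DSL that was current at the time (the "filtered" query was deprecated and removed in later versions). The helper name filtered_request is hypothetical, and this only restructures the request shown above; it is not offered as a fix for the unexpected match.

```python
import json

def filtered_request(query, filters):
    """Wrap a query and a list of filters in a 0.90/1.x-style filtered query."""
    return {"query": {"filtered": {"query": query, "filter": {"and": filters}}}}

# The three filter clauses from the request above.
filters = [
    {"term": {"tenantId": 2}},
    {"not": {"term": {"eventType": 589844}}},
    {"not": {"term": {"collected_event": False}}},
]

# The query_string clause from the request above, unchanged.
query = {
    "query_string": {
        "query": "\"sfoun\"",
        "lenient": True,
        "fields": ["collected.*"],
    }
}

body = filtered_request(query, filters)
print(json.dumps(body, indent=2))
```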

Thanks in advance,
Yarin.




Re: Elasticsearch randomly restarts

2014-01-26 Thread Yarin Miran
Thanks Mark, I've dug through the logs and finally found a legacy
service that caused the restarts.

Thanks!

On 27/01/2014, Mark Walkom  wrote:
> There might be something in the various /var/log/ files, have a look there
> around the time and see if you can correlate anything.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 26 January 2014 18:44, Yarin Miran  wrote:
>
>> Yeah, that's the interval at which this is happening.
>>
>> I've checked the cronjobs on the machine, there's nothing there.
>>
>>
>> On 26 January 2014 00:35, Mark Walkom  wrote:
>>
>>> It looks like it's happening every 33-34 minutes.
>>> You might want to check if there is a scheduled task that is causing
>>> this (eg cron).
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>

Re: Elasticsearch randomly restarts

2014-01-25 Thread Yarin Miran
Yeah, that's the interval at which this is happening.

I've checked the cronjobs on the machine, there's nothing there.


On 26 January 2014 00:35, Mark Walkom  wrote:

> It looks like it's happening every 33-34 minutes.
> You might want to check if there is a scheduled task that is causing this
> (eg cron).
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>

Elasticsearch randomly restarts

2014-01-25 Thread Yarin Miran
Hey all,

I have a testing machine that queries an elasticsearch node that runs locally
on the very same machine.
The node has a single index with 2 primary shards and no replicas.

We've recently started running simultaneous tests on the machine and we
noticed that the elasticsearch node randomly restarts during searches.
The elasticsearch log gives no indication of any errors.

Here's a small excerpt from the log:

[2014-01-25 12:06:52,432][INFO ][node ] [Jenkins] starting ...
[2014-01-25 12:06:52,600][INFO ][transport] [Jenkins] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/192.168.1.61:9300]}
[2014-01-25 12:06:55,685][INFO ][cluster.service  ] [Jenkins] new_master [Jenkins][WEbUh6FNRmC6MhNjJKDNWQ][inet[/192.168.1.61:9300]]{max_local_storage_nodes=1}, reason: zen-disco-join (elected_as_master)
[2014-01-25 12:06:55,721][INFO ][discovery] [Jenkins] cluster_Jenkins/WEbUh6FNRmC6MhNjJKDNWQ
[2014-01-25 12:06:55,764][INFO ][http ] [Jenkins] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[/127.0.0.1:9200]}
[2014-01-25 12:06:55,765][INFO ][node ] [Jenkins] started
[2014-01-25 12:06:56,708][INFO ][gateway  ] [Jenkins] recovered [1] indices into cluster_state
[2014-01-25 12:40:48,969][INFO ][node ] [Jenkins] stopping ...
[2014-01-25 12:40:49,061][INFO ][node ] [Jenkins] stopped
[2014-01-25 12:40:49,062][INFO ][node ] [Jenkins] closing ...
[2014-01-25 12:40:49,072][INFO ][node ] [Jenkins] closed
[2014-01-25 12:40:59,274][INFO ][node ] [Jenkins] version[0.90.5], pid[25402], build[c8714e8/2013-09-17T12:50:20Z]
[2014-01-25 12:40:59,274][INFO ][node ] [Jenkins] initializing ...
[2014-01-25 12:40:59,294][INFO ][plugins  ] [Jenkins] loaded [inout], sites [bigdesk]
[2014-01-25 12:41:02,747][INFO ][node ] [Jenkins] initialized
[2014-01-25 12:41:02,747][INFO ][node ] [Jenkins] starting ...
[2014-01-25 12:41:02,899][INFO ][transport] [Jenkins] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/192.168.1.61:9300]}
[2014-01-25 12:41:05,981][INFO ][cluster.service  ] [Jenkins] new_master [Jenkins][K2hM2A-DRwulWrMHuTz1Jw][inet[/192.168.1.61:9300]]{max_local_storage_nodes=1}, reason: zen-disco-join (elected_as_master)
[2014-01-25 12:41:06,018][INFO ][discovery] [Jenkins] cluster_Jenkins/K2hM2A-DRwulWrMHuTz1Jw
[2014-01-25 12:41:06,062][INFO ][http ] [Jenkins] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[/127.0.0.1:9200]}
[2014-01-25 12:41:06,062][INFO ][node ] [Jenkins] started
[2014-01-25 12:41:06,812][INFO ][gateway  ] [Jenkins] recovered [1] indices into cluster_state
[2014-01-25 13:02:28,648][DEBUG][action.index ] [Jenkins] Sending mapping updated to master: index [tenant] type [audit]
[2014-01-25 13:14:57,650][INFO ][node ] [Jenkins] stopping ...
[2014-01-25 13:14:57,745][INFO ][node ] [Jenkins] stopped
[2014-01-25 13:14:57,746][INFO ][node ] [Jenkins] closing ...
[2014-01-25 13:14:57,762][INFO ][node ] [Jenkins] closed
[2014-01-25 13:15:09,126][INFO ][node ] [Jenkins] version[0.90.5], pid[4000], build[c8714e8/2013-09-17T12:50:20Z]
[2014-01-25 13:15:09,127][INFO ][node ] [Jenkins] initializing ...
[2014-01-25 13:15:09,142][INFO ][plugins  ] [Jenkins] loaded [inout], sites [bigdesk]
[2014-01-25 13:15:12,913][INFO ][node ] [Jenkins] initialized
[2014-01-25 13:15:12,914][INFO ][node ] [Jenkins] starting ...
[2014-01-25 13:15:13,162][INFO ][transport] [Jenkins] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/192.168.1.61:9300]}
[2014-01-25 13:15:16,268][INFO ][cluster.service  ] [Jenkins] new_master [Jenkins][M1nNu0uvS1GGMPqHfJMyFA][inet[/192.168.1.61:9300]]{max_local_storage_nodes=1}, reason: zen-disco-join (elected_as_master)
[2014-01-25 13:15:16,326][INFO ][discovery] [Jenkins] cluster_Jenkins/M1nNu0uvS1GGMPqHfJMyFA
[2014-01-25 13:15:16,363][INFO ][http ] [Jenkins] bound_address {inet[/127.0.0.1:9200]}, publish_address {inet[/127.0.0.1:9200]}
[2014-01-25 13:15:16,364][INFO ][node ] [Jenkins] started
[2014-01-25 13:15:17,264][INFO ][gateway  ] [Jenkins] recovered [1] indices into cluster_state
[2014-01-25 13:49:04,861][INFO ][node ] [Jenkins] stopping ...
[2014-01-25 13:49:05,002][INFO ][node ] [Jenkins] stopped
[2014-01-25 13:49:05,003][INFO ][node ] [Jenkins] closing ...
[2014-01-25 13:49:0
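
The restart cadence can be read straight off the "stopping ..." entries in the excerpt above. A small sketch that computes the gap between consecutive stops; the three timestamps are copied from the log, while the function name and the idea of checking the spacing this way are illustrative assumptions.

```python
from datetime import datetime

# Timestamps of the three "stopping ..." entries in the log excerpt.
STOP_TIMES = [
    "2014-01-25 12:40:48,969",
    "2014-01-25 13:14:57,650",
    "2014-01-25 13:49:04,861",
]

def intervals_in_minutes(stamps):
    """Return the gaps between consecutive timestamps, in minutes."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S,%f") for s in stamps]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]

intervals = intervals_in_minutes(STOP_TIMES)
for gap in intervals:
    print(f"{gap:.1f} minutes between stops")
```

Gaps this regular point toward something scheduled rather than a crash triggered by load.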

Re: query_string queries don't handle multiple types

2013-12-30 Thread Yarin Miran
Silly me,
I've found that there's a flag for ignoring these exceptions: "lenient".

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
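
A minimal sketch of the original query with the flag added. The helper name best_effort_query is hypothetical; "lenient" is the query_string option from the page linked above, documented as ignoring format-based failures such as text supplied for a numeric field.

```python
import json

def best_effort_query(text, fields):
    """Build a query_string search body that skips fields the text cannot be parsed for."""
    return {
        "query": {
            "query_string": {
                "query": text,
                "fields": list(fields),
                "lenient": True,  # ignore format-based failures instead of raising
            }
        }
    }

body = best_effort_query("some text", ["collected.*"])
print(json.dumps(body, indent=2))
```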

On Monday, December 30, 2013 4:44:13 PM UTC+2, Yarin Miran wrote:
>
> Hello everyone,
>
> I'm trying to implement a search for one of my indices using the following 
> query:
>
> {
> "query" : {
> "query_string" : {
> "query" : "some text",
> "fields" : ["collected.*"]
> }
> }
> }
>
> The documents in the index have a field named "collected" which is dynamic 
> and changes between documents.
>
> When I try to run this query I get a NumberFormatException, since some of the
> fields are numeric and I guess elasticsearch tries to cast the input string
> to a number.
> Is there any way to make the query_string query perform a
> 'best effort' search without raising that exception, so that it ignores
> the fields that it can't cast?
>
> Thanks,
> Yarin
>



query_string queries don't handle multiple types

2013-12-30 Thread Yarin Miran
Hello everyone,

I'm trying to implement a search for one of my indices using the following 
query:

{
  "query" : {
    "query_string" : {
      "query" : "some text",
      "fields" : ["collected.*"]
    }
  }
}

The documents in the index have a field named "collected" which is dynamic 
and changes between documents.

When I try to run this query I get a NumberFormatException, since some of the
fields are numeric and I guess elasticsearch tries to cast the input string
to a number.
Is there any way to make the query_string query perform a 'best effort'
search without raising that exception, so that it ignores the fields that
it can't cast?

Thanks,
Yarin



Partial results when querying an alias

2013-12-21 Thread Yarin Miran


In an effort to create a multi-tenant architecture for my project, I've
created an elasticsearch cluster with an index 'tenant':

"tenant" : {
  "some_type" : {
    "_routing" : { "required" : true, "path" : "tenantId" }
  }
}

The tenant index has 2 shards, no replicas.

I've also created aliases:

"tenant" : {
  "aliases" : {
    "tenant_1" : {
      "index_routing" : "1",
      "search_routing" : "1"
    },
    "tenant_2" : {
      "index_routing" : "2",
      "search_routing" : "2"
    },
    "tenant_3" : {
      "index_routing" : "3",
      "search_routing" : "3"
    },
    "tenant_4" : {
      "index_routing" : "4",
      "search_routing" : "4"
    }
  }
}
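
A hedged sketch of how alias definitions like these could be generated as a request body for the `_aliases` endpoint, with one routed alias per tenant. The function name alias_actions and the range of tenant ids are assumptions made for illustration; the alias names and routing values follow the listing above.

```python
import json

def alias_actions(index, tenant_ids):
    """Build an _aliases request body with one routed alias per tenant."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": f"tenant_{tid}",
                    # Routing values are strings, matching the alias listing.
                    "index_routing": str(tid),
                    "search_routing": str(tid),
                }
            }
            for tid in tenant_ids
        ]
    }

body = alias_actions("tenant", range(1, 5))
print(json.dumps(body, indent=2))
```

Generating the aliases from one function keeps the index and search routing values identical for every tenant, which rules out a hand-typed mismatch as a cause of missing records.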

During testing we noticed that querying tenant_2 yielded 80 records, while
there were far more records with tenantId=2 in the index 'tenant': about
300 records.

And here's the strangest part: when I created a new alias 'tenant_test'
corresponding to the alias 'tenant_2', querying it worked perfectly, and so
did querying the previously problematic 'tenant_2'.

I've tried to reproduce the problem, but without any success.

We're really concerned about this, since we really want to use aliases in
production to improve our scalability.
Does anyone have any clue about what could cause this?

Thanks in advance,
Yarin.
 
