Re: Configure Kibana for HTTPS

2015-01-19 Thread Magnus Bäck
On Monday, January 19, 2015 at 15:45 CET,
 Karthik M  wrote:

> I want the front end of ES (Kibana) to run on SSL but keep the
> backend connection from Kibana to ES unencrypted since both are
> running on the same host. I configured Apache2 to accept SSL
> connections and it works, but when Kibana populates the dashboard it
> gets the error below. Any help is very much appreciated.
> "Could not reach http://ec2-XX-XX-XX-XX.compute-1.amazonaws.com/elasticsearch/_nodes.
> If you are using a proxy, ensure it is configured correctly"

Which version of Kibana is this?

-- 
Magnus Bäck| Software Engineer, Development Tools
magnus.b...@sonymobile.com | Sony Mobile Communications



Incremental updates using the river plugin

2015-01-19 Thread 4m7u1
Hello,
In order to get incremental updates from the DB, the river query uses the 
following timestamp parameter.

{
  "type" : "jdbc",
  "jdbc" : {
    "url" : "jdbc:mysql://localhost:3306/test",
    "user" : "",
    "password" : "",
    "sql" : [
      {
        "statement" : "select * from \"products\" where \"mytimestamp\" > ?",
        "parameter" : [ "$river.state.last_active_begin" ]
      }
    ],
    "index" : "my_jdbc_river_index",
    "type" : "my_jdbc_river_type"
  }
}


Do I need to have a "mytimestamp" column in my DB whose value is the insert 
time of that particular row? Or is this a predefined attribute in 
Elasticsearch?

Thank you.



Re: Why does _stats doc count differ from _search/_count doc count for an index?

2015-01-19 Thread David Pilato
The only way to do that would be by using parent/child instead of nested.

My 2 cents.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> Le 20 janv. 2015 à 05:27, Peter  a écrit :
> 
> Hi David,
> 
> Yes, we have some nested fields.
> 
> Ok, so the short answer is that for types that include nested fields, 
> search/count api will only count top level matched docs and exclude nested 
> docs from the count. 
> Whereas stats api counts them all at index level.
> 
> Are lucene docs and elasticsearch docs are one and the same or are these 
> different beasts, lucene docs being lower level than elasticsearch docs?
> 
> So to count all docs, I'd need to aggregate each nested doc count along with 
> each top level doc?
> 
> Something like:
> 
>  "aggs" : {
> "stats_total_docs" : {
> "stats" : {
> "script" : "1 + _source.my-nested-field1.size() + 
> _source.my-nested-field2.size() + _source.my-nested-field3.size()"
> }
> }
> 
> This would run the aggregation against every matched top level doc for a 
> given query.
> 
> And is there any more efficient or native search/count API equivalent for the 
> script counting I'm using to arrive at total doc count for nested documents?
> 
> I'm after a way to count total docs for nested docs but at query level not at 
> index level.
> 
> Thanks.
> 
> (Also note, I've used stats rather than just sum for my aggregation to get 
> some additional info as well as just the sum of docs).
> 
> 
>> On Tuesday, January 20, 2015 at 2:02:02 PM UTC+10, David Pilato wrote:
>> You are probably using nested documents, don't you?
>> 
>> Each nested doc is a Lucene doc. stats API count Lucene docs.
>> 
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>> 
>>> Le 20 janv. 2015 à 01:43, Peter  a écrit :
>>> 
>>> Hi Team,
>>> 
>>> I'm trying to work out which documents the _stats is counting when the 
>>> index _count is so much smaller.
>>> 
>>> On a test index with no replicas.
>>> 
>>> When hitting the stats:
>>> 
>>> localhost:9200/my-index/_stats
>>> indices.my-index.primaries.docs.count
>>> =68910 docs
>>> (and deleted docs = 0)
>>> 
>>> where as search/count shows:
>>> 
>>> localhost:9200/my-index/_search?search_type=count
>>> =11485 docs
>>> localhost:9200/my-index/_count
>>> =11485 docs
>>> 
>>> What docs is the stats api counting that the search/count api is not 
>>> counting?
>>> 
>>> Thanks.
> 



Re: Elasticsearch JDBC river plugin metrics

2015-01-19 Thread 4m7u1
Got it. Thanks !



Re: Why does _stats doc count differ from _search/_count doc count for an index?

2015-01-19 Thread Peter
Hi David,

Yes, we have some nested fields.

OK, so the short answer is that for types that include nested fields, the 
search/count API will only count top-level matched docs and exclude nested 
docs from the count, whereas the stats API counts them all at the index level.

Are Lucene docs and Elasticsearch docs one and the same, or are these 
different beasts, with Lucene docs being lower level than Elasticsearch docs?

So to count all docs, I'd need to aggregate each nested doc count along 
with each top level doc?

Something like:

 "aggs" : {
"stats_total_docs" : {
"stats" : {
"script" : "1 + _source.my-nested-field1.size() + 
_source.my-nested-field2.size() + _source.my-nested-field3.size()"
}
}

This would run the aggregation against every matched top level doc for a 
given query.
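
From the Java API, the same request might look roughly like this (a sketch 
against the 1.x API; the index name and the match_all query are assumptions, 
and the script is kept exactly as above):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.metrics.stats.Stats;

public class NestedDocCount {
    // Sketch: run the script-based stats aggregation above from Java and
    // read back the summed doc count (top-level doc + its nested docs).
    public static double totalDocs(Client client) {
        SearchResponse response = client.prepareSearch("my-index")
            .setSearchType(SearchType.COUNT) // we only care about the aggregation
            .setQuery(QueryBuilders.matchAllQuery())
            .addAggregation(AggregationBuilders.stats("stats_total_docs")
                .script("1 + _source.my-nested-field1.size() + "
                      + "_source.my-nested-field2.size() + "
                      + "_source.my-nested-field3.size()"))
            .get();
        Stats stats = response.getAggregations().get("stats_total_docs");
        return stats.getSum();
    }
}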

And is there any more efficient or native search/count API equivalent for 
the script counting I'm using to arrive at total doc count for nested 
documents?

I'm after a way to count total docs for nested docs but at query level not 
at index level.

Thanks.

(Also note, I've used stats rather than just sum for my aggregation to get 
some additional info as well as just the sum of docs).


On Tuesday, January 20, 2015 at 2:02:02 PM UTC+10, David Pilato wrote:
>
> You are probably using nested documents, don't you?
>
> Each nested doc is a Lucene doc. stats API count Lucene docs.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 20 janv. 2015 à 01:43, Peter > a écrit :
>
> Hi Team,
>
> I'm trying to work out which documents the _stats is counting when the 
> index _count is so much smaller.
>
> On a test index with no replicas.
>
> When hitting the stats:
>
> localhost:9200/my-index/_stats
> indices.my-index.primaries.docs.count
> =68910 docs
> (and deleted docs = 0)
>
> where as search/count shows:
>
> localhost:9200/my-index/_search?search_type=count
> =11485 docs
> localhost:9200/my-index/_count
> =11485 docs
>
> What docs is the stats api counting that the search/count api is not 
> counting?
>
> Thanks.
>



Re: Why does _stats doc count differ from _search/_count doc count for an index?

2015-01-19 Thread David Pilato
You are probably using nested documents, aren't you?

Each nested doc is a Lucene doc, and the stats API counts Lucene docs.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> Le 20 janv. 2015 à 01:43, Peter  a écrit :
> 
> Hi Team,
> 
> I'm trying to work out which documents the _stats is counting when the index 
> _count is so much smaller.
> 
> On a test index with no replicas.
> 
> When hitting the stats:
> 
> localhost:9200/my-index/_stats
> indices.my-index.primaries.docs.count
> =68910 docs
> (and deleted docs = 0)
> 
> where as search/count shows:
> 
> localhost:9200/my-index/_search?search_type=count
> =11485 docs
> localhost:9200/my-index/_count
> =11485 docs
> 
> What docs is the stats api counting that the search/count api is not counting?
> 
> Thanks.



Re: Mysterious index settings overwriting cluster settings

2015-01-19 Thread Chris Neal
I'm a bit confused about which versions the GitHub page says have the bug and
which do not.  It has labels of:

1.4.3
1.5.0
2.0.0

My hunch is those versions do not have the bug, although it sure seems more
logical to tag versions that *do* have the bug, so I wanted to confirm :)

Thanks!
Chris

On Wed, Jan 14, 2015 at 11:07 AM, Chris Neal 
wrote:

> Hi Masaru,
>
> Beautiful!  That's exactly it.  Thank you very much. :)
>
> Chris
>
> On Wed, Jan 14, 2015 at 10:42 AM, Masaru Hasegawa 
> wrote:
>
>> Hi Chris,
>>
>> I think you hit this issue
>> https://github.com/elasticsearch/elasticsearch/issues/8890.
>> Workaround would be to use index template (as described in the issue) or
>> to update them by indices settings API.
>>
>>
>> Masaru
>>
>>
>> On Wed, Jan 14, 2015 at 9:52 AM, Chris Neal 
>> wrote:
>>
>>> Hi all.
>>>
>>> I'm reposting an earlier thread of mine with a more appropriate subject
>>> in hopes that someone might have an idea on this one. :)
>>>
>>> Each node in my cluster has its configuration set via elasticsearch.yml
>>> only.  I do not apply any index level settings, however the nodes in the
>>> cluster are overwriting my config settings with the defaults.  I have been
>>> unable to figure out why this is happening, and was hoping someone else
>>> might.
>>>
>>> My elasticsearch.yml file defines these settings:
>>>
>>> index:
>>>   codec:
>>> bloom:
>>>   load: false
>>>   merge:
>>> policy:
>>>   max_merge_at_once: 4
>>>   max_merge_at_once_explicit: 4
>>>   max_merged_segment: 1gb
>>>   segments_per_tier: 4
>>>   type: tiered
>>> scheduler:
>>>   max_thread_count: 1
>>>   type: concurrent
>>>   number_of_replicas: 0
>>>   number_of_shards: 1
>>>   refresh_interval: 5s
>>>
>>> From the head plugin, I can see these settings are in effect:
>>>
>>>
>>> settings: {
>>>   index: {
>>>     codec: {
>>>       bloom: {
>>>         load: false
>>>       }
>>>     },
>>>     number_of_replicas: 1,
>>>     number_of_shards: 6,
>>>     translog: {
>>>       flush_threshold_size: 1GB
>>>     },
>>>     search: {
>>>       slowlog: {
>>>         threshold: {
>>>           fetch: { warn: 2s, info: 1s },
>>>           index: { warn: 10s, info: 5s },
>>>           query: { warn: 10s, info: 5s }
>>>         }
>>>       }
>>>     },
>>>     refresh_interval: 60s,
>>>     merge: {
>>>       scheduler: {
>>>         type: concurrent,
>>>         max_thread_count: 1
>>>       },
>>>       policy: {
>>>         type: tiered,
>>>         max_merged_segment: 1gb,
>>>         max_merge_at_once_explicit: 4,
>>>         max_merge_at_once: 4,
>>>         segments_per_tier: 4
>>>       }
>>>     }
>>>   },
>>>   bootstrap: {
>>>     mlockall: true
>>>   }
>>> }
>>>
>>>
>>> But each node outputs this on new index creation:
>>>
>>> [2015-01-13 02:12:52,062][INFO ][index.merge.policy   ]
>>> [elasticsearch-test] [test-20150113][1] updating [segments_per_tier] from
>>> [4.0] to [10.0]
>>> [2015-01-13 02:12:52,062][INFO ][index.merge.policy   ]
>>> [elasticsearch-test] [test-20150113][1] updating [max_merge_at_once] from
>>> [4] to [10]
>>> [2015-01-13 02:12:52,062][INFO ][index.merge.policy   ]
>>> [elasticsearch-test] [test-20150113][1] updating
>>> [max_merge_at_once_explicit] from [4] to [30]
>>> [2015-01-13 02:12:52,062][INFO ][index.merge.policy   ]
>>> [elasticsearch-test] [test-20150113][1] updating [max_merged_segment] from
>>> [1024.0mb] to [5gb]
>>>
>>> This is happening both on two clusters for me.  My "regular" ES cluster
>>> of 3 nodes, and my dedicated Marvel cluster of 1 node.  So strange.
>>>
>>> [2015-01-06 04:04:53,320][INFO ][cluster.metadata ]
>>> [elasticsearch-ip-10-0-0-42] [.marvel-2015.01.06] update_mapping
>>> [cluster_state] (dynamic)
>>> [2015-01-06 04:04:56,704][INFO ][index.merge.policy   ]
>>> [elasticsearch-ip-10-0-0-42] [.marvel-2015.01.06][0] updating
>>> [segments_per_tier] from [4.0] to [10.0]
>>> [2015-01-06 04:04:56,704][INFO ][index.merge.policy   ]
>>> [elasticsearch-ip-10-0-0-42] [.marvel-2015.01.06][0] updating
>>> [max_merge_at_once] from [4] to [10]
>>> [2015-01-06 04:04:56,704][INFO ][index.merge.policy   ]
>>> [elasticsearch-ip-10-0-0-42] [.marvel-2015.01.06][0] updating
>>> [max_merge_at_once_explicit] from [4] to [30]
>>> [2015-01-06 04:04:56,704][INFO ][index.merge.po

next kibana4 release

2015-01-19 Thread sulemanmubarik
Hi
I don't know if this is the correct place to ask, but my question is: when is
the next Kibana 4 release?
Thanks
Suleman




--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/next-kibana4-release-tp4069287.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Why no exception when circuit break is triggered?

2015-01-19 Thread Jinyuan Zhou
 

The following line indicates that the fielddata would exceed the configured 
breaker indices.breaker.fielddata.limit, which by default is 60% of the heap. 
According to the docs, it should throw an exception, but it seems my query 
returned OK. I noticed that a large number of such queries eventually leads 
to an OOM error. I am using 1.4.0. Is it trying to get around this limit?

[2015-01-20 02:19:38,605][WARN ][indices.breaker  ] [***] 
[FIELDDATA] New used memory 19207691320 [17.8gb] from field [ts] would be 
larger than configured breaker: 19206989414 [17.8gb], breaking



JsonObject to SortBuilder object

2015-01-19 Thread Alex Thurston
I would like to turn an arbitrary JsonObject (which presumably follows the 
Search/Sort DSL) into a SortBuilder which can then be passed to 
SearchRequestBuilder::addSort.

I've gotten this to work by simply parsing the JsonObject myself and making 
the appropriate calls on the SortBuilder, but that means that I have to 
implement the parsing for every variation of the DSL.

If I've got a Java JsonObject that looks like:

{
   "first_name": "asc"
}

OR

{
  "first_name": {
"order": "asc"
  }
}

OR

{
  "_geo_distance":{
"my_position":{
  "order": "asc"
}
  }
}

All of which are valid Json for the sort, I would imagine there's a way to 
call:

JsonObject sort_json = 
SortBuilder sort = new SortBuilder()
sort.setSort(sort_json);

I'm almost certain I'm missing something but can't for the life of me 
figure out how to do it.
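
For the two simple field-sort shapes above, the manual mapping I mentioned 
looks roughly like this (a sketch assuming Gson's JsonObject; the 
_geo_distance variant would still need its own branch):

import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import java.util.Map;
import org.elasticsearch.search.sort.FieldSortBuilder;
import org.elasticsearch.search.sort.SortBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;

public class JsonSortParser {
    // Sketch: map { "field": "asc" } and { "field": { "order": "asc" } }
    // onto a FieldSortBuilder that can be passed to addSort().
    public static SortBuilder toSortBuilder(JsonObject sortJson) {
        Map.Entry<String, JsonElement> entry = sortJson.entrySet().iterator().next();
        String field = entry.getKey();
        JsonElement value = entry.getValue();

        String order;
        if (value.isJsonPrimitive()) {              // { "first_name": "asc" }
            order = value.getAsString();
        } else {                                    // { "first_name": { "order": "asc" } }
            order = value.getAsJsonObject().get("order").getAsString();
        }

        FieldSortBuilder builder = SortBuilders.fieldSort(field);
        return builder.order("desc".equalsIgnoreCase(order) ? SortOrder.DESC : SortOrder.ASC);
    }
}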

Thanks in advance.



Why does _stats doc count differ from _search/_count doc count for an index?

2015-01-19 Thread Peter
Hi Team,

I'm trying to work out which documents the _stats is counting when the 
index _count is so much smaller.

On a test index with no replicas.

When hitting the stats:

localhost:9200/my-index/_stats
indices.my-index.primaries.docs.count
=68910 docs
(and deleted docs = 0)

where as search/count shows:

localhost:9200/my-index/_search?search_type=count
=11485 docs
localhost:9200/my-index/_count
=11485 docs

What docs is the stats api counting that the search/count api is not 
counting?

Thanks.



Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?

2015-01-19 Thread kaspersky_us via elasticsearch
Thanks Mark. Sounds like this issue affects a lot of people.

I looked at your suggestion about FLT, and the ignore_tf parameter should 
help; however, unless I'm missing something, it doesn't seem like this would 
address the IDF, and results could be biased. But I will experiment.

Ultimately I think what my particular use case requires is a scorer that 
only uses edit distance (when querying with fuzziness) and field boosts, 
but no TF / IDF.


On Monday, January 19, 2015 at 3:15:47 PM UTC-8, Mark Harwood wrote:
>
> This issue rounds up a bunch of related issues that have been raised 
> previously: https://github.com/elasticsearch/elasticsearch/issues/9103
>
> For now try FuzzyLikeThis (
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-flt-query.html#query-dsl-flt-query
>  
> )
> It blends More Like This and fuzzy functionality but includes the 
> adjustments to IDF that I think make more sense than the other 
> implementations with their bias towards rewarding scarcity.
>
>
> On Monday, January 19, 2015 at 6:48:49 PM UTC, kasper...@yahoo.com wrote:
>>
>> I have the same problem, where some results with higher edit distance are 
>> ranked higher than other results that are closer in terms of edit distance.
>>
>> I suspect it does have to do with document frequency, as you think 
>> Adrien. In my case I want to ignore document frequency completely. Any 
>> suggestion to achieve this? 
>>
>> I'm a taker of any solution as this looks like a show stopper for us, so 
>> even a workaround would help. 
>>
>> I can try to create this other rewrite method you mentioned if you could 
>> point me in the right direction. 
>>
>> Thanks
>>
>> On Thursday, January 15, 2015 at 7:44:57 AM UTC-8, Adrien Grand wrote:
>>>
>>> This is because the score takes two factors into account: the document 
>>> frequency and the edit distance. Quite likely in your case, even though 
>>> Boss is closer than Bose, Bose has a much lower document frequency which 
>>> helped it eventually get a better score. I guess we should have another 
>>> rewrite method that would not take freqs into account (or somehow merge 
>>> them) to avoid that issue.
>>>
>>> On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner  
>>> wrote:
>>>
 Any ideas?



>>>
>>>
>>>
>>> -- 
>>> Adrien Grand
>>>  
>>



Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?

2015-01-19 Thread Mark Harwood
This issue rounds up a bunch of related issues that have been raised 
previously: https://github.com/elasticsearch/elasticsearch/issues/9103

For now try FuzzyLikeThis 
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-flt-query.html#query-dsl-flt-query
 
)
It blends More Like This and fuzzy functionality but includes the 
adjustments to IDF that I think make more sense than the other 
implementations with their bias towards rewarding scarcity.


On Monday, January 19, 2015 at 6:48:49 PM UTC, kasper...@yahoo.com wrote:
>
> I have the same problem, where some results with higher edit distance are 
> ranked higher than other results that are closer in terms of edit distance.
>
> I suspect it does have to do with document frequency, as you think Adrien. 
> In my case I want to ignore document frequency completely. Any suggestion 
> to achieve this? 
>
> I'm a taker of any solution as this looks like a show stopper for us, so 
> even a workaround would help. 
>
> I can try to create this other rewrite method you mentioned if you could 
> point me in the right direction. 
>
> Thanks
>
> On Thursday, January 15, 2015 at 7:44:57 AM UTC-8, Adrien Grand wrote:
>>
>> This is because the score takes two factors into account: the document 
>> frequency and the edit distance. Quite likely in your case, even though 
>> Boss is closer than Bose, Bose has a much lower document frequency which 
>> helped it eventually get a better score. I guess we should have another 
>> rewrite method that would not take freqs into account (or somehow merge 
>> them) to avoid that issue.
>>
>> On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner  
>> wrote:
>>
>>> Any ideas?
>>>
>>>
>>>
>>
>>
>>
>> -- 
>> Adrien Grand
>>  
>



Re: Stuck with "re-syncing mappings with cluster state for types ..." after upgrade 0.90.9 to 1.4.2

2015-01-19 Thread Mark Walkom
Did you upgrade Java as well as ES? What version are you on?

On 20 January 2015 at 04:55, Eike Dehling  wrote:

> Hi List,
>
> we have recently upgraded our ES cluster (12 nodes, ~900GB data) from
> 0.90.9 to 1.4.2. We did a restart-upgrade, so backup data and then start
> the new ES version with the old data directories.
>
> Since then we're seeing lot of messages like "re-syncing mappings with
> cluster state for types ..." in the log of the master node, about 5-10
> every second (Once for each index). The load and network traffic on the
> machines also seems to have increased quite a bit.
>
> Some googling showed me various older (0.17/0.18) questions about this,
> and that it's related to serialization of the mapping. Our mapping is here:
> https://gist.github.com/anonymous/33b7ba153794e739fd33
>
> One suggested quick fix i found was to close/re-open the affected indices.
> Other suggestions? Or tips how to debug why this is going wrong, and what
> to change in the mapping?
>
> How serious is this? Does this need immediate attention, or should the
> cluster survive for a while?
>
> best regards,
> Eike Dehling
>
>



Update an Elasticsearch document's array field without using scripting

2015-01-19 Thread Jason Lee
  
  
I'm trying to add new values to an existing array field in a document. I've 
noticed that using the update API just overwrites the entire array field. 
So far all the examples I have found use scripting, but I can't do that 
because of security. Adding the script to the config/script folder as per 
http://www.elasticsearch.org/blog/scripting-security/ is not ideal because 
that would mean I would be required to manually place the script wherever 
our ES instances are installed, which could be a lot of places.

I'm actually using the Apache HttpClient to make http calls for GET and 
POST etc., and a way around this is to make multiple requests in order to 
get the existing array, update it from Java code, and then send a request 
with the updated array. However I do not like this approach as I end up 
making more than one request. I do not want to use the Java API for this 
(even if there is a way).
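
For completeness, the multi-request workaround I described looks roughly like 
this (a sketch using Apache HttpClient 4.x and Gson; docUrl would be the 
document's index/type/id URL, and without passing the version back it is not 
safe under concurrent updates):

import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonPrimitive;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class AppendToArrayField {
    // Sketch: GET the document, append the new value to the array field in
    // Java, then re-index the whole _source with a PUT (two round trips).
    public static void append(String docUrl, String field, String newValue) throws Exception {
        try (CloseableHttpClient http = HttpClients.createDefault()) {
            String body;
            try (CloseableHttpResponse get = http.execute(new HttpGet(docUrl))) {
                body = EntityUtils.toString(get.getEntity());
            }
            JsonObject source = new Gson().fromJson(body, JsonObject.class)
                                          .getAsJsonObject("_source");

            JsonArray array = source.getAsJsonArray(field);
            array.add(new JsonPrimitive(newValue));

            HttpPut put = new HttpPut(docUrl);
            put.setEntity(new StringEntity(source.toString(), ContentType.APPLICATION_JSON));
            http.execute(put).close();
        }
    }
}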

Is there some other way to do this? 



Securing elasticsearch and still access it via kibana

2015-01-19 Thread Iain Woolf
As per the subject, my goal is to prevent general external access to 
elasticsearch, but still allow authenticated access from kibana.

Looking around the internet there are multiple descriptions of how to do 
pieces of this, but I haven't come across a good explanation of how to pull 
it all together. Hence my post here.

Using nginx, I can set up authenticated access to elasticsearch (as 
per http://www.elasticsearch.org/blog/playing-http-tricks-nginx/)
However, there's also the ability to set up basic authentication for 
elasticsearch with https://github.com/Asquera/elasticsearch-http-basic. How 
do these two differ from / complement each other?

In Kibana3, I know there's the setting in config.js which instructs kibana 
to connect to ES with credentials:

elasticsearch: {"http://"+window.location.hostname+":8080";, 
withCredentials: true},

Does this "withCredentials" option assume the presence of 
elasticsearch-http-basic auth ES plugin?

So far I have the ES basic auth working via nginx config on port 8080 (I 
haven't installed elasticsearch-http-basic), but Kibana is showing a blank 
page with "{{dashboard.current.title}}" and an error on the console:

 Uncaught SyntaxError: Unexpected token +
>
> app.js:8 TypeError: Cannot read property 'elasticsearch' of undefined
>
> at new  (http://10.2.3.174:8081/app/app.js:22:5857)
>
> at d (http://10.2.3.174:8081/app/app.js:8:6414)
>
> at Object.e [as instantiate] (http://10.2.3.174:8081/app/app.js:8:6527)
>
> at Object. (http://10.2.3.174:8081/app/app.js:8:4811)
>
> at Object.d [as invoke] (http://10.2.3.174:8081/app/app.js:8:6414)
>
> at http://10.2.3.174:8081/app/app.js:8:6930
>
> at c (http://10.2.3.174:8081/app/app.js:8:5751)
>
> at d (http://10.2.3.174:8081/app/app.js:8:5885)
>
> at Object.e [as instantiate] (http://10.2.3.174:8081/app/app.js:8:6527)
>
> at Object. (http://10.2.3.174:8081/app/app.js:8:4811)
>
>
Cheers,

 Iain 



Strange behaviour phrase suggester.

2015-01-19 Thread Silvestre Losada
Hi all, I'm using the phrase suggester to obtain spell-check corrections for 
user queries. My corpus contains the following words:

spider, spyder, spade, spde

I want to get suggestions for "spider" and I'm sending the following query:

  "suggest" : {
"spell_ngram" : {
  "text" : "spider",
  "phrase" : {
"field" : "spell_ngram",
"size" : 5,
"confidence" : 0.7,
"max_errors" : 0.9,
"gram_size" : 5

  }
}
  }

The output is:

"spell_ngram" : [
  {
    "text" : "spider",
    "offset" : 0,
    "length" : 6,
    "options" : [
      { "text" : "spade",  "score" : 0.15242973 },
      { "text" : "spde",   "score" : 0.077127695 },
      { "text" : "spider", "score" : 0.07542856 }
    ]
  }
]


I think that spyder should be the best option because its edit distance is 1. 
Is there anything I'm doing wrong?


Best



False positive results for spatial search

2015-01-19 Thread Lukasz Smoron
hi,

 
I have a simple spatial field:
PRIMARY_GEOMETRY: {
type: "geo_shape"
}

I index the following geometry:
{"PRIMARY_GEOMETRY":
{"geometries":[
{"coordinates":[[[-87.6544,41.9677],[-87.6544,41.9717],[-87.6489,41.9717],[-87.6489,41.9677],[-87.6544,41.9677]]],"type":"Polygon"},
{"coordinates":[[[-87.6544,41.9472],[-87.6544,41.9513],[-87.6489,41.9513],[-87.6489,41.9472],[-87.6544,41.9472]]],"type":"Polygon"}
],"type":"GeometryCollection"}
}

and then I try to search with the following query:

{"query":{"filtered":{"query":{"geo_shape":{"PRIMARY_GEOMETRY":{"shape":{"coordinates":[[[-87.8826,41.6908],[-87.9127,41.7161],[-87.9382,41.7441],[-87.959,41.7743],[-87.9745,41.8062],[-87.9845,41.8392],[-87.989,41.873],[-87.9878,41.9069],[-87.9808,41.9404],[-87.9683,41.973],[-87.9505,42.0042],[-87.9274,42.0335],[-87.8997,42.0603],[-87.8676,42.0844],[-87.8317,42.1053],[-87.7925,42.1226],[-87.7507,42.1361],[-87.707,42.1456],[-87.662,42.1509],[-87.6164,42.152],[-87.571,42.1488],[-87.5266,42.1414],[-87.4837,42.1298],[-87.4431,42.1144],[-87.4055,42.0952],[-87.3715,42.0727],[-87.3415,42.0472],[-87.316,42.0191],[-87.2955,41.9888],[-87.2803,41.9569],[-87.2706,41.9237],[-87.2665,41.89],[-87.2681,41.8561],[-87.2754,41.8226],[-87.2882,41.79],[-87.3064,41.759],[-87.3296,41.7298],[-87.3574,41.7031],[-87.3895,41.6792],[-87.4254,41.6585],[-87.4643,41.6413],[-87.5059,41.6278],[-87.5493,41.6184],[-87.5939,41.6131],[-87.6391,41.6121],[-87.6841,41.6153],[-87.7282,41.6226],[-87.7708,41.6341],[-87.8111,41.6494],[-87.8486,41.6684],[-87.8826,41.6908]],[[-87.8818,41.6914],[-87.9117,41.7167],[-87.9372,41.7446],[-87.9578,41.7747],[-87.9733,41.8064],[-87.9834,41.8394],[-87.9878,41.873],[-87.9866,41.9068],[-87.9797,41.9402],[-87.9672,41.9727],[-87.9494,42.0038],[-87.9264,42.033],[-87.8988,42.0597],[-87.8668,42.0837],[-87.831,42.1045],[-87.792,42.1218],[-87.7503,42.1353],[-87.7067,42.1447],[-87.6618,42.15],[-87.6164,42.1511],[-87.5712,42.1479],[-87.5269,42.1405],[-87.4842,42.129],[-87.4438,42.1136],[-87.4063,42.0945],[-87.3723,42.0721],[-87.3424,42.0467],[-87.3171,42.0187],[-87.2967,41.9885],[-87.2815,41.9566],[-87.2718,41.9236],[-87.2677,41.8899],[-87.2693,41.8561],[-87.2766,41.8228],[-87.2894,41.7904],[-87.3074,41.7594],[-87.3306,41.7303],[-87.3583,41.7037],[-87.3903,41.6799],[-87.426,41.6592],[-87.4649,41.6421],[-87.5063,41.6287],[-87.5495,41.6193],[-87.594,41.614],[-87.6391,41.613],[-87.6839,41.6161],[-87.7279,41.6235],[-87.7703,41.6349],[-87.8105,41.6502],[-87.8479,41.6691],[-87.8818,41.6914]]],"type":"Polygon"}}}


This query returns the previously indexed polygons. The problem is that I am 
querying for shapes intersecting with a thin ring, and those two shapes 
should not intersect with it.

What could I be doing wrong here?

Thanks
Luke




Re: Regexp Filter boost

2015-01-19 Thread Michael Irwin
I think I've figured out another way to do what I want. This seems to work:

{
  "query": {
"bool": {
  "must": [
{
  "query_string": {
"query": "frank",
"default_operator": "AND"
  }
},
{
  "term": {
"site_id": {
  "term": 1
}
  }
}
  ],
  "should": [
{
  "match_phrase_prefix": {
"name": {
  "query": "frank",
  "max_expansions": 5
}
  }
}
  ]
}
  }
}

when using the analyzer as explained in 
http://www.elasticsearch.org/blog/starts-with-phrase-matching/
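
For anyone on the Java API, a rough equivalent of the query above might be 
(a sketch against the 1.x API; the index name is an assumption):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.QueryStringQueryBuilder;

public class PrefixBoostSearch {
    // Sketch: the "must" clauses restrict the result set, while the "should"
    // match_phrase_prefix clause only boosts documents whose name starts
    // with the query text.
    public static SearchResponse search(Client client, String text, long siteId) {
        BoolQueryBuilder query = QueryBuilders.boolQuery()
            .must(QueryBuilders.queryString(text)
                .defaultOperator(QueryStringQueryBuilder.Operator.AND))
            .must(QueryBuilders.termQuery("site_id", siteId))
            .should(QueryBuilders.matchPhrasePrefixQuery("name", text)
                .maxExpansions(5));
        return client.prepareSearch("my-index").setQuery(query).get();
    }
}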

On Monday, January 19, 2015 at 12:13:42 PM UTC-5, Michael Irwin wrote:
>
> Hello,
>
> I'm trying to figure out if it's possible to boost hits based on a regexp. 
> For example, searching through records with user's names, I'd like to boost 
> those that start with the query. I've tried a query like the following 
> but it doesn't work like I'd like:
>
> {
>   "query": {
> "function_score": {
>   "functions": [
> {
>   "boost_factor": 2,
>   "filter": {
> "regexp": {
>   "name_not_analyzed": "^frank.*"
> }
>   }
> }
>   ],
>   "score_mode": "multiply",
>   "query": {
> "bool": {
>   "must": [
> {
>   "query_string": {
> "query": "frank",
> "default_operator": "AND"
>   }
> },
> {
>   "term": {
> "site_id": {
>   "term": 1
> }
>   }
> }
>   ]
> }
>   }
> }
>   }
> }
>
> Is this possible? If so, how? Thanks!
>



Re: scalability questions

2015-01-19 Thread 'Cindy' via elasticsearch
Could you please tell me which Java API to use for the following 
REST call? I did quite a lot of searching but am not able to find an example 
of how to do it using the Java API.

PUT /forums/_alias/baking
{
  "routing": "baking",
  "filter": {
"term": {
  "forum_id": "baking"
}
  }
}
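
What I have pieced together so far looks like the sketch below, using an 
alias action that carries both the routing and the filter (the client setup 
is assumed, and I am not sure the class/method names are exactly right):

import org.elasticsearch.client.Client;
import org.elasticsearch.cluster.metadata.AliasAction;
import org.elasticsearch.index.query.FilterBuilders;

public class AliasSetup {
    // Sketch: create the filtered, routed alias "baking" on index "forums",
    // mirroring the REST call above.
    public static void createBakingAlias(Client client) {
        client.admin().indices().prepareAliases()
            .addAliasAction(AliasAction.newAddAliasAction("forums", "baking")
                .routing("baking")
                .filter(FilterBuilders.termFilter("forum_id", "baking")))
            .get();
    }
}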



Many thanks,

Cindy



On Wednesday, 14 January 2015 15:21:29 UTC-5, Ed Kim wrote:
>
> The shard identification/routing is completely arbitrary. For instance, 
> users who's usernames start from A-F can be routed to shard 1, G-M to shard 
> 2, etc. So you can imagine, user Ed, Cindy and user David data can live in 
> shard 1. Use Greg will have his data in shard 2.
>
> On Wednesday, January 14, 2015 at 12:14:50 PM UTC-8, Cindy wrote:
>>
>> Hi David,
>>
>>  
>>
>> The documentations you pointed out are exactly what I am looking for. 
>> They are really helpful and demonstrate the uniqueness of Elasticsearch on 
>> scalability :-)
>>
>>  
>>
>> I like the tips in "faking index per user with aliases" very much, but 
>> since it basically routes the request to a single shard, I just want to 
>> double check with you whether multiple users can share the same shard. 
>>
>> Thanks,
>> Cindy
>>
>>
>> On Wednesday, 14 January 2015 06:23:07 UTC-5, David Pilato wrote:
>>>
>>> I think I would start reading this: 
>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/kagillion-shards.html
>>> This 
>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/user-based.html
>>> and this 
>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/faking-it.html
>>>
>>> Actually the full chapter: 
>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scale.html
>>>  :)
>>>
>>> HTH
>>>
>>> -- 
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
>>> *
>>> @dadoonet  | @elasticsearchfr 
>>>  | @scrutmydocs 
>>> 
>>>
>>>
>>>  
>>> Le 14 janv. 2015 à 02:04, 'Cindy' via elasticsearch <
>>> elasti...@googlegroups.com> a écrit :
>>>
>>> Hello
>>>
>>> We are using some other search engine and consider moving to use 
>>> Elasticsearch. After done quite a lot reading, I am still not quite sure 
>>> what the optimized way should be in our case, especially after I read that 
>>> the number of shards can NOT be changed once the index is created.
>>>
>>>  
>>> In our situation, our product is hosted in cloud environment and has 
>>> rapid growing number of users, and each user is given various disk 
>>> space(several gigabytes to hundreds gigabytes) to import their datasets. We 
>>> index these datasets with fixed number of fields and the fields are all the 
>>> same for some purpose. Each user can only search in their own imported 
>>> datasets for security reason (segregated). So there is no need to query 
>>> against the entire index and query time is much more important than 
>>> indexing time. Our current query time is about 10 to 40 ms.
>>>
>>>  
>>> It's very crucial for us how to scale out horizontally smoothly.
>>>
>>>  
>>> If everything is added into one index with one type, I worried the 
>>> index/search will be getting slower and slower with growing of the size of 
>>> the indices. 
>>>
>>>  
>>> So I plan to split the indices to speed up query, and here are some 
>>> options
>>>
>>>1. Use one index and create a type for each user such that the query 
>>>from one user is directly against his own type. But since the number of 
>>>users can be over million, can elasticsearch be able to handle million 
>>>types in one index? 
>>>2. Group users into different indices such that the index/query can 
>>>be dispatched  to different indices, so a smaller index to query from. 
>>> But 
>>>this means our application has to handle the complexity of horizontal 
>>> scale 
>>>out. 
>>>
>>>  
>>> Is any option doable? Any option would you recommend?
>>>
>>>  
>>> Besides, could you please tell me how many shards one index should have 
>>> in best practice? Does too many shards also have performance hit?
>>>
>>> Many thanks,
>>> Cindy
>>>
>>>
>>>
>>>


Re: asciifolding character filter

2015-01-19 Thread joergpra...@gmail.com
Hey, cool idea. That's fairly easy to implement. I've just added a char
folding char filter into my version of ICU plugin

https://github.com/jprante/elasticsearch-plugin-bundle/commit/e4294cc0f4d45dabf50d840713820f8eb57152b6

Jörg

On Mon, Jan 19, 2015 at 7:18 PM, Mathijs Biesmans <
mathijs.biesm...@gmail.com> wrote:

> I'm curious whether there exists an asciifolding *character* filter, I
> know there is a asciifolding *token* filter and that the analysis chain
> works as follows: input text > char_filter > tokenizer > token filter >
> output tokens.
>
> The text on
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html
> mentions: [...]With Western languages, this can be done with the
> asciifolding *character* filter.[...], though the url says
> *asciifolding-token-filter*. An error in the docs?
>
> I also checked the icu-plugin: the *icu_normalizer* can be used both as a
> character filter and a token filter. But the *icu_folding* filter is only
> available as a token filter (that actually incorporates the icu_normalizer).
>
> I'm generating ngrams and shingles, so it seems more logical to aplpy
> ascii/icu folding as a character filter. But I can't find one?
>
>
>
>
>
>
>



In jmv gc log message, what are two numbers after [gc][old] and before duration?

2015-01-19 Thread Jinyuan Zhou
Hi, 
I copied this log message from some postings. There are two numbers between 
[gc][old] and the duration: [15221][169]. What are they?

[2014-04-16 16:32:57,505][WARN ][monitor.jvm  ] [elasticsearch2.trend1] 
[gc][old][15221][169] duration [1.3m], collections [1]/[1.3m], 
total [1.3m]/[12.6m], memory [29.1gb]->[25.4gb]/[29.9gb], 
all_pools {[young] [25mb]->[194.1mb]/[665.6mb]}{[survivor] [36.3mb]->[0b]/[83.1mb]}{[old] [29gb]->[25.2gb]/[29.1gb]}

Thanks,



Re: When searching for 'Boss' with fuzziness, get higher score for 'Bose' than 'Boss'. ???? How Comes !?!?

2015-01-19 Thread kaspersky_us via elasticsearch
I have the same problem, where some results with higher edit distance are 
ranked higher than other results that are closer in terms of edit distance.

I suspect it does have to do with document frequency, as you think Adrien. 
In my case I want to ignore document frequency completely. Any suggestion 
to achieve this? 

I'm a taker of any solution as this looks like a show stopper for us, so 
even a workaround would help. 

I can try to create this other rewrite method you mentioned if you could 
point me in the right direction. 

Thanks

On Thursday, January 15, 2015 at 7:44:57 AM UTC-8, Adrien Grand wrote:
>
> This is because the score takes two factors into account: the document 
> frequency and the edit distance. Quite likely in your case, even though 
> Boss is closer than Bose, Bose has a much lower document frequency which 
> helped it eventually get a better score. I guess we should have another 
> rewrite method that would not take freqs into account (or somehow merge 
> them) to avoid that issue.
>
> On Thu, Jan 15, 2015 at 4:06 PM, Eylon Steiner  > wrote:
>
>> Any ideas?
>>
>>
>>
>
>
>
> -- 
> Adrien Grand
>  



asciifolding character filter

2015-01-19 Thread Mathijs Biesmans
I'm curious whether there exists an asciifolding *character* filter. I know 
there is an asciifolding *token* filter and that the analysis chain works as 
follows: input text > char_filter > tokenizer > token filter > output 
tokens.

The text on 
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html
 
mentions: [...]With Western languages, this can be done with the 
asciifolding *character* filter.[...], though the url says 
*asciifolding-token-filter*. An error in the docs?

I also checked the icu-plugin: the *icu_normalizer* can be used both as a 
character filter and a token filter. But the *icu_folding* filter is only 
available as a token filter (that actually incorporates the icu_normalizer).

I'm generating ngrams and shingles, so it seems more logical to apply 
ascii/icu folding as a character filter. But I can't find one?








Regexp Filter boost

2015-01-19 Thread Michael Irwin
Hello,

I'm trying to figure out if it's possible to boost hits based on a regexp. 
For example, searching through records with user's names, I'd like to boost 
those that start with the query. I've tried a query like the following but 
it doesn't work like I'd like:

{
  "query": {
"function_score": {
  "functions": [
{
  "boost_factor": 2,
  "filter": {
"regexp": {
  "name_not_analyzed": "^frank.*"
}
  }
}
  ],
  "score_mode": "multiply",
  "query": {
"bool": {
  "must": [
{
  "query_string": {
"query": "frank",
"default_operator": "AND"
  }
},
{
  "term": {
"site_id": {
  "term": 1
}
  }
}
  ]
}
  }
}
  }
}

Is this possible? If so, how? Thanks!



Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-19 Thread Julien Naour
Ok it works with a simple saveJsonToEs("aft-{date}/analysis")
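
For reference, the equivalent through the es-hadoop Java API would be roughly 
the following (a sketch; the "date" field in each JSON line, the local 
es.nodes setting, and the sample documents are assumptions):

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

public class DailyIndexWriter {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("daily-index-writer")
            .setMaster("local[2]")               // assumption: local test run
            .set("es.nodes", "localhost:9200");  // assumption: local ES node
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Each JSON document carries a "date" field; es-hadoop resolves the
        // {date} pattern per document, so one call writes to all daily indices.
        JavaRDD<String> docs = sc.parallelize(Arrays.asList(
            "{\"date\":\"2014-01-01\",\"a\":5,\"b\":6,\"place\":\"ici\"}",
            "{\"date\":\"2014-01-04\",\"a\":5,\"b\":6,\"place\":\"la\"}"));

        JavaEsSpark.saveJsonToEs(docs, "aft-{date}/analysis");
        sc.stop();
    }
}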

Thanks Costin

2015-01-19 16:45 GMT+01:00 Julien Naour :

> Thanks for the reply Costin.
> I'm not really clear but the basic idea is to index data by day
> considering a feature available in each line
>
> Example of data (~700 millions lines for ~90days):
>
> 2014-01-01,05,06,ici
> 2014-01-04,05,06,la
>
> The first one have to be send to my-index-2014-01-01/my-type and the other
> my-index-2014-01-04/my-type
> I would like to do it without having to launch 90 saveJsonToES (using the
> elasticsearch-hadoop spark API)
>
> Is it more clear?
>
> It seems that the dynamic index could work for me. I'll try that right
> away.
>
> Thanks again
>
> Julien
>
>
>
> 2015-01-19 16:18 GMT+01:00 Costin Leau :
>
>> Hi Julien,
>>
>> I'm unclear of what you are trying to achieve and what doesn't work.
>> es-hadoop allows either a static index/type or a dynamic one [1] [2]. One
>> can also use a 'formatter' so for example
>> you can use a pattern like "{@timestamp:-MM-dd}" - meaning the field
>> @timestamp will be used as a target but first
>> it will formatted into year/month/day.
>>
>> There's work underway to extend that for API/real-time environments, if
>> the global settings (which are pluggable are not enough)
>> like Spark to customize the metadata per entry [3].
>>
>> Hope this helps,
>>
>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/
>> master/configuration.html#cfg-multi-writes
>> [2] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/
>> master/spark.html#spark-write-dyn
>> [3] https://github.com/elasticsearch/elasticsearch-hadoop/issues/358
>>
>>
>> On 1/19/15 4:50 PM, Julien Naour wrote:
>>
>>> I implemented a solution for my problem.
>>> I use a foreachPartitions and instantiate a bulk processor using a
>>> transport client (i.e. one by partition) to send
>>> documents.
>>> It's not fast but it works.
>>> Somebody have an idea to be more efficient?
>>>
>>> Julien
>>>
>>> On Thursday, January 15, 2015 at 4:40:22 PM UTC+1, Julien Naour wrote:
>>>
>>> My previous idea doesn't seem to work. Cannot send documents
>>> directly to _bulk only to "index/type" pattern
>>>
>>> On Thursday, January 15, 2015 at 4:17:57 PM UTC+1, Julien Naour
>>> wrote:
>>>
>>> Hi,
>>>
>>> I work on a complex workflow using Spark (Parsing, Cleaning,
>>> Machine Learning).
>>> At the end of the workflow I want to send aggregated results to
>>> elasticsearch so my portal could query data.
>>> There will be two types of processing: streaming and the
>>> possibility to relaunch workflow on all available data.
>>>
>>> Right now I use elasticsearch-hadoop and particularly the spark
>>> part to send document to elasticsearch with the
>>> saveJsonToEs(myindex, mytype) method.
>>> The target is to have an index by day using the proper template
>>> that we build.
>>> AFAIK you could not add consideration of a feature in a document
>>> to send it to the proper index in
>>> elasticsearch-hadoop.
>>>
>>> What is the proper way to implement this feature?
>>> Have a special step useing spark and bulk so that each executor
>>> send documents to the proper index considering
>>> the feature of each line?
>>> Is there something that I missed in elasticsearch-hadoop?
>>>
>>> Julien
>>>
>>>
>>
>> --
>> Costin
>>
>>
>
>


Stuck with "re-syncing mappings with cluster state for types ..." after upgrade 0.90.9 to 1.4.2

2015-01-19 Thread Eike Dehling
Hi List,

we have recently upgraded our ES cluster (12 nodes, ~900GB data) from 
0.90.9 to 1.4.2. We did a restart-upgrade, so backup data and then start 
the new ES version with the old data directories.

Since then we're seeing lots of messages like "re-syncing mappings with 
cluster state for types ..." in the log of the master node, about 5-10 
every second (once for each index). The load and network traffic on the 
machines also seem to have increased quite a bit.

Some googling showed me various older (0.17/0.18) questions about this, and 
that it's related to serialization of the mapping. Our mapping is here: 
https://gist.github.com/anonymous/33b7ba153794e739fd33 

One suggested quick fix I found was to close/re-open the affected indices. 
Any other suggestions? Or tips on how to debug why this is going wrong, and what 
to change in the mapping?

How serious is this? Does this need immediate attention, or should the 
cluster survive for a while?

best regards,
Eike Dehling

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/de51ebc3-ca47-475c-914d-91d7a747e69e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-19 Thread Julien Naour
Thanks for the reply Costin.
Maybe I wasn't really clear, but the basic idea is to index data by day based on
a value available in each line.

Example of data (~700 million lines for ~90 days):

2014-01-01,05,06,ici
2014-01-04,05,06,la

The first one has to be sent to my-index-2014-01-01/my-type and the other to
my-index-2014-01-04/my-type.
I would like to do it without having to launch 90 saveJsonToEs calls (using the
elasticsearch-hadoop Spark API).

Is that clearer?

It seems that the dynamic index could work for me. I'll try that right away.

Thanks again

Julien



2015-01-19 16:18 GMT+01:00 Costin Leau :

> Hi Julien,
>
> I'm unclear of what you are trying to achieve and what doesn't work.
> es-hadoop allows either a static index/type or a dynamic one [1] [2]. One
> can also use a 'formatter' so for example
> you can use a pattern like "{@timestamp:-MM-dd}" - meaning the field
> @timestamp will be used as a target but first
> it will formatted into year/month/day.
>
> There's work underway to extend that for API/real-time environments, if
> the global settings (which are pluggable are not enough)
> like Spark to customize the metadata per entry [3].
>
> Hope this helps,
>
> [1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/
> master/configuration.html#cfg-multi-writes
> [2] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/
> master/spark.html#spark-write-dyn
> [3] https://github.com/elasticsearch/elasticsearch-hadoop/issues/358
>
>
> On 1/19/15 4:50 PM, Julien Naour wrote:
>
>> I implemented a solution for my problem.
>> I use a foreachPartitions and instantiate a bulk processor using a
>> transport client (i.e. one by partition) to send
>> documents.
>> It's not fast but it works.
>> Somebody have an idea to be more efficient?
>>
>> Julien
>>
>> On Thursday, January 15, 2015 at 4:40:22 PM UTC+1, Julien Naour wrote:
>>
>> My previous idea doesn't seem to work. Cannot send documents directly
>> to _bulk only to "index/type" pattern
>>
>> On Thursday, January 15, 2015 at 4:17:57 PM UTC+1, Julien Naour wrote:
>>
>> Hi,
>>
>> I work on a complex workflow using Spark (Parsing, Cleaning,
>> Machine Learning).
>> At the end of the workflow I want to send aggregated results to
>> elasticsearch so my portal could query data.
>> There will be two types of processing: streaming and the
>> possibility to relaunch workflow on all available data.
>>
>> Right now I use elasticsearch-hadoop and particularly the spark
>> part to send document to elasticsearch with the
>> saveJsonToEs(myindex, mytype) method.
>> The target is to have an index by day using the proper template
>> that we build.
>> AFAIK you could not add consideration of a feature in a document
>> to send it to the proper index in
>> elasticsearch-hadoop.
>>
>> What is the proper way to implement this feature?
>> Have a special step useing spark and bulk so that each executor
>> send documents to the proper index considering
>> the feature of each line?
>> Is there something that I missed in elasticsearch-hadoop?
>>
>> Julien
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to
>> elasticsearch+unsubscr...@googlegroups.com > unsubscr...@googlegroups.com>.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/dad84f38-
>> cc36-4351-9d13-e0d1f461ebe9%40googlegroups.com
>> > cc36-4351-9d13-e0d1f461ebe9%40googlegroups.com?utm_medium=
>> email&utm_source=footer>.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
> --
> Costin
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/
> topic/elasticsearch/5-LwjQxVlhk/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/54BD2062.1050907%40gmail.com.
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAFGA2c2xpkS5EGYL8yDC%3DkjraR0nHoz0Dhm%2BAdR52GF1NV93jg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: shards flapping

2015-01-19 Thread singh.janmejay
I have had such a problem with a few shards. The allocation fails with the
following message:

RecoveryFailedException: Recovery failed from X into Y; nested:
RemoteTransportException Y; nested: RecoveryEngineException ... Phase[2]
Execution failed; nested: RemoteTransportException
...[index/shard/recovery/translogOps]]; nested:
MapperParsingException[failed to parse [...]]; nested:
ElasticsearchIllegalArgumentException[unknown property [foo]]; ]]

In this case it seems the dynamic mapping for property foo wasn't propagated to
all shards, so now a few shards that somehow don't know the mapping for foo
are not able to initialize on any node.

I need to find out why the mapping for field foo is unknown though.

Doesn't the log tell you why it's not able to initialize the shard?


On Mon, Jan 19, 2015 at 8:51 PM, daaku gee  wrote:

> By bouncing I mean the shard sits on one node for a little less than a
> minute then moves over to a second node, then back to the first node after
> a few seconds.  It stays in INITIALIZING state.
>
> On Mon, Jan 19, 2015 at 5:26 PM, Mark Walkom  wrote:
>
>> Bouncing? Does it allocate or just sit in a allocating state?
>>
>> On 20 January 2015 at 00:06, daaku gee  wrote:
>>
>>> I am running 3 node Elasticsearch-1.3 cluster.  One of the primary
>>> shards for an index that has no more documents being indexed to it, keeps
>>> bouncing from node to node. No its not relocating.
>>>
>>> Allocating the shard to any one  node doesn't work.
>>>
>>> What can I do to stabilize it?
>>>
>>> Thanks
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAH53p94GkK2iw3m4-4UKTv8JPORD6k%2B7DZ1jLYrfgT7LqX_20g%40mail.gmail.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ntBtgGMfM2YFx7wgUSdydjTW-s4XD5wxv-9af%3DkAA2g%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAH53p94mMXv5jMcE_wKuVfKzrkkX6LsFG%2BAyXaK41R3T%2Bry6Gw%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Regards,
Janmejay
http://codehunk.wordpress.com

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGB1VvwCjggJCwXuUo9R48ffuaMrxr297iRUMpm6Fm1eqHDSow%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: How highlighting actually works?

2015-01-19 Thread Karol Sikora
Thanks for your answer. I probably posted too few details; here is a better
description:

I'm using the postings highlighter, but I also checked plain and fvh - both were
remarkably slower in my case.
Fields text_content* are mapped through a dynamic template:
"dynamic_templates": [
{
  "text_content": {
"match": "text_content*",
"match_mapping_type": "string",
"mapping": {
  "type": "string",
  "analyzer": "polish",
  "index_options": "offsets"
}
  }
}
  ]
}

The polish analyzer is defined as follows (using this plugin:
https://github.com/monterail/elasticsearch-analysis-morfologik, which
provides the morfologik_stem token filter):
"analyzer": {
"polish": {
  "type": "custom",
  "tokenizer": "standard",
  "filter": [
"lowercase",
"morfologik_stem"
  ]
}
  }
Pure querying in text_content is always very fast - it takes <40ms.
The total time for executing the request increases as the number of
matched documents grows (more items added to the index).
So I've started thinking that the highlighter is working on all of the matched
documents, not only on the items requested by the current request (start and size
parameters). Is that correct? Is there some way to speed up such a case (forcing
highlighting only for the requested window of documents)?


Karol



2015-01-19 1:31 GMT+01:00 Nikolas Everett :

> Highlighting is complex and more hacky than you'd imagine at first glance.
> Each highlighter is different and we can't tell which one you are using
> without seeing your mapping. For the plain highlighter the cost is roughly
> proportional to the length of the highlighted field. So in your case its
> the cost to reanalyze every one of those pages.
>
> You could return which page is matched pretty cheaply if you were willing
> to write a plugin. Especially if you just wanted to know the first page or
> something.
>
> You could try using explain if you searched for text_content_*.  That'd
> tell you which field matched.
>
> Nik
> On Jan 18, 2015 6:21 PM, "Karol Sikora"  wrote:
>
>> Hi all,
>>
>> I have some specific requirements for highlighting. I need to search in
>> full content of item for phrase, and then show on which page searched
>> phrase is occuring. So i've created one field named text_content and fields
>> named text_content_{page_number} (text_content_1, text_content_2, etc.).
>> Example query is:
>> {
>> "highlight": {
>> "fields": {
>> "text_content_*": {}
>> }
>> },
>> "query": {
>> "match": {
>> "text_content": "lorem"
>> }
>> },
>> "size": 40
>> }
>>
>> I've noticed that this query is fast, but only if i have small number of
>> documents in index. Quiering for documents is always fast (<40ms), but
>> highlight phase time is growing when number of documents in index is
>> growing.
>> I've stared thinking that highlighting may be processed before appending
>> "size": 40 - on the all matched documents. It's correct? How can in speed
>> up such case?
>>
>> Regards,
>> Karol
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/b8354eb3-3a75-4999-a180-6493240eb0cc%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/FzSTLVWyok8/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAPmjWd221YctsJE3QrkqnffjXACNzcZ5WaiuR1Ucrr0DV_U_NA%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAN8rAyJC38RPkxZTxb0tvX1UcsW4mtO_f1tBRrpQ3ssSQQaXHA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: shards flapping

2015-01-19 Thread daaku gee
By bouncing I mean the shard sits on one node for a little less than a
minute then moves over to a second node, then back to the first node after
a few seconds.  It stays in INITIALIZING state.

On Mon, Jan 19, 2015 at 5:26 PM, Mark Walkom  wrote:

> Bouncing? Does it allocate or just sit in a allocating state?
>
> On 20 January 2015 at 00:06, daaku gee  wrote:
>
>> I am running 3 node Elasticsearch-1.3 cluster.  One of the primary shards
>> for an index that has no more documents being indexed to it, keeps bouncing
>> from node to node. No its not relocating.
>>
>> Allocating the shard to any one  node doesn't work.
>>
>> What can I do to stabilize it?
>>
>> Thanks
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAH53p94GkK2iw3m4-4UKTv8JPORD6k%2B7DZ1jLYrfgT7LqX_20g%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ntBtgGMfM2YFx7wgUSdydjTW-s4XD5wxv-9af%3DkAA2g%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAH53p94mMXv5jMcE_wKuVfKzrkkX6LsFG%2BAyXaK41R3T%2Bry6Gw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-19 Thread Costin Leau

Hi Julien,

I'm unclear on what you are trying to achieve and what doesn't work.
es-hadoop allows either a static index/type or a dynamic one [1] [2]. One can
also use a 'formatter', so for example you can use a pattern like
"{@timestamp:YYYY-MM-dd}" - meaning the field @timestamp will be used as a
target, but first it will be formatted into year/month/day.

There's work underway to extend that for API/real-time environments like Spark,
so that the metadata can be customized per entry if the global settings (which
are pluggable) are not enough [3].

Hope this helps,

[1] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#cfg-multi-writes
[2] 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/spark.html#spark-write-dyn
[3] https://github.com/elasticsearch/elasticsearch-hadoop/issues/358
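
To make the dynamic-resource option concrete, here is a minimal sketch of [2] from
Spark. It assumes every JSON document carries a "date" field (as in Julien's sample
lines) and that the index prefix is "my-index"; the app name, host and field name
are examples, not anything mandated by es-hadoop.

// Minimal sketch of a dynamic (multi-resource) write with elasticsearch-hadoop's Spark support.
// Assumption: every JSON document contains a "date" field such as "2014-01-01".
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds saveJsonToEs to RDD[String]

val conf = new SparkConf()
  .setAppName("daily-index-writer")
  .set("es.nodes", "localhost:9200") // example host

val sc = new SparkContext(conf)

val docs = sc.parallelize(Seq(
  """{"date":"2014-01-01","col1":"05","col2":"06","place":"ici"}""",
  """{"date":"2014-01-04","col1":"05","col2":"06","place":"la"}"""
))

// Each document is routed according to its own "date" field,
// e.g. my-index-2014-01-01/my-type and my-index-2014-01-04/my-type.
docs.saveJsonToEs("my-index-{date}/my-type")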

On 1/19/15 4:50 PM, Julien Naour wrote:

I implemented a solution for my problem.
I use a foreachPartitions and instantiate a bulk processor using a transport 
client (i.e. one by partition) to send
documents.
It's not fast but it works.
Somebody have an idea to be more efficient?

Julien

On Thursday, January 15, 2015 at 4:40:22 PM UTC+1, Julien Naour wrote:

My previous idea doesn't seem to work. Cannot send documents directly to _bulk only 
to "index/type" pattern

On Thursday, January 15, 2015 at 4:17:57 PM UTC+1, Julien Naour wrote:

Hi,

I work on a complex workflow using Spark (Parsing, Cleaning, Machine 
Learning).
At the end of the workflow I want to send aggregated results to 
elasticsearch so my portal could query data.
There will be two types of processing: streaming and the possibility to 
relaunch workflow on all available data.

Right now I use elasticsearch-hadoop and particularly the spark part to 
send document to elasticsearch with the
saveJsonToEs(myindex, mytype) method.
The target is to have an index by day using the proper template that we 
build.
AFAIK you could not add consideration of a feature in a document to 
send it to the proper index in
elasticsearch-hadoop.

What is the proper way to implement this feature?
Have a special step useing spark and bulk so that each executor send 
documents to the proper index considering
the feature of each line?
Is there something that I missed in elasticsearch-hadoop?

Julien

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to
elasticsearch+unsubscr...@googlegroups.com 
.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/dad84f38-cc36-4351-9d13-e0d1f461ebe9%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.


--
Costin

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54BD2062.1050907%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Where is the data stored? ElasticSearch YARN

2015-01-19 Thread Costin Leau
Installing a plugin or changing a configuration means restarting each node, and since YARN is not persistent, one would 
have to handle this outside.
es-yarn could potentially address that; however, at that point it becomes more of a puppet/chef feature, which is outside the 
scope of the project.
The simplest solution would be to simply modify the elasticsearch.zip that you are using, as that one would be installed 
on each node - whether it's a configuration or installing a plugin, as long as it's part of the zip, it will be distributed 
across each node.
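
As a sketch of the kind of change meant here: unpack elasticsearch.zip, point
path.data at a directory outside the disposable container folder in
config/elasticsearch.yml, and repack the zip. The path below is only an example
and must exist on every YARN node:

# config/elasticsearch.yml inside the repackaged elasticsearch.zip
path.data: /var/lib/elasticsearch-yarn/data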

On 1/17/15 6:35 PM, Dan Cieslak wrote:

The section in 
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/ey-setup.html#_storage
 does
not really describe _how_ to configure the different options, just that they 
are available. From the page:

Each container can currently access its local storage - with proper 
configuration this can be kept outside the
disposable container folder thus allowing the data to live between 
restarts. This is the recommended approach as it
offers the best performance and due to Elasticsearch itself, redundancy as 
well (through replicas).


But below it says:

If no storage is configured, out of the box Elasticsearch will use its 
container storage which means when the
container is disposed, so is its data. In other words, between restarts any 
existing data is destroyed.


What is _not_ described is how to configure storage, especially for the "recommended 
approach" where data would live
between restarts

So how would one configure elasticsearch-yarn for the recommended approach? 
Does one make changes in the
elasticsearch.zip's config files? If so, what settings?

Thanks
Dan

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to
elasticsearch+unsubscr...@googlegroups.com 
.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/d1c1cf45-853f-402b-a49a-6bd8c32bd876%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.


--
Costin

--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54BD1EA9.3080807%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch-hadoop for spark, index documents from a RDD in different index by day: myindex-2014-01-01 for example

2015-01-19 Thread Julien Naour
I implemented a solution for my problem.
I use foreachPartition and instantiate a bulk processor using a 
transport client (i.e. one per partition) to send the documents, roughly as 
sketched below.
It's not fast, but it works.
Does somebody have an idea to be more efficient?
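
The per-partition workaround looks roughly like the following sketch (ES 1.x Java
API from Scala; the cluster name, host, bulk size and the way the daily index name
is derived are all assumptions):

// One TransportClient + BulkProcessor per partition.
// docsByDay: RDD[(String, String)] of (day, jsonDocument) pairs - an assumed data shape.
import org.elasticsearch.action.bulk.{BulkProcessor, BulkRequest, BulkResponse}
import org.elasticsearch.action.index.IndexRequest
import org.elasticsearch.client.transport.TransportClient
import org.elasticsearch.common.settings.ImmutableSettings
import org.elasticsearch.common.transport.InetSocketTransportAddress

docsByDay.foreachPartition { docs =>
  val settings = ImmutableSettings.settingsBuilder().put("cluster.name", "my-cluster").build()
  val client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("es-host", 9300))
  val bulk = BulkProcessor.builder(client, new BulkProcessor.Listener {
    def beforeBulk(id: Long, request: BulkRequest): Unit = ()
    def afterBulk(id: Long, request: BulkRequest, response: BulkResponse): Unit = ()
    def afterBulk(id: Long, request: BulkRequest, failure: Throwable): Unit = ()
  }).setBulkActions(1000).build()

  docs.foreach { case (day, json) =>
    bulk.add(new IndexRequest("my-index-" + day, "my-type").source(json))
  }
  bulk.close()
  client.close()
}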

Julien

On Thursday, January 15, 2015 at 4:40:22 PM UTC+1, Julien Naour wrote:
>
> My previous idea doesn't seem to work. Cannot send documents directly to 
> _bulk only to "index/type" pattern
>
> On Thursday, January 15, 2015 at 4:17:57 PM UTC+1, Julien Naour wrote:
>>
>> Hi,
>>
>> I work on a complex workflow using Spark (Parsing, Cleaning, Machine 
>> Learning).
>> At the end of the workflow I want to send aggregated results to 
>> elasticsearch so my portal could query data.
>> There will be two types of processing: streaming and the possibility to 
>> relaunch workflow on all available data.
>>
>> Right now I use elasticsearch-hadoop and particularly the spark part to 
>> send document to elasticsearch with the saveJsonToEs(myindex, mytype) 
>> method.
>> The target is to have an index by day using the proper template that we 
>> build.
>> AFAIK you could not add consideration of a feature in a document to send 
>> it to the proper index in elasticsearch-hadoop.
>>
>> What is the proper way to implement this feature? 
>> Have a special step useing spark and bulk so that each executor send 
>> documents to the proper index considering the feature of each line?
>> Is there something that I missed in elasticsearch-hadoop?
>>
>> Julien
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/dad84f38-cc36-4351-9d13-e0d1f461ebe9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Configure Kibana for HTTPS

2015-01-19 Thread Karthik M
 I want the front end of ES (kibana) to run on SSL but keep the backend 
connection from Kibana to ES unencrypted since both are running on the same 
host. I configured Apache2 to accept SSL connections and it works but when 
Kibana populates the dashboard it get the below error. Any help is very 
much appreciated. Could not reach 
http://ec2-XX-XX-XX-XX.compute-1.amazonaws.com/elasticsearch/_nodes 
. If 
you are using a proxy, ensure it is configured correctly

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/00d13d00-3687-42ca-8b9a-b5c52612de48%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: knapsack use case

2015-01-19 Thread Matteo Moci
…just a small update in case anyone was wondering:

I completed an elasticsearch-knapsack export to a file from a 0.90.7 instance (with
the plugin built with 0.90.7 dependencies) that was correctly re-imported
in a 1.4.2 instance with the latest plugin version, including settings and
aliases.

I just checked out the source from github and changed 2-3 lines due to API
changes and assembled the plugin to be installed on a 0.90.7 instance.

Just wanted to say thanks!

Best,
Matteo


On Thu, Jul 31, 2014 at 11:33 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> Snapshot/restore is always recommended, but is a 1.0 feature. This is a
> standard API of ES and well supported by the ES team. With that, you can
> handle all kinds of index data safely on a binary level, fully and
> incrementally.
>
> Knapsack plugin is for document export/import only. I wrote it to
> transport _source data harvested over a long time period from a  < 1.0
> system to a production system. It works on _source or stored fields only.
> It uses search/query and bulk indexing API without snapshots, so it is up
> to the admin to stop index writes while knapsack runs. There is also a
> lookup of index settings and mappings, this information is also included in
> the export archive file, and re-applied at import time. But, there is no
> check if these settings/mappings can be applied on the target successfully,
> this is left to the admin to prepare plugins, analyzers, etc. Aliases are
> not transported but this is a good idea for improvement.
>
> Currently, knapsack plugin does not work on ES 1.3 but I am progressing to
> implement this. I am adding a Java-level API. Currently it is REST only.
>
> Jörg
>
>
> On Thu, Jul 31, 2014 at 11:05 AM, Matteo Moci  wrote:
>
>> Hi All,
>> I have some questions about the knapsack plugin [1].
>>
>> My idea to use the tool to do a backup to a file, starting from a 0.90.x
>> instance and then restore it on a different 1.2.x or 1.3.x instance. I see
>> it can't be done directly, copying to a local/remote cluster.
>>
>> Would it work doing an intermediate step with a file?
>> Or the backup still has metadata about the es version it was generated
>> from, making it impossible?
>>
>> Is the snapshot and restore feature [2] useful in my use case, or not?
>>
>> Is the knapsack plugin able to backup and restore also aliases and
>> mappings, or do I have to manually migrate them before restoring data?
>>
>> Thanks for the patience and the great work!
>> Matteo
>>
>> [1] https://github.com/jprante/elasticsearch-knapsack
>> [2]
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html
>>
>> --
>> Matteo Moci
>> http://mox.fm
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAONgFZ60jWViqzRVO6_U-rYo6dUzunE3ojv%2BR5U8HX1Lwp4PdA%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGrXthJjCchEf2oyvXKnSZyBp31nvnAeXwAZJaEkvnT5Q%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Matteo Moci
http://mox.fm

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAONgFZ6PvJyeF04ERyeb26LhXoHR%3DMMs5sc5KV2ASLF_UK6b%3Dw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Logstash Websocket input plugin

2015-01-19 Thread Mark Sherman
Hi,

Has anyone written a Logstash Websocket input plugin that implements a 
Websocket server? The one in the current version only supports a Websocket 
client.

Thanks,
Mark

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/57913fd7-40ad-4708-92be-a33b3481d335%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


template queries and mustache used for collections

2015-01-19 Thread Rafael Kubina
Hello @all,

I'm struggling with mustache and collections together with template queries:

my objects/docs:
{
 "id": 1,
 "knows": [2,3,4,5]
}
.. and so on

desired query to be rendered:
...
"query" : {
  "should": [
{ 
  "term": {"knows": 2 }
   },
{ 
  "term": {"knows": 3 }
   },
  ]
}
...

current mustache template (w/o errors while storing):
{
  "template" : {
"query" : {
  "filtered" : {
"filter" : {
  "query" : {
"should": [
  "{{#know}}{{ { \"term\": { \"knows\": {{know}} } } 
}}{{/know}}"
]
  }
}
  }
}
  }
}


passed params (throws  Parse Failure [Failed to parse source 
[{"query":{"filtered":{"filter":{"query":{"should":[" } } }} } } 
}}"]}]]]  ):
{
  "params" : {
"know" : [2,3]
  }
}


Is anyone able to help me out?


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a2dd823a-6270-48e3-b019-78dee5f71e49%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Only send requests directly to data notes and not master nodes?

2015-01-19 Thread Mark Walkom
Again, it's up to you. I'd suggest yes, and let the masters manage the
cluster. You may need more client nodes though.
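
For reference, a dedicated client node is just a node started with both roles
switched off in elasticsearch.yml:

node.master: false
node.data: false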

On 19 January 2015 at 19:48, Justin Zhu  wrote:

> Would all transport clients only connect to this client node? Right now we
> have them connecting to all 3 master node.
>
> On Sunday, January 18, 2015 at 8:43:08 PM UTC-8, Mark Walkom wrote:
>>
>> It depends on your use, but try adding one client in with 8GB heap and
>> see how you go.
>>
>> On 19 January 2015 at 16:48, Justin Zhu  wrote:
>>
>>> We give the master nodes 5gb of memory -- stats are showing low cpu &
>>> memory utilization. Would you still recommend the client only node? If so,
>>> how many & powerful?
>>>
>>>
>>> On Saturday, January 17, 2015 at 6:55:12 PM UTC-8, Mark Walkom wrote:

 Depends, sounds like you need a few client nodes if you are OOMing your
 masters (which, is a bad thing to happen to masters).

 On 18 January 2015 at 10:23, Justin Zhu  wrote:

> We have a 9 node cluster, 3 masters, 6 data. We've been using the java
> transport client, which connects to all 3 masters. Occassionaly the 
> masters
> become unresponsive and needs to restart.
>
> Should the transport clients connect to the 6 data nodes directly
> instead?
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google
> Groups "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to elasticsearc...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/elasticsearch/752c15cc-bf88-4219-8699-279466534696%40goo
> glegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/6d55f417-6584-46b4-83e5-daf1267e771e%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/08805992-9794-4465-b900-d067bcbed3a3%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X8wht-8HQ3t5B%3D5E9X5nXBRw%3D_318czUy0PZ_a__ZKq8g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: shards flapping

2015-01-19 Thread Mark Walkom
Bouncing? Does it allocate or just sit in an allocating state?

On 20 January 2015 at 00:06, daaku gee  wrote:

> I am running 3 node Elasticsearch-1.3 cluster.  One of the primary shards
> for an index that has no more documents being indexed to it, keeps bouncing
> from node to node. No its not relocating.
>
> Allocating the shard to any one  node doesn't work.
>
> What can I do to stabilize it?
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAH53p94GkK2iw3m4-4UKTv8JPORD6k%2B7DZ1jLYrfgT7LqX_20g%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_ntBtgGMfM2YFx7wgUSdydjTW-s4XD5wxv-9af%3DkAA2g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: A question about keyword_marker

2015-01-19 Thread Nassim
Thanks. Is a list of 10,000 words too big?
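
For reference, with a list that size the words would usually be kept in a file and
referenced from the filter definition rather than inlined in the settings; a sketch
(the file name is an example, resolved relative to the config directory):

"filter": {
  "my_keyword_marker": {
    "type": "keyword_marker",
    "keywords_path": "analysis/protected_words.txt"
  }
}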

Le dimanche 18 janvier 2015 22:29:22 UTC+1, Adrien Grand a écrit :
>
> Tokens are stored in a hash table, which provides random access in 
> constant time so I would not worry too much about performance. However, 
> these tokens will be stored in memory so you should keep the size of the 
> list reasonable.
>
> On Sun, Jan 18, 2015 at 4:58 PM, Nassim 
> > wrote:
>
>> Hi all,
>>
>> I would like to know if there is a limitation of the number of words that 
>> we can give to the keyword_marker instruction ? And if there is a big 
>> impact on the performance of ES ?
>>
>> Thank you ! 
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/039be4ae-c273-4ec8-b131-b4eeb7d38f5c%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Adrien Grand
>  

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/daa4715c-9d73-4080-9e94-95ca4d2c28e5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


shards flapping

2015-01-19 Thread daaku gee
I am running a 3-node Elasticsearch 1.3 cluster. One of the primary shards
for an index that has no more documents being indexed to it keeps bouncing
from node to node. No, it's not relocating.

Allocating the shard to any one node doesn't work.
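
(For context, "allocating" here refers to the ES 1.x reroute allocate command,
roughly as below, with the index, shard number and node name changed:)

POST /_cluster/reroute
{
  "commands" : [ {
    "allocate" : {
      "index" : "my-index",
      "shard" : 0,
      "node" : "node-1",
      "allow_primary" : true
    }
  } ]
}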

What can I do to stabilize it?

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAH53p94GkK2iw3m4-4UKTv8JPORD6k%2B7DZ1jLYrfgT7LqX_20g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elsticsearch JDBC river plugin metrics

2015-01-19 Thread joergpra...@gmail.com
As said, inserts/updates/deletes in the DB are not visible to the JDBC plugin and
cannot be detected by it. The metric is for the SQL select statement the plugin is
using. If you mean insert/update/delete in ES, there is no such thing; there are
indexing operations performed in bulk, which can be monitored by ES monitoring
plugins or ES metrics.

Jörg

On Mon, Jan 19, 2015 at 11:55 AM, 4m7u1  wrote:

> Okay. Thanks a lot. Is there any other way in which I can see the metrics.
> If I have indexed my db with 2 records in Elasticsearch, for that I can
> see the metrics as above. But say that, I have made one new
> insert/update/delete into my db and that is picked by elasticsearch, is
> there any way in which I can see the time taken for that one particular
> insert/update/delete?
>
> On Monday, January 19, 2015 at 3:36:47 PM UTC+5:30, Jörg Prante wrote:
>>
>> The numbers in brackets mean the average numbers over the last 1 / 5 / 15
>> minutes. This useful for a long running indexer.
>>
>> The "elapsed" time is the time since this metric counter is active.
>>
>> Jörg
>>
>> On Mon, Jan 19, 2015 at 10:41 AM, 4m7u1  wrote:
>>
>>> Hi,
>>> I've got a small doubt again.
>>>
>>> *[2015-01-19 15:06:08,410][INFO ][river.jdbc.RiverMetrics  ] pipeline
>>> org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@1cede282 complete: river
>>> jdbc/my_test_river metrics: 34 rows, 170.35844075426294 mean,
>>> (147.82076731011736 30.98274869393544 10.409690875802436), ingest metrics:
>>> elapsed 8 seconds, 57.57 MB bytes, 177.0 bytes avg, 6.848 MB/s*
>>>
>>> So here,
>>> Average no of rows picked per second is 170 and  what do the values in
>>> the brackets mean  (147.82076731011736 30.98274869393544
>>> 10.409690875802436).
>>> and also ingest metrics: elapsed 8 seconds mean that the total time
>>> taken in the corresponding poll is 8 seconds?
>>>
>>> Thank you.
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/4c9f209b-cefb-4709-a2b9-a98f971eecb9%
>>> 40googlegroups.com
>>> 
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/139a365e-aca7-45e3-b1d7-7bd3a08dfbda%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH8bvDfBAxsP0vi3GKpzDfTs1_dpH%3D62br9SaJYUBHwsA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elsticsearch JDBC river plugin metrics

2015-01-19 Thread 4m7u1
Okay. Thanks a lot. Is there any other way in which I can see the metrics? 
If I have indexed my db with 2 records in Elasticsearch, I can see the metrics 
for that as above. But say I have made one new insert/update/delete into my db 
and that is picked up by Elasticsearch; is there any way in which I can see the 
time taken for that one particular insert/update/delete?

On Monday, January 19, 2015 at 3:36:47 PM UTC+5:30, Jörg Prante wrote:
>
> The numbers in brackets mean the average numbers over the last 1 / 5 / 15 
> minutes. This useful for a long running indexer.
>
> The "elapsed" time is the time since this metric counter is active.
>
> Jörg
>
> On Mon, Jan 19, 2015 at 10:41 AM, 4m7u1 
> > wrote:
>
>> Hi,
>> I've got a small doubt again.
>>
>> *[2015-01-19 15:06:08,410][INFO ][river.jdbc.RiverMetrics  ] pipeline 
>> org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@1cede282 complete: river 
>> jdbc/my_test_river metrics: 34 rows, 170.35844075426294 mean, 
>> (147.82076731011736 30.98274869393544 10.409690875802436), ingest metrics: 
>> elapsed 8 seconds, 57.57 MB bytes, 177.0 bytes avg, 6.848 MB/s*
>>
>> So here,
>> Average no of rows picked per second is 170 and  what do the values in 
>> the brackets mean  (147.82076731011736 30.98274869393544 
>> 10.409690875802436).
>> and also ingest metrics: elapsed 8 seconds mean that the total time taken 
>> in the corresponding poll is 8 seconds?
>>
>> Thank you.
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/4c9f209b-cefb-4709-a2b9-a98f971eecb9%40googlegroups.com
>>  
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/139a365e-aca7-45e3-b1d7-7bd3a08dfbda%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


filter over record... its posible

2015-01-19 Thread Pablo Blasco
Is it possible to filter by a record field like cliente.cif? I don't get results 
with this query, but I do get results if I put it in a query... thanks

POST /EMPRESA-1/_search

{
  "from" : 0,
  "size" : 500,
  "query" : {
"filtered" : {
  "query" : {
"bool" : {
  "minimum_should_match" : "100%"
}
  },
  "filter" : {
"and" : {
  "filters" : [ {
"term" : {
  "idempresa" : 1
}
  }, {
"term" : {
  "idorganizacion" : "1"
}
  }, {
"term" : {
  "tipo" : "emi"
}
  }, {
"term" : {
  "cliente.cif" : "XX"
}
  } ]
}
  }
}
  },
  "explain" : true,
  "fields" : [ "idfactura", "numero", "serie", "tipo", 
"centro.denominacion", "cliente.razonsocial", "cliente.idcliente", 
"proveedor.razonsocial", "proveedor.idproveedor", "tieneNotas", 
"tieneAttachs", "total", "fecha", "peso", "filepath", "filename", 
"firmada_ok", "leida", "firmada", "enviada", "idmoneda" ],
  "sort" : [ {
"fecha" : {
  "order" : "desc"
}
  } ]
}

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4ae22819-1eac-434d-9f87-9cf431131987%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Correct way to use TransportClient connection object

2015-01-19 Thread joergpra...@gmail.com
This is an exception because TransportClient still has open requests and
was not closed in time, i.e. before Tomcat closed the web app and removed
the class loader.

Jörg

On Mon, Jan 19, 2015 at 4:37 AM, Subhadip Bagui  wrote:

> Hi,
>
> In the same context... some times when I'm shutting down tomcat getting
> the below exception. And other times it works. Any idea why ?
>
> Jan 19, 2015 8:59:30 AM org.apache.catalina.core.StandardContext
> listenerStop
> SEVERE: Exception sending context destroyed event to listener instance of
> class com.aricent.aricloud.es.service.ESClientFactory
> java.lang.NoClassDefFoundError:
> org/elasticsearch/transport/netty/NettyTransport$4
> at
> org.elasticsearch.transport.netty.NettyTransport.doStop(NettyTransport.java:403)
> at
> org.elasticsearch.common.component.AbstractLifecycleComponent.stop(AbstractLifecycleComponent.java:105)
> at
> org.elasticsearch.transport.TransportService.doStop(TransportService.java:100)
> at
> org.elasticsearch.common.component.AbstractLifecycleComponent.stop(AbstractLifecycleComponent.java:105)
> at
> org.elasticsearch.common.component.AbstractLifecycleComponent.close(AbstractLifecycleComponent.java:117)
> at
> org.elasticsearch.client.transport.TransportClient.close(TransportClient.java:268)
> at
> com.aricent.aricloud.es.service.ESClientFactory.shutdown(ESClientFactory.java:118)
> at
> com.aricent.aricloud.es.service.ESClientFactory.contextDestroyed(ESClientFactory.java:111)
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/4b948428-c260-4aef-ad82-93346e7488cb%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFv4vXGUHK%3DGR11UDb6KZ1N8jT7WMGeKkpj1UCcSwV45w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elsticsearch JDBC river plugin metrics

2015-01-19 Thread joergpra...@gmail.com
The numbers in brackets mean the average numbers over the last 1 / 5 / 15
minutes. This is useful for a long-running indexer.

The "elapsed" time is the time since this metric counter is active.

Jörg

On Mon, Jan 19, 2015 at 10:41 AM, 4m7u1  wrote:

> Hi,
> I've got a small doubt again.
>
> *[2015-01-19 15:06:08,410][INFO ][river.jdbc.RiverMetrics  ] pipeline
> org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@1cede282 complete: river
> jdbc/my_test_river metrics: 34 rows, 170.35844075426294 mean,
> (147.82076731011736 30.98274869393544 10.409690875802436), ingest metrics:
> elapsed 8 seconds, 57.57 MB bytes, 177.0 bytes avg, 6.848 MB/s*
>
> So here,
> Average no of rows picked per second is 170 and  what do the values in the
> brackets mean  (147.82076731011736 30.98274869393544 10.409690875802436).
> and also ingest metrics: elapsed 8 seconds mean that the total time taken
> in the corresponding poll is 8 seconds?
>
> Thank you.
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/4c9f209b-cefb-4709-a2b9-a98f971eecb9%40googlegroups.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH7GxfsFveCTCdk77JZ_sUySmtn33Peerk2UrRYV%2BDNuw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: JDBC plugin Feeder Mode

2015-01-19 Thread joergpra...@gmail.com
The "push model" works like this: a standalone JVM is running the JDBC
plugin, this plugin connects to an Elasticsearch cluster using the
TransportClient. Then, SQL statement(s) are executed, the rows are
processed, and indexed into Elasticsearch with the bulk processor.

Because the nodes in the cluster are not "pulling" data from an external
source into the cluster JVMs like river instances do, the standalone JDBC
plugin JVM is what I call "push" model.

The JDBC plugin does not recognize if there is new data in the DB.

Jörg

On Mon, Jan 19, 2015 at 8:01 AM, 4m7u1  wrote:

> Hi,
>
> This is what I've understood so far, JDBC plugin in Feeder mode is run as
> a bash script with parameters similar to river. The documentation says that
> it is a push model. Can anyone explain how does it work? If i have a new
> data pushed into my db, what role does the feeder play from here on?
>
> Thank you.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/54750384-49bd-471d-8cac-86aa9f43a9fb%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGCLs6XFVpCo4-YZpi%3DYEKZESKdXNo-Qi5R2Hmm8R3u%3DQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elsticsearch JDBC river plugin metrics

2015-01-19 Thread 4m7u1
Hi,
I've got a small doubt again.

*[2015-01-19 15:06:08,410][INFO ][river.jdbc.RiverMetrics  ] pipeline 
org.xbib.elasticsearch.plugin.jdbc.RiverPipeline@1cede282 complete: river 
jdbc/my_test_river metrics: 34 rows, 170.35844075426294 mean, 
(147.82076731011736 30.98274869393544 10.409690875802436), ingest metrics: 
elapsed 8 seconds, 57.57 MB bytes, 177.0 bytes avg, 6.848 MB/s*

So here, the average number of rows picked up per second is 170; what do the values 
in the brackets mean (147.82076731011736 30.98274869393544 10.409690875802436)? 
And does "ingest metrics: elapsed 8 seconds" mean that the total time taken 
in the corresponding poll is 8 seconds?

Thank you.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4c9f209b-cefb-4709-a2b9-a98f971eecb9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Performance issues when sending documents to multiple indexes at the same time.

2015-01-19 Thread Nawaaz Soogund
Hi Mark.


Thanks for getting back to us. What are the options should we want to keep 
our customers' data separate - like a Chinese wall strategy? Although it is 
technically possible to have them together, we have other operational and 
business reasons to keep them separate.

I'll try one replica x 3 shards with the setup we have, on one customer only, 
and post the findings :)
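
(i.e. creating each daily index with settings along these lines; the index name is
just an example:)

PUT /customer1-2015-01-19
{
  "settings" : {
    "number_of_shards" : 3,
    "number_of_replicas" : 1
  }
}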

Thanks

On Friday, January 16, 2015 at 11:15:29 PM UTC, Mark Walkom wrote:
>
> You've got too many replicas and shards. One shard per node (maybe 2) and 
> one replica is enough.
>
> You should be using the bulk API as well.
>
> What's your heap set to?
>
> Also consider combining customers into one index, it'll reduce the work 
> you need to do.
>  On 17/01/2015 4:07 am, "Nawaaz Soogund"  > wrote:
>
>> We are experiencing some performance issues or anomalies on a 
>> elasticsearch specifically on a system we are currently building.
>>
>>  
>>
>> *The requirements:*
>>
>> We need to capture data for multiple of our customers,  who will query 
>> and report on them on a near real time basis. All the documents received 
>> are the same format with the same properties and are in a flat structure 
>> (all fields are of primary type and no nested objects). We want to keep 
>> each customer’s information separate from each other.
>>
>>  
>>
>> *Frequency of data received and queried:*
>>
>> We receive data for each customer at a fluctuating rate of 200 to 700 
>> documents per second – with the peak being in the middle of the day.
>>
>> Queries will be mostly aggregations over around 12 million documents per 
>> customer – histogram/percentiles to show patterns over time and the 
>> occasional raw document retrieval to find out what happened a particular 
>> point in time. We are aiming to serve 50 to 100 customer at varying rates 
>> of documents inserted – the smallest one could be 20 docs/sec to the 
>> largest one peaking at 1000 docs/sec for some minutes.
>>
>>  
>>
>> *How are we storing the data:*
>>
>> Each customer has one index per day. For example, if we have 5 customers, 
>> there will be a total of 35 indexes for the whole week. The reason we break 
>> it per day is because it is mostly the latest two that get queried with 
>> occasionally the remaining others. We also do it that way so we can delete 
>> older indexes independently of customers (some may want to keep 7 days, 
>> some 14 days’ worth of data)
>>
>>  
>>
>> *How we are inserting:*
>>
>> We are sending data in batches of 10 to 2000 – every second. One document 
>> is around 900bytes raw.
>>
>>  
>>
>> *Environment*
>>
>> AWS C3-Large – 3 nodes
>>
>> All indexes are created with 10 shards with 2 replica for the test 
>> purposes
>>
>> Both Elasticsearch 1.3.2 and 1.4.1
>>
>>  
>>
>> *What we have noticed:*
>>
>>  If I push data to one index only, Response time starts at 80 to 100ms 
>> for each batch inserted when the rate of insert is around 100 documents per 
>> second.  I ramp it up and I can reach 1600 before the rate of insert goes 
>> to close to 1sec per batch and when I increase it to close to 1700, it will 
>> hit a wall at some point because of concurrent insertions and the time will 
>> spiral to 4 or 5 seconds. Saying that, if I reduce the rate of inserts, 
>> Elasticsearch recovers nicely. CPU usage increases as rate increases.
>>
>>  
>>
>> If I push to 2 indexes concurrently, I can reach a total of 1100 and CPU 
>> goes up to 93% around 900 documents per second.
>>
>> If I push to 3 indexes concurrently, I can reach a total of 150 and CPU 
>> goes up to 95 to 97%. I tried it many times. The interesting thing is that 
>> response time is around 109ms at the time. I can increase the load to 900 
>> and response time will still be around 400 to 600 but CPU stays up.
>>
>>
>> *Question:*
>>
>> Looking at our requirements and findings above, is the design convenient 
>> for what’s asked? Are there any tests that I can do to find out more? Is 
>> there any setting that I need to check (and change)? 
>>
>
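To make the bulk API and shard-count advice above concrete, here is a minimal sketch using the 1.x Java API. The index name, type name, batch contents, and the low shard/replica settings are illustrative assumptions, not values from the setup described in the quoted message.

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;

public class BulkIndexingSketch {

    public static void indexBatch(Client client, String index, Iterable<String> jsonDocs) {
        // Create the daily index up front with far fewer shards/replicas than 10/2.
        if (!client.admin().indices().prepareExists(index).get().isExists()) {
            client.admin().indices().prepareCreate(index)
                    .setSettings(ImmutableSettings.settingsBuilder()
                            .put("number_of_shards", 1)    // low shard count per the advice above (illustrative)
                            .put("number_of_replicas", 1)
                            .build())
                    .get();
        }

        // One bulk request per batch instead of one index request per document.
        BulkRequestBuilder bulk = client.prepareBulk();
        for (String json : jsonDocs) {
            bulk.add(client.prepareIndex(index, "event").setSource(json));
        }
        BulkResponse response = bulk.get();
        if (response.hasFailures()) {
            // Retry or log only the failed items rather than the whole batch.
            System.err.println(response.buildFailureMessage());
        }
    }
}

Keeping each bulk request to a few thousand documents (a few MB) and creating indexes from an index template rather than on every call would be the next refinements.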


Re: Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-01-19 Thread Lukáš Vlček
Hi Petr,

let me try to address some of your questions:

ad 1) I am not sure I understand what you mean. If you want to use a span
type of query, then simply use it instead of the query string query. In
particular, if you pass user input into the query, it is recommended NOT to
use the query string query; you should consider a different query type
instead (like a span query in your case).

ad 2) Not sure I fully understand, but I can see a match for some of the
requested features in span queries, like "slop". I would recommend reading
through the "Proximity Matching" chapter [1] to see how you can use "slop".

ad 3) The input that goes into span queries can go through the text analysis
process (if I am not mistaken). The fact that there are term queries behind
the scenes does not mean you cannot run your analysis first.
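
For illustration, here is a minimal sketch with the 1.x Java API of how
span_near and span_first map onto the kind of NEARx/FIRSTx operators asked
about. The index name, field name, and term values are hypothetical, and the
term values must already be in their analyzed form (e.g. lowercased/stemmed),
since span queries are term-level queries.

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.SpanNearQueryBuilder;

public class SpanQuerySketch {

    public static SearchResponse nearAndFirst(Client client) {
        // Roughly "quick NEAR/3 fox, in order": at most 3 positions apart, "quick" before "fox".
        SpanNearQueryBuilder near = QueryBuilders.spanNearQuery()
                .clause(QueryBuilders.spanTermQuery("body", "quick"))
                .clause(QueryBuilders.spanTermQuery("body", "fox"))
                .slop(3)
                .inOrder(true);

        // Roughly "FIRST/5 quick": "quick" must occur within the first 5 positions of the field.
        return client.prepareSearch("articles")
                .setQuery(QueryBuilders.boolQuery()
                        .must(near)
                        .must(QueryBuilders.spanFirstQuery(
                                QueryBuilders.spanTermQuery("body", "quick"), 5)))
                .get();
    }
}

A thin parser in front of the search box could translate user-entered
NEAR/FIRST operators into these builders, and the user's raw words could be
run through the analyze API (or the field's analyzer) first so the resulting
tokens are used as the span_term values.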

Maybe if you can share some of your configs/documents/queries, we can help
you more.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/proximity-matching.html

Regards,
Lukas

On Mon, Jan 19, 2015 at 10:02 AM, Petr Janský wrote:

> No one? :-(
>
> Petr
>
> On Tuesday, 13 January 2015 at 15:37:18 UTC+1, Petr Janský wrote:
>>
>> Hi there,
>>
>> I'm looking for a way to expose span_near and span_first functionality
>> to users via a search box in a GUI that uses the query string query.
>>
>>1. Is there an easy way to do it?
>>2. Will the Elasticsearch folks implement operators like NEARx, BEFOR,
>>AFTER, FIRSTx, LASTx to be able to search by (using the query string):
>>   - a specific maximum word distance between key words
>>   - the order of key words
>>   - the position of a key word counted from the start or end of the
>>   field text
>>3. Span queries only accept terms; is there a way to use words that
>>will be analysed by a language analyser (stemming etc.)?
>>
>>
>> Thanks
>> Petr
>>



Re: Pre-Registered Templates & Java API & passing collection to param

2015-01-19 Thread David Pilato
I’m afraid you cannot, as this fix has only been applied to master and the 1.x branch:
https://github.com/elasticsearch/elasticsearch/commit/fda1576d55fdd56fd44df6d4608dca74d33d1536
https://github.com/elasticsearch/elasticsearch/commit/350a2db8dfcaee134f85c8d8419970812a49ea2c

So it will be available in 1.5.
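
For reference, a minimal sketch of what the call looks like once the
Map<String, Object> signature is available in 1.5, mirroring the linked test
(the index, type, and template names are just the ones from that test):

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.script.ScriptService;

public class IndexedTemplateSketch {

    public static SearchResponse searchWithArrayParam(Client client) {
        // Values are plain Objects, so arrays/collections can be passed straight through.
        Map<String, Object> templateParams = new HashMap<>();
        templateParams.put("fieldParam", new String[] { "foo", "bar" });

        return client.prepareSearch("test")
                .setTypes("type")
                .setTemplateName("/mustache/4")                      // pre-registered (indexed) template
                .setTemplateType(ScriptService.ScriptType.INDEXED)
                .setTemplateParams(templateParams)
                .get();
    }
}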


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs




> On 19 Jan 2015, at 09:54, Rafael Kubina wrote:
> 
> Hello Group,
> 
> I'm facing a problem with pre-registered templates. I'm able to store them,
> and I'm able to call them via REST (passing params) - everything works like
> a charm :)
> But using the Java API there is a problem: the setTemplateParams method only
> accepts a Map<String, String> as a parameter.
> 
> I want to pass a collection of numbers to my template in order to get a
> result, without success, because I need to pass a Map<String, String> to my
> SearchRequestBuilder.
> 
> I´ve found this (this is similar to what I´m looking for): 
> https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/index/query/TemplateQueryTest.java#L393
> 
> Map<String, Object> arrayTemplateParams = new HashMap<>();
> String[] fieldParams = {"foo","bar"};
> arrayTemplateParams.put("fieldParam", fieldParams);
> 
> SearchResponse searchResponse = 
> client().prepareSearch("test").setTypes("type").setTemplateName("/mustache/4").setTemplateType(ScriptService.ScriptType.INDEXED).setTemplateParams(arrayTemplateParams).get();
> assertHitCount(searchResponse, 5);
> 
> 
> Neither in version 1.4.2 nor in 1.4.3-SNAPSHOT has the method signature changed.
> So how can I pass (non-string) collections to my template?
> 
> Cheers,
> Rafael
> 
> 



Re: Operators NEARx, BEFOR, AFTER, FIRSTx, LASTx

2015-01-19 Thread Petr Janský
No one? :-(

Petr

On Tuesday, 13 January 2015 at 15:37:18 UTC+1, Petr Janský wrote:
>
> Hi there,
>
> I'm looking for a way to expose span_near and span_first functionality
> to users via a search box in a GUI that uses the query string query.
>
>1. Is there an easy way to do it?
>2. Will the Elasticsearch folks implement operators like NEARx, BEFOR,
>AFTER, FIRSTx, LASTx to be able to search by (using the query string):
>   - a specific maximum word distance between key words
>   - the order of key words
>   - the position of a key word counted from the start or end of the
>   field text
>3. Span queries only accept terms; is there a way to use words that
>will be analysed by a language analyser (stemming etc.)?
>
>
> Thanks
> Petr
>



Pre-Registered Templates & Java API & passing collection to param

2015-01-19 Thread Rafael Kubina
Hello Group,

I'm facing a problem with pre-registered templates. I'm able to store them,
and I'm able to call them via REST (passing params) - everything works like
a charm :)
But using the Java API there is a problem: the setTemplateParams method
only accepts a Map<String, String> as a parameter.

I want to pass a collection of numbers to my template in order to get a
result, without success, because I need to pass a Map<String, String> to my
SearchRequestBuilder.

I've found this (it is similar to what I'm looking for):
https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/index/query/TemplateQueryTest.java#L393

Map<String, Object> arrayTemplateParams = new HashMap<>();
String[] fieldParams = {"foo", "bar"};
arrayTemplateParams.put("fieldParam", fieldParams);

SearchResponse searchResponse = client().prepareSearch("test")
        .setTypes("type")
        .setTemplateName("/mustache/4")
        .setTemplateType(ScriptService.ScriptType.INDEXED)
        .setTemplateParams(arrayTemplateParams)
        .get();
assertHitCount(searchResponse, 5);


Neither in version 1.4.2 nor in 1.4.3-SNAPSHOT has the method signature
changed. So how can I pass (non-string) collections to my template?

Cheers,
Rafael



elk 1.4.2 + kibana-3.1.2 Connection Failed, add "http.cors.enabled: true" does not help.

2015-01-19 Thread EShen
1. Installed Elasticsearch 1.4.2 (up and running; http://hostip:9200 works).
2. Installed Kibana 3.1.2.
3. Installed lighttpd.
All three are on the same node and are run by the same user/group.
Could anyone help?

The configuration is below.
1 - Elasticsearch config: added one line to elasticsearch.yml: http.cors.enabled: true
2 - Kibana config: added one line to /var/www/kibana3/config.js: elasticsearch: "http://localhost:9200",
3 - lighttpd config: /etc/lighttpd/conf.d/kibana3.conf

<VirtualHost *:80>

  #ServerName FQDN

  DocumentRoot /var/www/kibana3

  <Directory /var/www/kibana3>
    Allow from all
    Options -Multiviews
  </Directory>

  LogLevel debug
  ErrorLog /var/log/lighthttpd/error_log
  CustomLog /var/log/lighthttpd/access_log combined

  # Set global proxy timeouts
  <Proxy http://127.0.0.1:9200>
    ProxySet connectiontimeout=5 timeout=90
  </Proxy>

  # Proxy for _aliases and .*/_search
  <LocationMatch "^/(_nodes|_aliases|.*/_aliases|_search|.*/_search|_mapping|.*/_mapping|kibana-int/dashboard/.*/_create|kibana-int/dashboard/.*/_delete|kibana-int/temp.*)$">
    ProxyPassMatch http://127.0.0.1:9200/$1
    ProxyPassReverse http://127.0.0.1:9200/$1
  </LocationMatch>

  # Proxy for kibana-int/{dashboard,temp} stuff (if you don't want auth on /, then you will want these to be protected)
  <LocationMatch "^/(kibana-int/dashboard/|kibana-int/temp)(.*)$">
    ProxyPassMatch http://127.0.0.1:9200/$1$2
    ProxyPassReverse http://127.0.0.1:9200/$1$2
  </LocationMatch>

  # <Location />
  #   AuthType Basic
  #   AuthBasicProvider file
  #   AuthName "Restricted"
  #   AuthUserFile /etc/lighthttpd/conf.d/kibana-htpasswd
  #   Require valid-user
  # </Location>

</VirtualHost>

/etc/lighthttpd/lighthttpd.conf: 

server.modules = (

"mod_access",

"mod_alias",

"mod_compress",

"mod_redirect",

#   "mod_rewrite",

)


server.document-root= "/var/www"

server.upload-dirs  = ( "/var/cache/lighttpd/uploads" )

server.errorlog = "/var/log/lighttpd/error.log"

server.pid-file = "/var/run/lighttpd.pid"

server.username = "myusernamerunelk"

server.groupname= "myusergrouprunelk"


index-file.names= ( "index.php", "index.html",

"index.htm", "default.htm",

   " index.lighttpd.html" )


url.access-deny = ( "~", ".inc" )


static-file.exclude-extensions = ( ".php", ".pl", ".fcgi" )


## Use ipv6 if available

#include_shell "/usr/share/lighttpd/use-ipv6.pl"


dir-listing.encoding= "utf-8"

server.dir-listing  = "enable"


compress.cache-dir  = "/var/cache/lighttpd/compress/"

compress.filetype   = ( "application/x-javascript", "text/css", 
"text/html", "text/plain" )


include_shell "/usr/share/lighttpd/create-mime.assign.pl"

include_shell "/usr/share/lighttpd/include-conf-enabled.pl"
