Re: Documents not deleted when using DeleteRequest within BulkProcessor
No, as long as you have only one index behind this alias, indexing and deleting should both work. I don't see anything suspicious here. Any chance you could share your full code on GitHub? When you say that nothing happens, do you mean that you never get the debug log "Processing {} ..."? Or do you mean that the document has not been removed? How do you test all that?

-- David Pilato - Developer | Evangelist, elastic.co
@dadoonet | @elasticsearchfr | @scrutmydocs

On 30 Apr 2015, at 19:10, Diana Tuck wrote:

Thank you for the reply, David. We are using an alias to delete. Is that a problem? Indexing with the alias through the bulk processor works fine. There are no errors reported; the request just seems to disappear into oblivion. Here's our code for the BulkProcessor:

    public static BulkProcessor getBulkProcessor(Client client, int esConcurrencyLevel,
                                                 int esBulkSize, int esFlushInterval) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest bulkRequest) {
                LOG.debug("Processing {} requests in bulk process {}",
                        bulkRequest.numberOfActions(), executionId);
            }

            @Override
            public void afterBulk(long executionId, BulkRequest bulkRequest, BulkResponse response) {
                if (response.hasFailures()) {
                    for (BulkItemResponse item : response.getItems()) {
                        LOG.error("Processing to index '{}' failed for entity id {} with message {}",
                                item.getIndex(), item.getId(), item.getFailureMessage());
                    }
                }
            }

            @Override
            public void afterBulk(long executionId, BulkRequest bulkRequest, Throwable throwable) {
                LOG.error("Failed to process {} requests in bulk request {}: {}",
                        bulkRequest.numberOfActions(), executionId, throwable.getMessage());
                throwable.printStackTrace();
            }
        })
        .setBulkActions(esBulkSize)
        .setFlushInterval(TimeValue.timeValueSeconds(esFlushInterval))
        .setConcurrentRequests(esConcurrencyLevel)
        .build();
    }

Code for the delete request:

    bulkProcessor.add(new DeleteRequest(index.getIndexingAlias(), index.getType(), entityId));

where index.getIndexingAlias() is an alias (the same alias used for indexing, which works), the type is the document type "company", and entityId is the document ID. What data would be helpful? An example document, the index metadata, something else?

On Wednesday, April 29, 2015 at 9:53:41 PM UTC-7, David Pilato wrote:

Do you try to delete a doc using an alias? Any failure or error reported by the bulk processor? Hard to tell more without seeing the code / data.

David

On 30 Apr 2015, at 02:03, Diana Tuck wrote:

Trying to index/delete documents within one BulkProcessor object in the Java API. Indexing documents works great! Deleting, however, does not.

    bulkProcessor.add(new DeleteRequest(index.getIndexingAlias(), index.getType(), entityId));

Nothing happens. Any ideas?

--
You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
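When a delete seems to vanish, it can help to reproduce the bulk call outside the Java client and inspect the per-item response, since the BulkProcessor listener only reports failures it can see. Below is a minimal sketch (Python; the alias, type, and ids are hypothetical stand-ins for the values in the code above) of the NDJSON body that the _bulk endpoint expects for delete actions:

```python
import json

def bulk_delete_body(index, doc_type, ids):
    """Build the NDJSON payload for a bulk delete.
    Delete actions are one metadata line each, with no source line."""
    lines = [json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": doc_id}})
             for doc_id in ids]
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

# Hypothetical alias/type/ids mirroring the thread's DeleteRequest:
body = bulk_delete_body("company-alias", "company", ["42", "43"])
print(body)
```

POSTing this to localhost:9200/_bulk with curl and reading each item's "found" field in the response tells you whether the document actually existed under that alias; "found": false would mean the id never matched anything, which would explain deletes that silently do nothing.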
Re: Returning partial strings in Kibana visualisation
{"script": "_value.substring(0,8)"} should work for you; it needs the Groovy sandbox enabled.

On Wednesday, April 29, 2015 at 9:39:33 PM UTC+8, Stuart Kenworthy wrote:

I have a number of different load injector boxes and processes that generate load through our system under test. The tool in use produces masses of log output, but none of it is easily accessible or readable. I am therefore using ELK to process the loads with success; however, presenting some of the data is problematic. The process names have a structure of process_name_Stressnn_Thread_nn, but there are around 180 of them. Each process thread generates 1 of 11 different message types. The message types are only distinguishable by a 10-character substring within a field containing strings and semicolon-delimited text, generally in the same location (between character 60 and character 70). In Elasticsearch none of these fields are analyzed, as that makes the queries and results even messier in Kibana and poses the same problem when choosing analyzed elements of a field (only picking element 12 or 10-12).

When aggregation is done on either of these fields, message type is presented as the long string in the visualisation key with only the first 10-15 characters showing, and process name results in all 180 processes rather than the 7 process types. These processes are likely to change over time as we introduce new test scenarios and message types, so I do not want to hard-code them in case we miss something.

Is it possible to have Elasticsearch return substrings, partials, lefts, rights, etc. of a field and group them as such, rather than by the entire field content, so that all process_nameA entries are grouped together and all msg_typeA entries are grouped together? Ideally without code edits to either Elasticsearch or Kibana? Something in JSON Input such as { "field_length": 10 } or { "partial_start": 60, "partial_for": 15 } would suffice. This is akin to renaming keys, columns and rows.
Thanks
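In Kibana 4 the JSON Input box merges extra properties into the underlying terms aggregation, so the {"script": ...} suggestion above amounts to sending a scripted terms aggregation to Elasticsearch. A sketch of the request body this produces (Python; the field name is hypothetical, and dynamic Groovy scripting must be enabled on the cluster for the script to run):

```python
import json

def substring_terms_agg(field, start, end):
    """Terms aggregation bucketing on a substring of each field value.
    Inside the script, _value is the current value of `field`."""
    return {
        "size": 0,
        "aggs": {
            "msg_types": {
                "terms": {
                    "field": field,
                    "script": "_value.substring(%d,%d)" % (start, end),
                }
            }
        },
    }

print(json.dumps(substring_terms_agg("message_field", 60, 70), indent=2))
```

With this, all values sharing characters 60-70 fall into one bucket, which is the grouping asked for; the trade-off is that scripted aggregations are slower than plain terms aggregations and require scripting to be enabled.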
GeoNames, Autocomplete and boost
Hi. I'm trying to improve autocomplete search results on a GeoNames cities index. I have been using django-haystack, but have run into issues there, and may need to replace or bypass it. My question here pertains to indexing and querying with autocomplete across multiple fields.

Users expect to be able to use two-letter state abbreviations to narrow their city choices. For example, "San Francisco, CA" and "New York, NY" should have the cities you'd expect at the top of the list. However, that is not the case, and I think for different reasons. You can see the results below. It turns out that there are a lot of San Franciscos in the world! Searching for "San Francisco CA" retrieves:

    San Francisco, Caraga, 13, PH                                5.5191193
    San Francisco, Caraga, 13, PH                                5.5163627
    San Francisco, Calabarzon, 40, PH                            5.4498897
    San Francisco, Calabarzon, 40, PH                            5.281434
    San Francisco, Caraga, 13, PH                                5.281434
    San Francisco, California, CA, US                            5.2123656
    South San Francisco, California, CA, US                      4.3138
    San Francisco (El Calvito), Chiapas, 05, MX                  4.137272
    San Francisco, Baja California Sur, 03, MX                   4.137272
    San Francisco (Baños de Agua Caliente), Guanajuato, 11, MX   3.3008962

I would like to boost the state (region_code) value so that San Francisco and South San Francisco are at the top. For "New York NY" I get:

    Nyack, New York, US          3.0575132
    West Nyack, New York, US     2.670291
    South Nyack, New York, US    2.5124028
    Upper Nyack, New York, US    2.5124028

instead of what I want, which is "New York City, New York, US".

The autocomplete field is an EdgeNGram field called content_auto. It currently has the following format, which is what I want to return: CityName, RegionName, CountryCode. So I think what I want to do in both cases is boost results if there is a match on the region_code field, but *not* display the region_code field in the results. The type of the search is currently query_string, which is what haystack uses. If there is some way to make that work, then that would be good. However, I'm afraid it is limiting what I'm able to do.

I did some experiments. If I query directly with curl for SF using

    {
      "query": {
        "multi_match": {
          "query": "San Francisco CA",
          "type": "cross_fields",
          "fields": ["content_auto", "region_code^3"]
        }
      }
    }

I get a result I'm satisfied with. However, the similar query using "New York NY" puts the city as the sixth result! I also tried putting the region_code in the content_auto string and boosting the region_code field. Also, the following works for SF, but I have no way of knowing in advance what the region_code is going to be. It ranks New York City third, and I would have to pick out two-letter combinations:

    "default_field": "text",
    "default_operator": "OR",
    "query": "(content_auto:(san) AND content_auto:(francisco)) CA^1.5"

It would really help if someone could help me limit my *own* queries about how Elasticsearch works, so that I can focus on the best approach! Thanks in advance for your help :-)

    curl 'localhost:9200/cities/_mapping?pretty'
    {
      "cities": {
        "mappings": {
          "modelresult": {
            "_boost": { "name": "boost", "null_value": 1.0 },
            "properties": {
              "content_auto": { "type": "string", "analyzer": "edgengram_analyzer" },
              "django_ct": { "type": "string", "index": "not_analyzed", "include_in_all": false },
              "django_id": { "type": "string", "index": "not_analyzed", "include_in_all": false },
              "id": { "type": "string" },
              "location": { "type": "geo_point" },
              "region_code": { "type": "string", "analyzer": "snowball" },
              "text": { "type": "string", "analyzer": "snowball" }
            }
          }
        }
      }
    }

NY example:

    curl -XGET 'http://localhost:9200/cities/modelresult/_search?pretty' -d '{
      "from": 0,
      "query": {
        "filtered": {
          "filter": { "terms": { "django_ct": ["cities.city"] } },
          "query": {
            "query_string": {
              "analyze_wildcard": true,
              "auto_generate_phrase_queries": true,
              "default_field": "text",
              "default_operator": "AND",
              "query": "(content_auto:(new) AND content_auto:(york,) AND content_auto:(ny))"
            }
          }
        }
      },
      "size": 10,
      "sort": [{ "_score": { "order": "desc" } }]
    }'
    {
      "took": 2,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 4,
        "max_score": 3.0575132,
        "hits": [
          {
            "_index": "cities",
            "_type": "modelresult",
            "_id": "cities.city.5129433",
            "_score": 3.0575132,
            "_source": { "django_id": "5129433", "region_code": "NY",
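One way to boost a state match without displaying the state is to move away from query_string toward a bool query: a must clause on content_auto plus a should clause on region_code that only raises the score when it matches. This is a sketch of that request body (Python; the field names come from the mapping above, but the boost factor of 3 is an arbitrary starting point to tune), not a drop-in haystack fix:

```python
import json

def city_autocomplete_query(text, region_boost=3.0):
    """Bool query: the typed text must match the autocomplete field;
    a match on region_code is optional but boosts the score."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"content_auto": {"query": text}}},
                ],
                "should": [
                    {"match": {"region_code": {"query": text, "boost": region_boost}}},
                ],
            }
        }
    }

print(json.dumps(city_autocomplete_query("San Francisco CA"), indent=2))
```

Because region_code appears only in the should clause, "New York NY" still matches documents whose region is not NY; documents with region_code NY simply score higher, which is the ranking behaviour described as the goal, and region_code never needs to appear in the returned display string.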
Re: How to replicate this type of search
Thanks John Ivan for your insight. Very helpful. Just to see if I'm getting this: in order to have a search where my users can type "title: {query term}" to limit the search only to titles, I need to program my application to parse the query string and then add the additional filters to the ES request. Do I have this right?
Re: too many open files problems and suggestions on cluster configuration
How do I calculate the best number of shards?

On Friday, May 1, 2015 at 18:21:47 UTC+3, David Pilato wrote:

Add more nodes or reduce the number of shards per node.

-- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 1 May 2015, at 17:05, Ann Yablunovskaya wrote:

I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
Re: How to replicate this type of search
Thanks John! It's all very clear now.
Re: Marvel license file/order number baked into a container
For anyone else coming across this, an Elasticsearch engineer confirmed to me in this question https://groups.google.com/d/msg/elasticsearch/CFUZp6j5TOc/_muPZUAMz_kJ that it's not currently possible, but a feature request for it has now been opened.

On Wednesday, April 29, 2015 at 3:07:21 PM UTC-4, Joel Potischman wrote:

Hi Boaz. I know this is an old thread, but I can't find anything newer and it seems very related to this issue. Is there documentation anywhere on how to script the initial installation of the license? We have not yet run Marvel in production and we do not allow manual steps in our deployment process. We'd like to be able to deploy Marvel to a completely clean box and have the license already present at the end of the deploy. Can you point me to an example curl (dummy values, obviously)?

On Monday, October 13, 2014 at 8:48:45 AM UTC-4, Boaz Leskes wrote:

> .marvel-kibana now has a 'state-2' file inside it. And obviously for now, since I didn't restart or do anything of that nature, I am not asked for the license details. I wonder if what I see is that .marvel-kibana is only stored with one primary and one replica, and when I reload the cluster I sometimes happen to load first the nodes which don't have .marvel-kibana, so that's why I get that question about the license.

The order in which you restart the nodes shouldn't really matter. Try searching (GET .marvel-kibana/_search) and see that you get the license document back?

> Anyway, where would Marvel show me how to add the license from the command line? I'd like to see that because I haven't run into such a prompt yet.

Marvel only shows it if it fails to save it to the cluster. If you want the command, it's not a problem, but please reach out off-list with your details.

On Fri, Oct 10, 2014 at 9:13 PM, Daniel Schonfeld wrote:

Boaz, .marvel-kibana now has a 'state-2' file inside it. And obviously for now, since I didn't restart or do anything of that nature, I am not asked for the license details. I wonder if what I see is that .marvel-kibana is only stored with one primary and one replica, and when I reload the cluster I sometimes happen to load first the nodes which don't have .marvel-kibana, so that's why I get that question about the license. Anyway, where would Marvel show me how to add the license from the command line? I'd like to see that because I haven't run into such a prompt yet. Thanks!

On Thursday, October 9, 2014 2:40:58 PM UTC-4, Boaz Leskes wrote:

That's weird. You should look for the content of the .marvel-kibana index; that's where the license is stored when you enter it in the UI. Is the Marvel UI allowed to post back to ES? If that's blocked, it may explain things. Normally you will get a message from Marvel instructing you how to add the license from the command line.

On 9 Oct 2014, at 6:06 PM, Daniel Schonfeld wrote:

Hi Boaz, No, the data folder is persisted, and with it I have all my cluster and indices data... but for some reason Marvel asks for the license/order number again. Is there a file I can check for in my data folder? Thanks! Daniel

On Thursday, October 9, 2014 5:46:52 AM UTC-4, Boaz Leskes wrote:

Hi Daniel, When you restart the cluster, do you also wipe all content? The Marvel license should persist once entered, but if you clean the data folder, it will go away as well. Cheers, Boaz

On Wednesday, October 8, 2014 6:12:19 AM UTC+2, Daniel Schonfeld wrote:

Hello, We have recently purchased our Marvel license, but every time we restart our cluster it asks us for the order number again. We use Docker, so our containers are immutable. Is there a file or something we can change in the filesystem that will bake the license key into the container? Thanks! Daniel Schonfeld
too many open files problems and suggestions on cluster configuration
I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
Re: How to replicate this type of search
Yes, that's correct. Depending on your application, it might be easier to offer the filters as a dropdown, similar to how sites like Amazon let you choose which department to search within. Otherwise you'll have to rely on your users to type the correct syntax, which might be what you want if you're aiming for more flexibility.

On Friday, May 1, 2015 at 11:26:38 AM UTC-4, Peter Sorensen wrote:

Thanks John Ivan for your insight. Very helpful. Just to see if I'm getting this: in order to have a search where my users can type "title: {query term}" to limit the search only to titles, I need to program my application to parse the query string and then add the additional filters to the ES request. Do I have this right?
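The parsing step described above can stay very small. A sketch (Python; the field whitelist is hypothetical) of splitting an optional "field: terms" prefix off the user's input before building the Elasticsearch request:

```python
import re

ALLOWED_FIELDS = {"title", "author", "body"}  # hypothetical whitelist of searchable fields

def parse_user_query(raw):
    """Split an optional 'field: terms' prefix from a user query.
    Returns (field, terms); field is None when no recognised prefix is present."""
    match = re.match(r"^\s*(\w+)\s*:\s*(.+)$", raw)
    if match and match.group(1).lower() in ALLOWED_FIELDS:
        return match.group(1).lower(), match.group(2).strip()
    return None, raw.strip()

print(parse_user_query("title: moby dick"))  # field-restricted search
print(parse_user_query("moby dick"))         # ordinary search
```

The whitelist matters: without it, any text containing a colon would be misread as a field restriction. When the returned field is not None, the application adds a match or filter on that field; otherwise it searches the default fields.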
Re: too many open files problems and suggestions on cluster configuration
The number of open files does not depend on the number of documents. A shard does not come for free: each shard can take around ~150 open file descriptors (sockets, segment files), and up to 400-500 if it is actively being indexed. Take care with the number of shards: if you have 5 shards per index and 2000 indices per node, you would have to provide 10k * 150 open file descriptors. That is a challenge even on a single RHEL 7 system, which provides 131072 file descriptors by default (cat /proc/sys/fs/file-max); the default is already very high, and you would still have to raise the system limits.

I recommend using fewer shards and redesigning the application for fewer indices (or even a single index) if you are limited to 2 nodes. Shard routing and index aliasing may help:

http://www.elastic.co/guide/en/elasticsearch/guide/master/kagillion-shards.html
http://www.elastic.co/guide/en/elasticsearch/guide/master/faking-it.html

Jörg

On Fri, May 1, 2015 at 5:05 PM, Ann Yablunovskaya wrote:

I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, and about 800 million documents. If I try to insert more documents and create more indices, I get a "too many open files" error. My node configuration:

    CentOS 7
    Intel(R) Xeon(R) CPU x16
    RAM 62 GB
    # ulimit -n 10

In the future I will have many more indices (about 2000) and many more documents (~5 billion or maybe more). How can I avoid the "too many open files" error?
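The arithmetic behind that warning is easy to sanity-check. A sketch (Python; the ~150 descriptors-per-shard figure is the rough rule of thumb from the post above, not an exact number):

```python
def fds_per_node(indices, shards_per_index, replicas, nodes, fds_per_shard=150):
    """Rough estimate of file descriptors each node needs for its shards."""
    total_shards = indices * shards_per_index * (1 + replicas)  # primaries + replicas
    shards_per_node = total_shards / float(nodes)
    return int(shards_per_node * fds_per_shard)

# 2000 indices with the default 5 shards and 1 replica, spread over 2 nodes:
print(fds_per_node(indices=2000, shards_per_index=5, replicas=1, nodes=2))
```

That comes to 1.5 million descriptors per node, which is why consolidating to far fewer indices (or a single routed index, per the linked chapters) is the structural fix rather than just raising ulimits.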
Re: Perma-Unallocated primary shards after a node has left the cluster
Probably super evident, but the output above was actually from _cat/allocation?v, not /recovery; sorry about that.

On Wednesday, April 29, 2015 at 5:19:08 PM UTC-7, Alex Schokking wrote:

Hi guys, I would really appreciate some help understanding what's going on with shard allocation in this case.

Elasticsearch version: 1.4.4

We had 3 nodes with 1 shard and 1 replica per index (so 2 copies of everything). One node went down and the cluster went red. It started to reallocate shards as expected, and there were originally ~50 unallocated shards, 15 of them primaries and the rest replicas. It's been a few hours now and there are still 15 outstanding shards, all primaries, that don't seem to be getting reallocated. I thought this would be a pretty standard scenario, so I was really hoping I wouldn't need to manually walk through and reallocate the primary shards, but I'm not sure what else to try at this point to get back to green. Any pointers would be really appreciated.

Here are some of the relevant-seeming bits folks asked about on IRC. In the ES logs for the unallocated index names there are lines along the lines of:

    [2015-04-29 22:08:22,803][DEBUG][action.admin.indices.stats] [Agent Axis]
    [webaccesslogs-2015.04.24][0], node[-r2iQnH4R-mcUy4NicCB5g], [P], s[STARTED]:
    failed to execute [org.elasticsearch.action.admin.indices.stats.IndicesStatsRequest@6a564a91]
    org.elasticsearch.transport.SendRequestTransportException:
    [Jean-Paul Beaubier][inet[/10.155.165.126:9300]][indices:monitor/stats[s]]

Jean-Paul Beaubier is the node that went down.

_cat/allocation output:

    shards disk.used disk.avail disk.total disk.percent host              ip             node
    420    21.2gb    77gb       98.3gb     21           ip-10-234-164-148 10.234.164.148 Agent Axis
    420    41gb      57.2gb     98.3gb     41           ip-10-218-145-237 10.218.145.237 Ebon Seeker
    15                                                                                   UNASSIGNED

I'm trying to understand why it's stuck in this state, given that there is no other info in the logs, as far as I can tell, about why the shards can't be allocated. Shouldn't the replicas just be promoted in place to new primaries, and new replicas then created on the other node?

Thanks and regards -- Alex
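If the cluster never promotes the remaining copies on its own, Elasticsearch 1.x lets you force the allocation by hand through the cluster reroute API. Below is a sketch of one reroute command body (Python; the index, shard number, and node name are placeholders modelled on the thread). Note that allow_primary can lose data if a better copy of the shard later rejoins the cluster, so it is a last resort, not the standard recovery path:

```python
import json

def allocate_primary_cmd(index, shard, node):
    """Cluster-reroute command forcing an unassigned shard onto `node`.
    allow_primary permits promoting the copy there to primary."""
    return {
        "commands": [
            {
                "allocate": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "allow_primary": True,
                }
            }
        ]
    }

# Placeholder values based on the index and surviving node named above:
print(json.dumps(allocate_primary_cmd("webaccesslogs-2015.04.24", 0, "Ebon Seeker")))
```

The body is POSTed to localhost:9200/_cluster/reroute, with one command per stuck shard.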
Max documents 10,500?
Greetings,

I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows:

1. Both machines are current 64-bit Linux machines with at least 8 GB of RAM and more than sufficient disk space.
2. Both have current 64-bit Java 7 and Elasticsearch 1.5.1. ES is running locally on each machine.
3. Both machines run the exact same program to load up ES. Each has nearly default ES config files (just different names).
4. The program keeps a counter of the number of times documents are added to ES, and the return code of each add is checked. Both counters show 50,000+.
5. When I run the same query on each machine with curl, the good machine shows a max_score of 8.2 and the bad machine shows .499 (remember, same set of documents and same search query).

I've spent a day on this and I am running out of ideas. Any help would sure be appreciated.

Blake McBride
Re: Evaluating Moving to Discourse - Feedback Wanted
Moving away from mailing lists for anything except announcements would be awesome. Forums are a much better way to have solid discussions with multiple people involved. Email is fine, but when you add in more than a couple of people, it gets confusing fast. Forums are also far more user-friendly for people who haven't learned the various ways developers communicate.

That said, if the forum idea is scrapped, please be sure to stick with Google Groups or something similar. Don't switch to something like the Debian user lists. Every time a search result pops up from a list like that when I am looking for help, I can never figure out whether I've seen all the emails in the thread or not. The interface is just horrid. Google Groups at least has conversation view.

On a similar subject, is there any chance we could get a real-time chat app that is more user-friendly than IRC? Does something exist that could sit on top of IRC and alleviate its user-unfriendliness?

On Thursday, April 2, 2015 at 8:36:33 AM UTC-7, leslie.hawthorn wrote:

Hello everyone,

As we’ve begun to scale up development on three different open source projects, we’ve found Google Groups to be a difficult solution for dealing with all of our needs for community support. We’ve got multiple mailing lists going, which can be confusing for new folks trying to figure out where to go to ask a question. We’ve also found our lists are becoming noisy in the “good problem to have” kind of way. As we’ve seen more user adoption, and across such a wide variety of use cases, we’re getting widely different types of questions asked. For example, I can imagine that folks not using our Python client would rather not be distracted with emails about it. There are also a few other strikes against Groups as a tool, such as the fact that it is no longer a supported product by Google, it provides no API hooks, and it is not available for users in China.
We’ve evaluated several options and we’re currently considering shuttering the elasticsearch-user and logstash-users Google Groups in favor of a Discourse forum. You can read more about Discourse at http://www.discourse.org

We feel Discourse will allow us to provide a better experience for all of our users for a few reasons:

* More fine-grained conversation topics = less noise and better targeted discussions. E.g. we can offer a forum for each language client, for an individual Logstash plugin, or for each city to plan user group meetings, etc.
* Facilitates discussions that are not generally happening on list now, such as best practices by use case or tips on moving from development to production
* Easier for folks who are purely end users, and less used to getting peer support on a mailing list, to get help when they need it

Obviously, Discourse does not function exactly the same way as a mailing list; however, email interaction with Discourse is supported and will continue to allow you to participate in discussions over email (though there are some small issues related to in-line replies [0]). We’re working with the Discourse team now as part of evaluating this transition, and we know they’re working to resolve this particular issue. We’re also still determining how Discourse will handle our needs for both user and list archive migration, and we’ll know the precise details of how that would work soon. (We’ll share when we have them.)

The final goal would be to move Google Groups to read-only archives and cut over to Discourse completely for community support discussions. We’re looking at making the cutover in ~30 days from today, but obviously that’s subject to the feedback we receive from all of you. We’re sharing this information to set expectations about the time frame for making the switch. It’s not set in stone.
Our highest priority is to ensure effective migration of our list archives and subscribers, which may mean a longer time horizon for deploying Discourse as well. In the meantime, though, we wanted to communicate early and often and get your feedback. Would this change make your life better? Worse? Meh? Please share your thoughts with us so we can evaluate your feedback. We don’t take this switch lightly, and we want to understand how it will impact your overall workflow and experience. We’ll make regular updates to the list responding to incoming feedback and be completely transparent about how our thought processes evolve based on it. Thanks in advance!

[0] - https://meta.discourse.org/t/migrating-from-google-groups/24695

Cheers,
LH

Leslie Hawthorn
Director of Developer Relations
http://elastic.co

Other Places to Find Me:
Freenode: lh
Twitter: @lhawthorn
Re: Max documents 10,500?
If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blake1...@gmail.com wrote: Greetings, I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows: 1. Both machines are current 64 bit Linux machines with at least 8GB of RAM and more than sufficient disk space. 2. Both have current 64 bit Java 7 and elasticsearch 1.5.1. ES is running local to each machine. 3. Both machines are running the exact same program to load up ES. Each has nearly default ES config files (just different names). 4. The program keeps a counter of the number of times documents are added to ES, and the return codes of each add is checked. Both are 50,000+. 5. When I do the same query on each machine with curl, the good machine shows a max_score of 8.2, the bad machine shows .499 - remember, same set of documents and same search query. I've spent a day on this, and I am running out of ideas. Any help would sure be appreciated. Blake McBride
Re: Max documents 10,500?
Could you compare disk size (/data dir) for your two elasticsearch instances? Also, could you GIST the result of a simple _search?pretty on both nodes? -- David Pilato - Developer | Evangelist elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:58, Blake McBride blake1...@gmail.com wrote: No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co http://elastic.co/ @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! 
On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: Could you add a counter in your JS app to make sure you sent all docs? I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blak...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blak...@gmail.com wrote: [...]
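This chunk of the thread never pinpoints where the documents go, but with fire-and-forget `client.create` calls one plausible failure mode is that errors surface only in callbacks and nothing waits for them before checking `_count`. A minimal sketch of the idea, using a hypothetical `makeTracker` helper and a stub client (neither is part of the elasticsearch library): tally every callback and only compare against `_count` once all of them have fired.

```javascript
// Sketch: wrap the callback-style addDoc pattern from the thread so
// successes and failures are tallied, and we know when *all* callbacks
// have returned. `makeTracker` is a hypothetical helper, not an ES API.
function makeTracker(total, onDone) {
  const stats = { ok: 0, failed: 0 };
  return {
    stats,
    callback(err, response, status) {
      if (err || status !== 201) stats.failed++;
      else stats.ok++;
      if (stats.ok + stats.failed === total) onDone(stats);
    },
  };
}

// Stub client standing in for the real one: fails every 5th create,
// the way a flaky injector might.
const stubClient = {
  create(params, cb) {
    if (Number(params.id) % 5 === 0) cb(new Error('simulated failure'), undefined, 500);
    else cb(undefined, { created: true }, 201);
  },
};

const TOTAL = 20;
const tracker = makeTracker(TOTAL, (stats) => {
  // Only at this point is it meaningful to compare against _count.
  console.log(`ok=${stats.ok} failed=${stats.failed}`);
});
for (let i = 1; i <= TOTAL; i++) {
  stubClient.create({ index: 'components', type: 'component', id: String(i), body: {} },
    tracker.callback);
}
// → ok=16 failed=4
```

With the real client the same tracker would reveal whether the missing ~42,500 documents produced callback errors that the `if (err !== undefined ...)` check never logged, or whether they were acknowledged and lost elsewhere.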
Re: too many open files problems and suggestions on cluster configuration
Add more nodes or reduce the number of shards per node. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 17:05, Ann Yablunovskaya lad.sh...@gmail.com wrote: I am looking for suggestions on cluster configuration. I have 2 nodes (master/data and data), 544 indices, about 800 million documents. If I try to insert more documents and create more indices, I will hit the error "too many open files". My node's configuration: CentOS 7 Intel(R) Xeon(R) CPU x16 RAM 62 Gb # ulimit -n 10 In the future I will have a lot of indices (about 2000) and a lot of documents (~5 billion or maybe more). How can I avoid the error "too many open files"?
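A back-of-envelope estimate shows why "more nodes or fewer shards per node" is the fix. The constants below are assumptions for illustration only: the 1.x default of 5 shards plus 1 replica per index, and a rough ~30 open files per shard (in practice this varies widely with segment counts and merge settings).

```javascript
// Rough per-node file-descriptor estimate for the cluster described
// above. All constants are illustrative assumptions, not measurements.
const indices = 544;
const shardsPerIndex = 5;   // 1.x default
const copies = 2;           // primary + 1 replica
const nodes = 2;
const filesPerShard = 30;   // rough guess; varies with segments

const shardsPerNode = (indices * shardsPerIndex * copies) / nodes;
const fdPerNode = shardsPerNode * filesPerShard;
console.log(shardsPerNode, fdPerNode); // 2720 shards → ~81600 descriptors
```

Even with these conservative guesses, ~2,720 shards per node implies tens of thousands of open files, well above common ulimit settings, so beyond raising the OS open-file limit, the shard count per node has to come down (fewer shards per index, fewer indices, or more nodes) before growing to ~2,000 indices.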
Re: Max documents 10,500?
I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: [...]
Re: Max documents 10,500?
No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co http://elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: [...]
Re: Max documents 10,500?
Any chance you are using the same id multiple times? -- David Pilato - Developer | Evangelist elastic.co @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blake1...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: [...]
Re: How to take a snapshots of a specific index with the php library?
Figured it out. The key was to use $params['body'] instead of $params['custom']. Which is odd since that's what I tried in the first place. I guess I must have had some other bug in the way at the time that got fixed as I was testing things... On Thursday, April 30, 2015 at 5:10:58 PM UTC-7, David Reagan wrote: When I try to list a specific index to take a snapshot of, the library seems to ignore it, and instead takes a snapshot of my entire cluster. It's the end of the day, and I'm likely missing something obvious, so any help would be appreciated. Ultimately, I'm trying to do the equivalent of: curl -XPUT localhost:9200/_snapshot/my_backup/snapshot_1 -d '{ "indices": "logstash-2014.09.25", "ignore_unavailable": true, "include_global_state": false }' Here is my function: function take( $name, $indices = array(), $settings = array( 'wait_for_completion' => true, 'ignore_unavailable' => true, 'include_global_state' => false ) ) { $params['repository'] = $this->clusterConfig['repository']; $params['snapshot'] = $name; $params['custom'] = array( 'ignore_unavailable' => $settings['ignore_unavailable'], 'include_global_state' => $settings['include_global_state'] ); $indicesString = ''; if (!empty($indices)) { foreach ($indices as $value) { $indicesString .= $value . ','; } //remove ending comma $indicesString = rtrim($indicesString, ','); $params['custom']['indices'] = $indicesString; } $result = $this->es->snapshot()->create($params); if (!$result['acknowledged']) { $this->logger->addError("Failed to take snapshot $name in cluster $this->clusterName."); } } Called like: take("testingstuff-logstash-2014.09.25", array("logstash-2014.09.25")); $this->es is an instance of Elasticsearch\Client.
Re: SHIELD terms lookup filter : AuthorizationException BUG
I'm having the same problem with Elasticsearch 1.4.5 with Shield 1.1. On Thursday, April 23, 2015 at 2:03:23 PM UTC-5, Jay Modi wrote: Hi Bert, I don't know of a workaround to accomplish this in a single query right now. We have been discussing how to fix this issue in depth over the past few days and have ideas on how to move forward, but no timeline on it being resolved. Regarding support contracts and fixes, I'm going to defer that question to the person your company is in contact with. They'll be able to answer that much better than I can. On Wednesday, April 22, 2015 at 9:15:21 AM UTC-4, Bert Vermeiren wrote: Hi Jay, Thanks for acknowledging! Is there any way to work around this issue? We definitely need a kind of join filter for limiting the returned data based on some permissions/tokens. We are also starting discussions for a support and re-distribution license with both your and our marketing organisation. Is there any way to get a fix within a support contract? Thanks, Regards, Bert. On Wednesday, 22 April 2015 at 14:34:07 UTC+2, Jay Modi wrote: Hi Bert, Thank you for the detailed report and reproduction of this issue. This is a known limitation with Shield and certain operations in elasticsearch. We're working to resolve this in a future release. We will be documenting this limitation and all of the operations affected shortly; this was something that we had forgotten to document. -Jay On Monday, April 20, 2015 at 10:46:40 AM UTC-4, Bert Vermeiren wrote: Hi, Using: * ElasticSearch 1.5.1 * SHIELD 1.2 Whenever I use a terms lookup filter in a search query, I get an AuthorizationException for the [__es_system_user] user although the actual user has even 'admin' role privileges. This seems a bug to me, where the terms filter does not have the correct security context. This is very easy to reproduce, see gist: https://gist.github.com/bertvermeiren/c29e0d9ee54bb5b0b73a Scenario: # Add user 'admin' with default 'admin' role. 
./bin/shield/esusers useradd admin -p admin1 -r admin # create index. curl -XPUT 'admin:admin1@localhost:9200/customer' # create a document on the index curl -XPUT 'admin:admin1@localhost:9200/customer/external/1' -d ' { "name": "John Doe", "token": "token1" }' # create additional index for the terms lookup filter functionality curl -XPUT 'admin:admin1@localhost:9200/tokens' # create document in 'tokens' index curl -XPUT 'admin:admin1@localhost:9200/tokens/tokens/1' -d ' { "group": 1, "tokens": ["token1", "token2"] }' # search with a terms lookup filter on the customer index, referring to the 'tokens' index. curl -XGET 'admin:admin1@localhost:9200/customer/external/_search' -d ' { "query": { "filtered": { "query": { "match_all": {} }, "filter": { "terms": { "token": { "index": "tokens", "type": "tokens", "id": "1", "path": "tokens" } } } } } }' => org.elasticsearch.shield.authz.AuthorizationException: action [indices:data/read/get] is unauthorized for user [__es_system_user] -- CONFIDENTIAL COMMUNICATION: This email may contain confidential or legally privileged material, and is for the sole use of the intended recipient. Use or distribution by an unintended recipient is prohibited, and may be a violation of law. If you believe that you received this email in error, please do not read, forward, print or copy this email or any attachments. Please delete the email and all attachments, and inform the sender that you have deleted the email and all attachments. Thank you.
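The thread offers no single-query workaround, but one client-side fallback (a suggestion of mine, not from the thread) is to fetch the tokens document yourself under the authenticated user's credentials and then issue an ordinary terms filter, so no internal [__es_system_user] lookup is needed. A sketch with a hypothetical `buildTermsQuery` helper; the DSL mirrors the 1.x filtered-query syntax used in the reproduction above.

```javascript
// Build a plain terms filter from tokens already fetched client-side
// (e.g. via GET /tokens/tokens/1 as the 'admin' user). This replaces
// the terms *lookup* form that triggers the unauthorized internal get.
// `buildTermsQuery` is a hypothetical helper, not part of any ES API.
function buildTermsQuery(field, tokens) {
  return {
    query: {
      filtered: {
        query: { match_all: {} },
        filter: { terms: { [field]: tokens } },
      },
    },
  };
}

// Suppose the tokens document returned { "tokens": ["token1", "token2"] }:
const body = buildTermsQuery('token', ['token1', 'token2']);
console.log(JSON.stringify(body));
```

The trade-off is an extra round trip and a query body that grows with the token list, but both requests run under the real user's Shield privileges.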
Re: Max documents 10,500?
Could you add a counter in your JS app to make sure you sent all docs? I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blake1...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: [...]
Re: Evaluating Moving to Discourse - Feedback Wanted
On Fri, May 1, 2015 at 7:26 PM, David Reagan jer...@gmail.com wrote: Moving away from mailing lists for anything except announcements would be awesome. Forums are a much better way to have solid discussions with multiple people involved. Email is fine, but when you add in more than a couple of people, it gets confusing fast. Forums are also far more user friendly for people who haven't learned the various ways developers communicate. Thanks for your feedback! That said, if the forum idea is scrapped, please be sure to stick with Google Groups or something similar. Don't switch to something like the Debian user lists. Every time a search result pops up from a list like that when I am looking for help, I can never figure out if I've seen all the emails in the thread or not. The interface is just horrid. Google Groups at least has conversation view. Mailman is a little painful, but with the latest release it's become a bit friendlier. On a similar subject, is there any chance we could get a real-time chat app that is more user friendly than IRC? Does something exist that could sit on top of IRC and alleviate IRC's user unfriendliness? Why do you find IRC unfriendly? Have you tried using a web based client like irccloud.com? Cheers, LH On Thursday, April 2, 2015 at 8:36:33 AM UTC-7, leslie.hawthorn wrote: Hello everyone, As we’ve begun to scale up development on three different open source projects, we’ve found Google Groups to be a difficult solution for dealing with all of our needs for community support. We’ve got multiple mailing lists going, which can be confusing for new folks trying to figure out where to go to ask a question. We’ve also found our lists are becoming noisy in the “good problem to have” kind of way. As we’ve seen more user adoption, and across such a wide variety of use cases, we’re getting widely different types of questions asked. For example, I can imagine that folks not using our Python client would rather not be distracted with emails about it. 
There are also a few other strikes against Groups as a tool, such as the fact that it is no longer a supported product by Google, it provides no API hooks and it is not available for users in China. We’ve evaluated several options and we’re currently considering shuttering the elasticsearch-user and logstash-users Google Groups in favor of a Discourse forum. You can read more about Discourse at http://www.discourse.org We feel Discourse will allow us to provide a better experience for all of our users for a few reasons: * More fine-grained conversation topics = less noise and better targeted discussions. e.g. we can offer a forum for each language client, individual logstash plugin or for each city to plan user group meetings, etc. * Facilitates discussions that are not generally happening on list now, such as best practices by use case or tips on moving from development to production * Easier for folks who are purely end users - and less used to getting peer support on a mailing list - to get help when they need it Obviously, Discourse does not function the exact same way as a mailing list - however, email interaction with Discourse is supported and will continue to allow you to participate in discussions over email (though there are some small issues related to in-line replies. [0]) We’re working with the Discourse team now as part of evaluating this transition, and we know they’re working to resolve this particular issue. We’re also still determining how Discourse will handle our needs for both user and list archive migration, and we’ll know the precise details of how that would work soon. (We’ll share when we have them.) The final goal would be to move Google Groups to read-only archives, and cut over to Discourse completely for community support discussions. We’re looking at making the cut over in ~30 days from today, but obviously that’s subject to the feedback we receive from all of you. 
We’re sharing this information to set expectations about time frame for making the switch. It’s not set in stone. Our highest priority is to ensure effective migration of our list archives and subscribers, which may mean a longer time horizon for deploying Discourse, as well. In the meantime, though, we wanted to communicate early and often and get your feedback. Would this change make your life better? Worse? Meh? Please share your thoughts with us so we can evaluate your feedback. We don’t take this switch lightly, and we want to understand how it will impact your overall workflow and experience. We’ll make regular updates to the list responding to incoming feedback and be completely transparent about how our thought processes evolve based on it. Thanks in advance! [0] - https://meta.discourse.org/t/migrating-from-google-groups/24695 Cheers, LH Leslie Hawthorn Director of Developer Relations http://elastic.co Other Places to Find Me: Freenode: lh
Re: Shield and Proxy Users
Thanks Michael. Are you interested in Shield performing the authorization with AD/LDAP for a given proxy user (assumed as being authenticated by your application), or would/can your application also pass the authorization information so that Shield restricts access accordingly? On Wednesday, April 29, 2015 at 10:34:54 PM UTC-4, Michael Young wrote: If you would like to get more specific use case details, I'm more than willing to exchange emails or engage in phone calls. Michael On Wednesday, April 29, 2015 at 10:34:25 PM UTC-4, Michael Young wrote: I thought that might be the case. The problem with Shield for my use case is that authentication and authorization are closely tied together. Generally speaking, we want to limit access to indexes via LDAP/AD groups which are assigned to Shield roles. We want to be able to use a system/daemon account to query Elasticsearch, but pass in a proxy or impersonation user which can be used to look up their effective groups and which indexes they can get results from. Without the proxy user ability, we are forced to log the user in via their username and password. The problem is that users will not directly access Elasticsearch and we don't have access to their password. Our users will be authenticated via a separate application/user interface which will be using single sign-on tokens. The application doesn't have access to the user's password to pass to Elasticsearch. So there isn't an easy way to say I have user1234 running a query and I need you to filter index results appropriately for this authenticated user. We want to manage index permissions using LDAP/AD groups and roles using Shield. We don't want to have to do that in the application. The current workaround seems to be some sort of API overlay to Elasticsearch which will first check to see if the user exists using an admin account. 
If the user account doesn't exist (first time logging in), then create the user account using a hash of the user's group permissions from LDAP/AD. It's not ideal, but it'll probably get the job done until Shield is extended/enhanced. On Wednesday, April 29, 2015 at 5:03:51 PM UTC-4, Jay Modi wrote: Hi Michael, We don't currently have a way to do this with Shield. Can you tell us a little more about your scenario? Your users are logging into your application and then accessing data in Elasticsearch, which is protected by Shield? This type of information is helpful for us as we plan features for future releases of Shield. -Jay On Wednesday, April 29, 2015 at 3:06:57 PM UTC-4, Michael Young wrote: I have Elasticsearch 1.5.2 and Shield 1.2.0 configured and working against Active Directory. This seems to work pretty well. However, I was wondering if there was a way to pass in a proxy user from an application to get the appropriate index filtering via access controls without having to pass in the username AND password from the application. Is there a way to do this with Shield? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c7cb2cd6-3ce0-4bd4-9e21-b67fc05b2b46%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
abnormal file input behavior?
I have a file of logging records I am using to debug some filter parses. I am using the file input and have set start_position to beginning. So I start up Logstash, see what I get, then kill it, make fixes, and go again. I have seen that sometimes it reads the file and sometimes not. I ran some experiments and found that if I delete the file, rewrite it, and then start up Logstash, it reads the file. If I have previously read the file, then when Logstash starts it doesn't read the file, despite being told to start at the beginning. Am I missing something here? Is this intended behavior or a possible bug?
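This is the documented behavior of the file input rather than a bug: Logstash records how far it has read each file in a "sincedb" file, and start_position => "beginning" is only honored for files it has never seen before. A minimal sketch of a debugging config that discards the recorded offset on every restart (the path here is hypothetical):

```
input {
  file {
    path => "/tmp/records.log"        # hypothetical test file
    start_position => "beginning"     # only applies to files with no sincedb entry
    sincedb_path => "/dev/null"       # forget read offsets between runs
  }
}
```

Deleting and rewriting the file appears to "fix" it for the same reason: the new file gets a fresh sincedb entry, so it is read from the beginning again.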
Re: Evaluating Moving to Discourse - Feedback Wanted
Why do you find IRC unfriendly? Have you tried using a web based client like irccloud.com? I use webchat.freenode.net. There's a big difference between "Here's our live chat app" and "Learn how to connect to IRC in order to use live chat." I actively avoided live chat for years simply because I had no interest in learning anything about IRC. For a new user, IRC is not friendly, especially if you are not a developer or are new to it. There's also the fact that you are told off if you post any code when asking for help. I could see it if the code sample was really long, but when people get told off when they post three or four lines of code, that's not friendly at all. This has happened to me, and I've seen it happen to others. (Not sure if it was in an ELK channel or not...) Plus, figuring out how to start a private conversation, or set your status to away, or register your username so others can't impersonate you, and so on is not obvious. I still haven't taken the time to figure it out. Ideally, we'd have something that functions similarly to HipChat. Nice UI, and code snippets are automatically shortened unless you choose to show the full snippet. --David Reagan
Re: Failed to get setting group for [threadpool.] setting prefix and setting [threadpool.bulk] because of a missing '.'
It was a few months ago so I don't exactly recall the command line we used, but it must have been something like: curl -X PUT -d '{ "persistent" : { "threadpool.bulk" : { "type" : "fixed", "queue_size" : 250 } } }' localhost:9200/_cluster/settings m. On Thursday, April 30, 2015 at 11:35:55 PM UTC+2, Mark Walkom wrote: How are you setting this? On 30 April 2015 at 22:02, marc@happn.com wrote: Hello there I noticed the following warning message in my node's logs when starting: [2015-04-30 12:40:13,265][INFO ][node ] [***server***] started [2015-04-30 12:40:32,154][WARN ][node.settings] [***server***] failed to refresh settings for [org.elasticsearch.threadpool.ThreadPool$ApplySettings@1ca801ce] org.elasticsearch.common.settings.SettingsException: Failed to get setting group for [threadpool.] setting prefix and setting [threadpool.bulk] because of a missing '.' at org.elasticsearch.common.settings.ImmutableSettings.getGroups(ImmutableSettings.java:527) at org.elasticsearch.common.settings.ImmutableSettings.getGroups(ImmutableSettings.java:505) at org.elasticsearch.threadpool.ThreadPool.updateSettings(ThreadPool.java:396) at org.elasticsearch.threadpool.ThreadPool$ApplySettings.onRefreshSettings(ThreadPool.java:682) at org.elasticsearch.node.settings.NodeSettingsService.clusterChanged(NodeSettingsService.java:84) at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:428) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Here are the current cluster settings: curl http://localhost:9200/_cluster/settings?pretty { "persistent" : { "indices" : { "store" : { "throttle" : { "max_bytes_per_sec" : "40mb" } } }, "threadpool" : { "bulk" : "250" }, "cluster" : { "routing" : { "allocation" : { "enable" : "all", "balance" : { "shard" : "0.9f" }, "disable_allocation" : "false" } } } }, "transient" : { "cluster" : { "routing" : { "allocation" : { "enable" : "all" } } } } } Is there anything wrong with my settings? We tried to change the `threadpool.bulk` setting months ago, but it doesn't seem to have been taken into account (the bulk queue is still capped at the default value). We currently use version 1.2.3, and are in the process of upgrading to 1.5.x by performing rolling upgrades. Cheers, m.
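The warning matches the cluster state shown above: `threadpool.bulk` was stored as the flat value "250" rather than as a settings group, so the thread pool's settings listener fails when it tries to expand the `threadpool.` prefix into groups (hence "missing '.'"). A hedged sketch of an update that stores the value under a fully qualified key instead (key names per the 1.x dynamic thread pool settings; verify against your version before applying, and note that clearing the existing bad flat value may need extra steps on 1.x):

```
curl -X PUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "threadpool.bulk.queue_size" : 250
  }
}'
```

Checking the response of the original PUT would also have caught this: an unquoted JSON body is rejected or mangled before the setting ever takes effect.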
Re: ES upgrade
Are you running the same ES and Java versions on all your nodes and clients? On 1 May 2015 at 21:21, phani.nadimi...@goktree.com wrote: Hi All, I upgraded Elasticsearch from 1.4.2 to 1.5.2 and I am getting the following warning on the console after the upgrade. Please explain why this error is occurring: [2015-05-01 06:15:13,361][WARN ][transport.netty ] [ES_Node1] exception caught on transport layer [[id: 0xc483723b, /ip of node:3845 => /ip of node:9300]], closing connection java.io.StreamCorruptedException: invalid internal transport message format, got (47,45,54,20) at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) Thanks phani
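The four bytes in the exception message are hex values that decode to the ASCII string "GET ", which suggests an HTTP request (for example from a REST client, monitoring probe, or load balancer health check) is hitting the binary transport port 9300 instead of the HTTP port 9200. A quick sketch of the decoding (plain Node.js, no Elasticsearch required):

```javascript
// Bytes reported by StreamCorruptedException: "got (47,45,54,20)" (hex values).
const bytes = [0x47, 0x45, 0x54, 0x20];

// Decode them as ASCII characters.
const decoded = String.fromCharCode(...bytes);
console.log(decoded); // "GET " -- the start of an HTTP request line
```

So the warning is most likely not caused by the upgrade itself; a node restart simply surfaces whichever misdirected client reconnects to port 9300 with HTTP.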
Re: Max documents 10,500?
The question of relative size has, I believe, led me to the problem. I create aliases. After the load, I have the alias point to the new index. One of the indexes had a bad document that made the change-alias mechanism fail. This means I kept loading the documents into an index, but the alias was always pointing to an old index. So, on the working system the database size was relatively small, since I got rid of the old indexes. The bad machine was taking up a lot of space because the old indexes were never deleted, because the alias didn't point to them. Thanks a lot for the help!! Blake On Friday, May 1, 2015 at 3:39:05 PM UTC-5, David Pilato wrote: Could you compare disk size (/data dir) for your two elasticsearch instances? Also, could you GIST the result of a simple _search?pretty on both nodes? -- *David Pilato* - Developer | Evangelist *elastic.co http://elastic.co* @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:58, Blake McBride blak...@gmail.com wrote: No, for two reasons: 1. I am using the exact same code and data on both machines. 2. I've seen duplicates in the past and I get an error message. Thanks. Blake On Friday, May 1, 2015 at 2:50:57 PM UTC-5, David Pilato wrote: Any chance you are using the same id multiple times? 
-- *David Pilato* - Developer | Evangelist *elastic.co http://elastic.co/* @dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs On 1 May 2015 at 21:25, Blake McBride blak...@gmail.com wrote: I changed the code to read: var counter = 0; exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; if (counter++ % 1000 === 0) { console.log('Adding document #' + counter); } client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; The last printout reads: Adding document #53001 The code that does the error check looks like: esUtils.addDoc(esIndex, 'component', compObj.id, doc, function (err, response, status) { if (err !== undefined || status !== 201 || response.created !== true) { console.log('Unexpected ES response: ' + status + ' ' + err + response.created); } }); I never see that message. Finally, after the above I get: $ curl -s -XPOST 'http://localhost:9200/components/_count' {"count":10500,"_shards":{"total":5,"successful":5,"failed":0}} Thanks for the help! On Friday, May 1, 2015 at 1:57:42 PM UTC-5, David Pilato wrote: Could you add a counter in your JS app to make sure you sent all docs? 
I suspect something wrong in your index process -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 1 May 2015 at 20:40, Blake McBride blak...@gmail.com wrote: The log only contains: [2015-05-01 18:22:10,398][INFO ][cluster.metadata ] [mmsapp-na-component] [components-1430504530354] creating index, cause [api], templates [], shards [5]/[1], mappings [index_name, component] Each document is being added individually from JavaScript via: exports.addDoc = function (index, type, id, doc, callback) { if (client !== undefined) { var json = { index: index, type: type, id: id, body: doc }; client.create(json, callback); } else if (callback !== undefined) { callback('elastic search not connected', undefined); } }; On Friday, May 1, 2015 at 11:42:12 AM UTC-5, David Pilato wrote: If you have nothing in logs it could mean that you have an issue with your injector. Maybe you are using bulk but you don't check the bulk response? David On 1 May 2015 at 18:36, Blake McBride blak...@gmail.com wrote: Greetings, I have two similar but unrelated machines. I am adding 50,000+ documents to each. Afterwards, one shows the 50,000+ documents and the other only shows 10,500. The second machine seems to be capping out at 10,500. Why, and how can I correct this? The relevant facts are as follows: 1. Both machines are current 64 bit Linux machines with at least 8GB of RAM and more than sufficient disk space. 2. Both have current 64 bit Java 7 and elasticsearch 1.5.1. ES is running local to each machine. 3. Both machines are running the exact same program to load up ES. Each has nearly default ES config files (just different names). 4. The program keeps a counter of the number of times documents are added to ES, and the return codes of each add is checked. Both are 50,000+. 5. When
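The failure mode described at the top of the thread (an alias left pointing at an old index after a bad document broke the swap) is easier to detect when the swap is done as a single atomic `_aliases` call and its response is checked. A sketch, with hypothetical index names:

```
curl -XPOST 'http://localhost:9200/_aliases' -d '{
  "actions" : [
    { "remove" : { "index" : "components-OLD", "alias" : "components" } },
    { "add"    : { "index" : "components-NEW", "alias" : "components" } }
  ]
}'
```

Because the actions execute atomically, a failure leaves the alias unchanged, so surfacing the error (for example, checking for "acknowledged":true in the response) catches exactly the case where the swap never happened and counts are taken against a stale index.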
ES upgrade
Hi All, I upgraded Elasticsearch from 1.4.2 to 1.5.2 and I am getting the following warning on the console after the upgrade. Please explain why this error is occurring: [2015-05-01 06:15:13,361][WARN ][transport.netty ] [ES_Node1] exception caught on transport layer [[id: 0xc483723b, /ip of node:3845 => /ip of node:9300]], closing connection java.io.StreamCorruptedException: invalid internal transport message format, got (47,45,54,20) at org.elasticsearch.transport.netty.SizeHeaderFrameDecoder.decode(SizeHeaderFrameDecoder.java:47) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) at 
org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) Thanks phani
Re: How to Boost
I got it: To add an array of vars to an anonymous type, add them like this: var field1 = new { field = "Name^20" }; var field2 = new { field = "_all" }; var field3 = new { field = "Id^20" }; var myfields = new [] { field1, field2 }; This is slick; now it puts the brackets in and there is no escaping :-) On Thursday, April 30, 2015 at 9:01:27 PM UTC-4, GWired wrote: I was able to get this to work in ES using head and got back what I needed. However, translating to ElasticSearch.net has been an issue. I have an anonymous type doing a match all query and this works fine and dandy. I am trying to do the multi_match as above but it isn't returning any results. I'm not sure why. var myfields = ["Name&#94;20","Id&#94;20","_all"]; - this gets no results var myfields = ["Name^20","Id^20","_all"]; - this crashes with all kinds of errors How do you do these bracketed values? var search = new { query = new { multi_match = new { fields = myfields, query = keyword, type = "best_fields", use_dis_max = true } }, from = 0, size = limitAllTypes, aggs = new { top_types = new { terms = new { field = "_type" }, aggs = new { top_type_hits = new { top_hits = new { size = limitPerType } } } } } }; On Wednesday, April 29, 2015 at 12:06:47 PM UTC-4, Joel Potischman wrote: You could use a boosting query http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-boosting-query.html, or you could use a multi-match query http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html and add a boost directly to the name and id fields. Something like: { "multi_match": { "fields": [ "name^2", "id^2", "_all" ], "query": "keyword", "type": "best_fields" } } That tells Elasticsearch to search all fields, but to additionally search name and id and double their score if they match (that's the ^2 part). The best_fields type will use the best score from the name, id, and _all searches. You might want most_fields to rank records that contain your query terms in name *and* id *and* other fields even higher. 
Note that my snippet is using the Query DSL. The syntax for the client you're using will probably be slightly different, but that's the general idea. Also note that I've been using Elasticsearch for less than a year, so there may be better approaches, but that's where I'd start. -joel On Tuesday, April 28, 2015 at 4:42:56 PM UTC-4, GWired wrote: I am attempting to boost values for queries. I'm searching across all fields and tables and returning 25 results for each type. This is working fine; however, I need to boost if the field named Name or the field named ID has the value in it. I'm using ElasticSearchClient and sending this search: search = new { query = new { query_string = new { query = keyword, default_field = "_all" } }, from = 0, size = limitAllTypes, aggs = new { top_types = new { terms = new { field = "_type" }, aggs = new { top_type_hits = new { top_hits = new { size = limitPerType } } } } } }; ElasticsearchResponse<DynamicDictionary> searchResponse = client.Search(jdbc, search, null); How do I tell this to boost the Name and Id fields over all other fields? If I'm searching for My Searched Company and that is in the Name field, I want it at the top of the list, vs. in notes, addresses or whatever other columns
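For reference, the query this thread converges on can be expressed directly in the Query DSL. Field names and boost factors are taken from the discussion; the search term is a placeholder:

```
{
  "query": {
    "multi_match": {
      "fields": ["Name^20", "Id^20", "_all"],
      "query": "My Searched Company",
      "type": "best_fields"
    }
  }
}
```

The `^20` suffix multiplies the score contribution of a matching field by 20, so a hit in Name or Id outranks the same term found only via _all; with best_fields, the single highest-scoring field match determines the document's score.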