Re: ScrollId doesn't advance with 2 indexes on a read alias 1.4.4

2015-04-14 Thread Todd Nine
at calling the same scroll_id won't return the next results? > > AFAIK, the scroll_id can be the same and still return new records > > 2015-04-14 14:26 GMT-03:00 Todd Nine : > >> Hey guys, >> I have 2 indexes. I have a read alias on both of the indexes (A and

ScrollId doesn't advance with 2 indexes on a read alias 1.4.4

2015-04-14 Thread Todd Nine
Hey guys, I have 2 indexes. I have a read alias on both of the indexes (A and B), and a write alias on 1 (B). I then insert 10 documents to the write alias which inserts them into index B. I perform the following query. { "from" : 0, "size" : 1, "post_filter" : { "bool" : {

Creating a no-op QueryBuilder and FilterBuilder

2015-04-06 Thread Todd Nine
Hey all, I have a bit of an odd question, hopefully someone can give me an answer. In Usergrid, we have our own existing query language. We parse this language into an AST, then visit each of the nodes and construct an ES query with the Java client. So far, very straightforward. However,

Re: Set index.query.parse.allow_unmapped_fields = false for a single index

2015-04-02 Thread Todd Nine
Figured this out (derp). So my next question, is how can I disable dynamic mappings so that they don't occur? Thanks, Todd On Thursday, April 2, 2015 at 3:55:07 PM UTC-6, Todd Nine wrote: > > Hey guys, > We're running ES as a core service, so I can't guarantee that m

Set index.query.parse.allow_unmapped_fields = false for a single index

2015-04-02 Thread Todd Nine
Hey guys, We're running ES as a core service, so I can't guarantee that my application will be the only one using the cluster. Is it possible to disallow unmapped fields explicitly for an index without modifying the global setting? Thanks, Todd -- You received this message because you are

Using scroll and different results sizes

2015-03-24 Thread Todd Nine
Hey all, We use ES as our indexing subsystem. Our canonical record store is Cassandra. Due to the denormalization we perform for faster query speed, occasionally the state of documents in ES can lag behind the state of our Cassandra instance. To accommodate this eventually consistent system,

Very slow index speeds with dynamic mapping and large volume of documents with new fields

2015-03-11 Thread Todd Nine
Hey all, We're bumping up against a production problem I could use a hand with. We're experiencing steadily decreasing index speeds. We have 12 c3.4xl data nodes, and 1 c3.8xl master node (with 2 backups that are smaller). We're indexing 45 million documents into a single index. Single sha

Re: Unexpected shard allocation with 0 replica indexes

2015-02-23 Thread Todd Nine
h/reference/current/modules-cluster.html#allocation-awareness > > On 23 February 2015 at 06:46, Todd Nine > > wrote: > >> Hi All, >> We have several indexes in our ES cluster. ES is not our canonical >> system record, we use it only for searching. >> >>

Unexpected shard allocation with 0 replica indexes

2015-02-22 Thread Todd Nine
Hi All, We have several indexes in our ES cluster. ES is not our canonical system record; we use it only for searching. Some of our applications have very high write throughput, so for these we allocate a single primary shard for each of our nodes. For example, we have 6 nodes, and we cr

Re: One shard continually fails to allocate

2015-02-18 Thread Todd Nine
Hey Aaron, What do you get back if you try to use these sets of commands to manually allocate the shard to a node? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-reroute.html I had this problem before, but it turned out we had 1 node that had accidentally be upg

Uneven primary shard distribution with increasing index allocation

2015-02-17 Thread Todd Nine
Hey All, We're on ES 1.4.3. Initially, I thought I had this issue, however it appears I was incorrect. https://github.com/elasticsearch/elasticsearch/issues/9023 We're seeing very uneven primary shard distribution as we're adding new indexes to our system. We're running a 6 node cluster. W

Topology of tribe nodes in a distributed replicated environment and automatic index + alias creation.

2015-02-11 Thread Todd Nine
Hey guys, We have a slightly different use case than I'm able to find examples for with the tribe node. Any feedback would be appreciated. What we have now: Single region: We create indexes in our code automatically. They're based on timeuuids, so we never have to worry about them conflict

Re: Help creating a near real time streaming plugin to perform replication between clusters

2015-01-23 Thread Todd Nine
the source cluster > to a node in the target cluster, and place a restore action on a queue on > the target cluster master, plus a rollback logic if shard transaction > fails. So in short, the ES cluster to cluster replication process could be > realized by a "primary

Re: Help creating a near real time streaming plugin to perform replication between clusters

2015-01-23 Thread Todd Nine
current/streams.html > > I'm in favor of Ratpack since it comes with Java 8, Groovy, Google Guava, > and Netty, which has a resemblance to ES. > > In ES, for inter cluster communication, there is not much coded afaik, > except snapshot/restore. Maybe snapshot/restore c

Help creating a near real time streaming plugin to perform replication between clusters

2015-01-15 Thread Todd Nine
Hey all, I would like to create a plugin, and I need a hand. Below are the requirements I have. - Our documents are immutable. They are only ever created or deleted, updates do not apply. - We want mirrors of our ES cluster in multiple AWS regions. This way if the WAN between

Re: Multi region replication help

2015-01-14 Thread Todd Nine
nks, Todd On Wed, Jan 14, 2015 at 3:03 PM, Mark Walkom wrote: > You could use snapshot and restore, or even Logstash. > > On 15 January 2015 at 10:07, Todd Nine wrote: > >> Hi all, >> We have a deployment scenario I can't seem to find any examples of, and

Multi region replication help

2015-01-14 Thread Todd Nine
Hi all, We have a deployment scenario I can't seem to find any examples of, and any help would be greatly appreciated. We're running ElasticSearch in 3 AWS regions. We want these regions to survive a failure from other regions, and we want all writes and reads from our clients to occur in th

Internal implementation details when using geo hash

2014-11-14 Thread Todd Nine
Hey All, I have a question about the internal implementation of geo hashes and distance filters. Here is my current understanding; I'm struggling to figure out how to apply these to our queries internally in ES. Using bool queries is very efficient. Internally they perform bitmap union, i

Refresh on 1.4.0 not working as expected

2014-11-14 Thread Todd Nine
Hey guys, We're testing ES 1.4.0 from 1.3.2. I'm noticing some strange behavior in our clients in our integration tests. They perform the following logic: Create the first index in the cluster (single node) with a custom __default__ dynamic mapping. Add 3 documents, each of a new type

Re: NPE from server when using query + geo filter + sort

2014-11-11 Thread Todd Nine
w the NPE you > posted. > > Jörg > > On Tue, Nov 11, 2014 at 6:11 PM, Todd Nine > > wrote: > >> Hi all, >> I'm getting some strange behavior from the ES server when using a term >> query + a geo distance filter + a sort. I've tried this wit

Re: NPE from server when using query + geo filter + sort

2014-11-11 Thread Todd Nine
} }, { "nu_created" : { "order" : "asc", "ignore_unmapped" : true } }, { "bu_created" : {

NPE from server when using query + geo filter + sort

2014-11-11 Thread Todd Nine
Hi all, I'm getting some strange behavior from the ES server when using a term query + a geo distance filter + a sort. I've tried this with 1.3.2, 1.3.5, as well as 1.4.0. All exhibit this same behavior. I'm using the Java transport client. Here is my SearchRequestBuilder payload in toSt

Help with a larger cluster in EC2 and Node clients failing to join

2014-10-30 Thread Todd Nine
Hey guys, We're running some load tests, and we're finding some of our clients are having issues joining the cluster. We have the following setup running in EC2. 24 c3.4xlarge ES instances. These instances are our data storage and search instances. 40 c3.xlarge Tomcat instances. These are

Help with profiling our code's usage of the Node java client using YourKit

2014-10-08 Thread Todd Nine
Hi All, I've been using ES for a while and I'm really enjoying it, but we have a few slow calls in our code. I have a few hunches around the code that's using the client inefficiently, but I would like some definitive proof. I've attempted to profile our application using YourKit when we're

Re: Help with designing our document for graphs. Indexing single nodes in graph with thousands of incoming edges

2014-10-07 Thread Todd Nine
, if you want to add millions of edges to an > ES doc one by one, this will not be efficient. > > So I would like to suggest to avoid the overhead of updating fields by > script in preference to add / remove relations by their "relation id", i.e. > to treat relations as first

Re: Help with designing our document for graphs. Indexing single nodes in graph with thousands of incoming edges

2014-10-06 Thread Todd Nine
e statement "Bob likes restaurant Duo" > > and then you can run ES queries on the field "likes" or better > "user.likes" for finding the users that like a restaurant etc. Referencing > the "id" it is possible to lookup another document in another index ab

Re: Help with designing our document for graphs. Indexing single nodes in graph with thousands of incoming edges

2014-10-03 Thread Todd Nine
So clearly I need to RTFM. I missed this in the documentation the first time. http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping.html#_how_types_are_implemented Will filters at this scale be fast enough? On Friday, October 3, 2014 11:48:40 AM UTC-6, Todd Nine wrote

Help with designing our document for graphs. Indexing single nodes in graph with thousands of incoming edges

2014-10-03 Thread Todd Nine
Hey guys, We're currently storing entities and edges in Cassandra. The entities are JSON, and edges are directed edges with a source---type-->target. We're using ElasticSearch for indexing and I could really use a hand with design. What we're doing currently, is we take an entity, and turn

Re: Upper bounds on the number of indexes in an elastic search cluster

2014-09-26 Thread Todd Nine
of hundreds and > thousands of nodes. From my understanding, the communication of the master > and 20 nodes is not a serious problem. This becomes an issue at ~500-1000 > nodes. > > Jörg > > > > > On Sat, Sep 27, 2014 at 1:12 AM, Todd Nine wrote: >

Re: Upper bounds on the number of indexes in an elastic search cluster

2014-09-26 Thread Todd Nine
many thousands of aliases on a single (or few) > indices with just a few shards. There is no limit defined by ES, when your > configuration / hardware capacity is exceeded, you will see the node > getting sluggish. > > Jörg > > On Fri, Sep 26, 2014 at 11:23 PM, Todd Nine > wrote:

Upper bounds on the number of indexes in an elastic search cluster

2014-09-26 Thread Todd Nine
Hey guys. We’re building a multi-tenant application, where users create applications within our single server. For our current ES scheme, we're building an index per application. Are there any stress tests or documentation on the upper bounds of the number of indexes a cluster can handl

Re: Shard count and plugin questions

2014-06-10 Thread Todd Nine
as ES is sensitive to > latency. > > You may be better off using the snapshot/restore process, or another > export/import method. > > Regards, > Mark Walkom > > Infrastructure Engineer > Campaign Monitor > email: ma...@campaignmonitor.com > web: www.campaignmonitor.co

Re: Shard count and plugin questions

2014-06-10 Thread Todd Nine
few 100 GB - it is faster at index recovery or at reallocation time. > > Jörg > > > > > > > > On Thu, Jun 5, 2014 at 9:44 PM, Todd Nine > > wrote: > >> Hey Jorg, >> Thanks for the reply. We're using Cassandra heavily in production, I'

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
tly disk I/O and > memory related. > > In the end, you can take as a rule of thumb: > > - add replica to scale "read" load > - add new indices (i.e. new shards) to scale "write" load > - and add nodes to scale out the whole cluster for both read and write l

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
Do not think about rivers, they are not built for such use cases. Rivers > are designed as a "play tool" for fetching data quickly from external > sources, for demo purpose. They are discouraged for serious production use, > they are not very reliable if they run unattended. > &

Re: Shard count and plugin questions

2014-06-05 Thread Todd Nine
hardware will handle things. > > As for reading the transaction log and searching it, you might be playing > a losing game as your code to parse and search would have to be super quick > to make worth doing. > > Regards, > Mark Walkom > > Infrastructure Engineer >

Re: Shard count and plugin questions

2014-06-04 Thread Todd Nine
ur commit log (via storing it in memory until flush) that would be ideal. Thoughts? > Regards, > Mark Walkom > > Infrastructure Engineer > Campaign Monitor > email: ma...@campaignmonitor.com > web: www.campaignmonitor.com > > > On 5 June 2014 04:18, Todd Nine wr

Re: cross data center replication

2014-06-04 Thread Todd Nine
Hey all, Sorry to resurrect a dead thread. Did you ever find a solution for eventual consistency of documents across EC2 regions? Thanks, todd On Wednesday, May 1, 2013 5:50:00 AM UTC-7, Norberto Meijome wrote: > > +1 on all of the above. es-reindex already in my list of things to > invest

Shard count and plugin questions

2014-06-04 Thread Todd Nine
Hi All, We've been using elastic search as our search index for our new persistence implementation. https://usergrid.incubator.apache.org/ I have a few questions I could use a hand with. 1) Is there any good documentation on the upper limit to count of documents, or total index size, befor