I have had experience with such, but _not_ without data loss. The reality
is that some data loss has already occurred. I am not aware of any ES
solution that will allow you to retrieve what data remains, without further
data loss, and restore the index to green status. I have seen reference to
so
Unless I am mistaken, that is already the job of the scan query: reducing
the load caused by sorting a query. There would be no reduction from
limiting the result set.
As I understand it, the scan query identifies which shards have results, and
then just starts serving the first results first without
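For what it's worth, a scan request with the 1.x-era Python client looks roughly like this (a sketch; the index name and page size are hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# search_type=scan skips scoring and sorting entirely; hits come back
# in whatever order the shards serve them, and size is per shard.
resp = es.search(index="my_index", search_type="scan", scroll="5m",
                 size=100, body={"query": {"match_all": {}}})
scroll_id = resp["_scroll_id"]  # the first scan response carries no hits
page = es.scroll(scroll_id=scroll_id, scroll="5m")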
:57 AM UTC-6, Yogesh wrote:
>
> Thanks Aaron. Your post was very informative.
> Can you recommend any blogposts, articles etc. where I could read more on
> this topic?
>
> Thanks again for your help.
>
> On Tuesday, March 31, 2015 at 9:57:58 PM UTC+5:30, Aaron Mefford wrote
Can you share what you are trying to accomplish with the now()? There may
be an alternative approach. It may make sense to cron a modification to the
alias, such that the now() value is periodically updated.
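As a sketch of that cron approach (1.x-era Python client assumed; the index, alias, and field names are hypothetical), the job would recompute the cutoff and swap the alias filter:

from datetime import datetime, timedelta
from elasticsearch import Elasticsearch

es = Elasticsearch()
# Run this from cron so the cutoff in the alias filter is refreshed
# periodically, instead of relying on now() in the stored filter.
cutoff = (datetime.utcnow() - timedelta(hours=1)).isoformat()
es.indices.update_aliases(body={"actions": [
    {"remove": {"index": "events", "alias": "recent_events"}},
    {"add": {"index": "events", "alias": "recent_events",
             "filter": {"range": {"@timestamp": {"gte": cutoff}}}}},
]})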
On Thursday, April 2, 2015 at 1:03:59 PM UTC-6, Stefanie wrote:
>
> Hi,
>
> I have found wh
Noticed this happening on a cluster this week which had reached 85%, the
full disk watermark.
On Thursday, April 2, 2015 at 3:29:18 PM UTC-6, Mark Walkom wrote:
>
> Take a look in your ES logs, it should have something of use.
>
> You can also try dropping the replicas to 0 for the indices that a
You are likely doing string matching because the data was ingested as a
string. Take a look at the following; they should clear things up for you.
http://www.elastic.co/guide/en/elasticsearch/guide/master/mapping-intro.html
http://www.elastic.co/guide/en/elasticsearch/reference/1.x/mapping.html
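For example, here is a minimal sketch of creating an index with an explicit mapping, assuming the 1.x-era Python client (index and field names are hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Declare the types up front; otherwise the first document ingested
# decides, and a quoted number simply becomes a string.
es.indices.create(index="my_index", body={
    "mappings": {
        "doc": {
            "properties": {
                "created_at": {"type": "date"},
                "price": {"type": "long"},
            }
        }
    }
})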
Is there a reason not to control the limit in your code?
On Thursday, April 2, 2015 at 6:31:07 PM UTC-6, Chen Wang wrote:
>
> I want, for example, to quickly get 1M out of 5M records.
> I am currently using:
>
> SearchResponse scrollResp = this.client
>     .prepareSearch(esQuery.indices)
>     .addField
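With the 1.x-era Python client, the same cutoff can live in the scroll loop itself, which is what I mean by controlling the limit in your code (a sketch; the index name and limit are hypothetical):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()
wanted = 1000000
docs = []
# helpers.scan wraps the scan+scroll dance; just stop iterating once
# you have as many records as you need.
for hit in helpers.scan(es, index="my_index",
                        query={"query": {"match_all": {}}}):
    docs.append(hit["_source"])
    if len(docs) >= wanted:
        break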
That's a lot of threads for 1G of memory.
On Friday, April 3, 2015 at 4:55:20 AM UTC-6, cyrilforce wrote:
>
> Hi,
>
> While load testing ES to test its performance, it reaches 100% CPU
> utilization, and the following are the hot threads:
>
> ES configuration
> --
Have you tried Countries.Name.Untouched?
On Friday, April 3, 2015 at 9:06:17 AM UTC-6, James Crone wrote:
>
> I have tried an analyzer on a specific index field by creating a multifield. It
> looks like:
> "Countries" : {
>"Properties" : {
> "Name" : {
> "
y awesome.
>
> Thanks
>
> On Tuesday, March 31, 2015 at 00:42:45 UTC+3, Aaron Mefford wrote:
>
>> I understand that if you do not have sufficient storage space, then you
>> cannot manage a replica on every node. However, you are not limited to the
>> size of a &qu
From what I can see in your graphs I noticed a few things. You seem to have
a spike in search requests at that time, a spike in HTTP traffic, and a
cache eviction right at the beginning of it.
Are you certain you don't have an external user with a cron job that runs
at the top of the hour? P
You need to read up a bit on how memory is allocated in Linux.
On an ElasticSearch or database server (this seems to be both), you want
that free column to be 0. All available free memory should be used to
cache files. In your snapshot you have 35GB of file cache listed under the
cached heading
In a single-node cluster, set the replica count to 0. The shards will clean
themselves up automatically. Don't forget to do regular backups, as you
have no redundancy.
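Concretely, with the 1.x-era Python client (a sketch):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# With a single node there is nowhere to place replica shards, so
# setting replicas to 0 clears the unassigned-shard state.
es.indices.put_settings(index="_all",
                        body={"index": {"number_of_replicas": 0}})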
On Friday, March 27, 2015 at 1:15:38 PM UTC-6, avery...@insecure-it.com
wrote:
>
> I have an elasticsearch host (single) that h
f 5 shards,
DJB2 performs ideally.
On Monday, March 30, 2015 at 10:04:08 AM UTC-6, MrBu wrote:
>
> Aaron, thanks for the reply.
>
> You can't distribute all of the documents if their size is more than a
> usual HDD. Also that was an example I gave. I am just figuring out the
"Automagic" routing happens already on hashing the document id. It sounds
like you may have a situation where your document id is creating a hot
spot. This being the case what you want is not automagic routing but more
control over the routing or a better document id. There is the ability to
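For illustration, explicit routing at index and search time looks like this with the 1.x-era Python client (all names are hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Every document indexed with the same routing value lands on the
# same shard, so pick a key with a reasonably even distribution.
es.index(index="my_index", doc_type="doc", id="42",
         body={"user": "alice", "msg": "hello"}, routing="alice")
# A search that supplies the same routing value hits only that shard.
es.search(index="my_index", routing="alice",
          body={"query": {"match": {"msg": "hello"}}})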
I think part of what you may be missing is the intent that ElasticSearch
be scaled out rather than up. There are other issues that occur when you
scale up instead of out, the first of which is that losing a single node of
your cluster can be disastrous. It is also generally far more expensive
t to understand how to make node client work.
>
>
> Tuesday, March 17, 2015, 11:26 -06:00, from Aaron Mefford >:
>
> This is what I use in my code, not sure how correct it is given the
> abysmal state of the Java API documentation.
>
> import or
even more reason that it would be little issue to specify the IP. While
multicast makes it easy to stand up a cluster in an ideal situation, my
experience has been that it leads to more problems down the road, and
things generally work better when not using multicast. I heard the same
suggestion repeatedly at Elastic{on}.
Aaron
On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>
> I have one ph
he request volume over a three year period.
While this goes against the nature of the e logging efforts, these are some
of the ideas I had while reading about your situation.
Aaron
On Monday, March 16, 2015 at 6:42:43 PM UTC-6, Mark Walkom wrote:
>
> There's not a lot you can do he
some script?
>
>
> On Friday, March 13, 2015 at 11:52:59 AM UTC-5, Aaron Mefford wrote:
>>
>> Yes it was m1.smalls that I first noticed the EBS throttling on. Things
>> work well in bursts, but sustained EBS does not work well. It will work
>> substantially better
I saw a similar thing happen yesterday, and the post then appeared today.
It was not a new thread in this case but instead a reply. I blame Google.
On Friday, March 13, 2015 at 1:31:53 PM UTC-6, Mark Walkom wrote:
>
> I have had a look through the backend and cannot see these threads at all.
>
Well.. I think I may see your issue.
I decoded this string:
L2hvbWUvYWhhcm1vbi90ZXN0L0EgUGx1cyAtIE1lZGlhIFBsYW4gU3VtbWFyeS54bHM=
It is:
/home/aharmon/test/A Plus - Media Plan Summary.xls
Another is:
/home/aharmon/test/A Plus - Summary by Venue.pdf
I think you misunderstand the purpose or how
pposed to use Tika to index the content of documents but it doesn't
> seem to be working correctly. I base64 encode the documents but it comes
> back as null when I decode it.
> On Friday, March 13, 2015 at 11:38:38 AM UTC-5, Aaron Mefford wrote:
>>
>> Not certain what you are referring to so I expect not. I have used t
I saw this kind of behavior with some older versions of Java, but I mean
old, not something I would expect to run ES. But then again you're on an old
build of ES, so I thought I would mention it.
different, if that is not the case then yes this is the easy
way. If you are ok with one of the caveats, then by all means this is the
simplest route.
Aaron
On Wednesday, March 11, 2015 at 10:47:59 AM UTC-6, mzrth_7810 wrote:
>
> Hey everyone,
>
> I have a question about rebuilding an
Yes, it was m1.smalls that I first noticed the EBS throttling on. Things
work well in bursts, but sustained EBS does not work well. It will work
substantially better on an m3.medium, and if you are using the new EBS SSD
volumes.
On Thu, Mar 12, 2015 at 10:30 PM, Andy Nemzek wrote:
> Thank you gu
bana could extract random
>> bits of data from "Hadoop".
>>
>> That said there are many people who have built data visualizations using
>> Hadoop and D3, the js lib that is used in Kibana. If you really want to go
>> without ElasticSearch (you don'
> Do you have experience with the mapper attachment?
>
> On Friday, March 13, 2015 at 11:15:18 AM UTC-5, Aaron Mefford wrote:
>>
>> You're going to have the same issue with SOLR, putting the contents into
>> XML which is even heavier than JSON.
>>
>> I wish that I
t would be a
> good solution to my problem.
>
> thanks,
> Austin
>
> On Thursday, March 12, 2015 at 4:04:29 PM UTC-5, Aaron Mefford wrote:
>>
>> Take a look at Apache Tika http://tika.apache.org/
data from "Hadoop".
That said there are many people who have built data visualizations using
Hadoop and D3, the js lib that is used in Kibana. If you really want to go
without ElasticSearch (you don't), I would recommend looking for something
like that.
Aaron
On Thursday, Marc
You should be able to set the number of replicas for all previous indexes
to 0. You cannot reduce the shard count once an index is created, or
increase it for that matter. You could, however, reindex into a new index.
http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html
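A minimal reindex sketch with the 1.x-era Python client (index names are hypothetical; the new index is created with the shard count you actually want):

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()
# Shard count is fixed at creation, so create the target with the
# desired count, then stream everything over in bulk.
es.indices.create(index="logs-v2",
                  body={"settings": {"number_of_shards": 1}})
actions = ({"_index": "logs-v2", "_type": hit["_type"],
            "_id": hit["_id"], "_source": hit["_source"]}
           for hit in helpers.scan(es, index="logs-v1"))
helpers.bulk(es, actions)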
Perhaps you are misunderstanding me. ElasticSearch does not provide a load
balancer for this purpose. You would use a typical HTTP load balancer,
which could be anything from something as simple as Nginx to something
costly like a NetScaler. Configuring such a load balancer is, I believe,
outsid
://www.found.no/foundation/sizing-elasticsearch/
Aaron
On Thursday, March 12, 2015 at 6:12:11 PM UTC-6, Jeferson Martins wrote:
>
> Hi,
>
> I have 5 nodes of ElasticSearch with 4 CPUs, 8 Mbs of RAM.
>
> My index today has 1TB of data, and my indexes grow by about 100GB per day, and
data?
>>
>> Regarding large scale production environments, this is why ElasticSearch
>> scales out.
>>
>> Aaron
>>
>> On Wednesday, March 11, 2015 at 12:50:25 PM UTC-6, Austin Harmon wrote:
>>>
>>> Hello,
>>>
>>> I'm tr
With the low volume of ingest and the long duration of history, I'd suggest
you may want to trim back the number of shards per index from the default of
5. Based on your 100 docs per day, I'd say 1 shard per day. If you combined
this with the other suggestion to increase the duration of an index, t
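A template along these lines would apply the reduced shard count to every future daily index (a sketch; the name pattern is hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# One shard is plenty at ~100 docs per day; the template is applied
# to each new daily index as it is created.
es.indices.put_template(name="small-daily", body={
    "template": "myindex-*",
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
})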
I tried to reply earlier, but it seems Google lost that reply.
My suggestion would be to create a v1_new index that has the same mappings
as v1. When you are ready to migrate to v2, change indexing to go to
v1_new, change searches to cover v1 and v1_new (alias or query string),
copy v1 to v2, chan
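The alias step of that plan might look like this with the 1.x-era Python client (a sketch; alias and index names are hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Searches go through the alias, so they transparently cover both the
# frozen v1 data and the documents newly indexed into v1_new.
es.indices.update_aliases(body={"actions": [
    {"add": {"index": "v1", "alias": "myapp_search"}},
    {"add": {"index": "v1_new", "alias": "myapp_search"}},
]})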
Yes you need to include all the text you want indexed and searchable as
part of the JSON.
How else would you expect ElasticSearch to receive the data?
Regarding large scale production environments, this is why ElasticSearch
scales out.
Aaron
On Wednesday, March 11, 2015 at 12:50:25 PM UTC-6
The latest versions of Kibana are very different from the older versions.
The old version was just a bunch of JavaScript that needed any old
webserver to host the files. The new version is a full-blown Node.js
application, and as such does not use Apache at all, but requires Node.js.
It also
Why not load balance multiple tribe nodes, if you need multiple?
On Wednesday, March 11, 2015 at 9:41:39 AM UTC-6, Abigail wrote:
>
> Hi Mark,
>
> Thank you for your reply. Is there any existing approach for kibana to
> communicate with multiple tribe nodes? Or is it something we should
> implem
Kibana is tightly coupled with features that are available in
ElasticSearch. As those features change, versions of Kibana change. For
instance, the latest version of Kibana requires that you are using 1.4.4,
unless more recent updates have changed that.
If you are running a version that predates 0.90.
Try using collectd with the UDP input plugin in Logstash with the collectd
codec.
If you're deleting it all, why not just delete the whole index and
re-create it? It will certainly be faster, and less taxing to the system.
If you need to preserve a mapping, download it first so you can re-upload
it later.
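Something like this, with the 1.x-era Python client (a sketch; the index name is hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Save the mapping first so the fresh index comes back identical.
mapping = es.indices.get_mapping(index="my_index")
es.indices.delete(index="my_index")
es.indices.create(index="my_index",
                  body={"mappings": mapping["my_index"]["mappings"]})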
--Aaron
On Wednesday, February 18, 2015 at 1:15:25 PM UTC-7,
*grumble*
That's it.
Two of the nodes are FreeBSD, the other two are Linux.
It appears the two Linux nodes 'magically' updated themselves to 1.4.3...
Thanks for the help.
-A
On Wed, Feb 18, 2015 at 9:06 AM, Todd Nine wrote:
> Hey Aaron,
> What do you get back if you try
-pending-tasks.html
>
> --
> David Pilato | Technical Advocate | Elasticsearch.com
> @dadoonet | @elasticsearchfr | @scrutmydocs
>
>
>
> On Feb 18, 2015, at 17:44, Aaron C. de Bruyn wrote:
>
> I did some playing last night, but was unable to figure it out.
> Lookin
.html#cat-shards
> ?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On Feb 18, 2015, at 06:44, Aaron C. de Bruyn wrote:
>
> All the servers have nearly 1 TB free space.
>
> -A
>
> On Tue, Feb 17, 2015 at 7:44 PM, David Pilato wrote:
>
All the servers have nearly 1 TB free space.
-A
On Tue, Feb 17, 2015 at 7:44 PM, David Pilato wrote:
> It's a replica?
> Might be because you are running low on disk space?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 18 févr. 20
I have one shard that continually fails to allocate.
There is nothing in the logs that would seem to indicate a problem on any
of the servers.
The pattern of one of the copies of shard '2' not being allocated runs
throughout all my logstash indexes.
Running 1.4.3 on all nodes.
Any pointers on
I hear what you’re saying. How are you loading your logs into Elasticsearch?
Logstash has a throttle filter which helps to prevent just this sort of
disk-space overconsumption from happening.
—Aaron
On Mon, Feb 16, 2015 at 3:09 PM, Gabriel Corrêa de Oliveira
wrote:
> Well, in my case,
icult problem to solve with a
*sharded*, distributed system.
You could write your own monitoring system, based on your own usage or the
suggestions I made above, and make use of the Curator API
(http://curator.readthedocs.org) to do the behind-the-scenes work.
Good luck,
--Aaron
On Sunda
Hi,
I imported a sample data:
{
"account_number": 0,
"balance": 16623,
"firstname": "Bradshaw",
"lastname": "Mckenzie",
"age": 29,
"gender": "F",
"address": "244 Columbus Place",
"employer": "Euron",
"email": "bradshawmcken...@euron.com",
"city": "Hobucken"
Yes, they are persistent.
—Aaron
API to upload templates and update
them that way, as needed.
--Aaron
On Wednesday, January 14, 2015 at 7:58:30 AM UTC-7, Eric Howard wrote:
>
> At
> www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html
>
> it states that "Index template
indicative of an overloaded cluster.
--Aaron
On Monday, July 14, 2014 4:07:48 PM UTC-6, Nikhil Mulley wrote:
>
> I saw these messages in the elasticsearch logs when the curator is failing
> to attempt the deletion of an
the error by selecting a time window that does not include
closed indices.
--Aaron
On Thursday, November 27, 2014 7:47:31 AM UTC-7, Bruno Cruz wrote:
>
> Hi,
>
> I hope this is the correct group for asking about this behavior, sorry in
> advance if it isn't but I would
e, and the
suffix everything after it.
--Aaron
On Wednesday, December 17, 2014 7:07:02 AM UTC-7, David Pilato wrote:
>
> You need to change prefix.
>
> --host localhost
> --port 9200
> -t (or --timeout) 30
> -T (or --time-unit) days
> -p (or --prefix) logstash-
>
>
if you don't use the HTTP protocol.
>
> Brian
>
> On Wednesday, December 10, 2014 3:53:30 PM UTC-5, Vagif Abilov wrote:
>>
>> Thank you Aaron, done. I've created an issue. But I'd like to find out if
>> there's a workaround for this problem. What'
Thank you for bringing this to our attention. Can you please create an
issue at https://github.com/logstash-plugins/logstash-codec-json ?
Thanks!
On Wednesday, December 10, 2014 7:13:16 AM UTC-8, Vagif Abilov wrote:
>
> Hello,
>
> We have been successfully using Logstash to parse our JSON logs
Thank you both, Elvar and Jörg, for your replies.
Aaron
On Thursday, December 4, 2014 7:58:07 AM UTC-5, Elvar Böðvarsson wrote:
>
> Two options
>
> 1. Have a client instance of elasticsearch on a different server or on the
> same server that does the query. That node must be set
Time-series indices can grow to 300 segments per index or more. 30 days of that
is a rather large number of segments to test, especially over TCP/IP to Amazon
S3. It tests before it can ignore.
—Aaron
On Wed, Dec 3, 2014 at 10:24 AM, Matt Hughes
wrote:
> I understand that the segments
g a master node? How to
submit a query to the cluster?
Thanks a lot in advance!
Aaron
(the “orphans” I referenced before), and that’s only useful
with the --older-than flag that is already there.
—Aaron
On Wed, Dec 3, 2014 at 8:57 AM, Matt Hughes wrote:
> Thanks for the speedy reply.
> As for 1, I understand that ES optimizes for *storage* as snapshots of the
> s
slow-down by reducing the number of segments the repositories must search
through for both hourly *and* daily snapshots.
--Aaron
On Tuesday, December 2, 2014 1:20:25 PM UTC-5, Matt Hughes wrote:
>
> As noted here --
> https://groups.google.com/forum/#!searchin/elasticsearch/snapshot$
I will include my response to the original post:
Snapshots are at the segment level. The more segments stored in the
> repository, the more segments will have to be compared to those in each
> successive snapshot. With merges taking place continually in an active
> index, you may end up with
be deleted, and
any segments still referenced will be preserved.
--Aaron
On Thursday, November 13, 2014 2:04:42 AM UTC-5, Sally Ahn wrote:
>
> Hello,
>
> I am also running into this issue. I've specified more details in this
> thread
> <https://groups.google.com/d
regex = "^" + prefix + "(" + dateregex + ")" + suffix + "$"
elif object_type == 'snapshot':
regex = "(" + "^" + snapshot_prefix + '.*' + ")"
Would you issue a pull request for this change?
--Aaron
O
I have an object mapping that uses nested objects (props in our example) in
a tag-like fashion. Each tag can belong to a client/user, and we want
to allow our users to generate query_string-style searches against
props.name.
The issue is that when we run our query, if an object has multipl
I was wondering how long logs or events stay in Elasticsearch? Is there a
way we can change the default value?
Thanks,
Aaron Lee
Hi Tanuj,
This has already come up
in https://github.com/elasticsearch/curator/issues/183
A fix is already in
master:
https://github.com/untergeek/expire-logs/commit/685016295be83f14d4cad6ab9577eb1461ad9621
On Friday, October 10, 2014 5:03:35 AM UTC-5, tanuj bansal wrote:
>
> Hi All:
>
> I h
In short, you are correct. You want that flag set on every node since
you've pushed curator to every node.
--Aaron
On Monday, September 22, 2014 4:00:41 PM UTC-5, Aaron Mildenstein wrote:
>
> You are correct in understanding that indices are deleted completely,
> regardles
the
elected master.
--Aaron
On Monday, September 22, 2014 2:43:10 PM UTC-5, Matt Hughes wrote:
>
> I'm running curator in every node in an N-node ELK cluster. Is there any
> reason I *wouldn't* want to have the --master-only flag turned on?
>
> If you delete an index from th
I've noticed on a few of my clusters that some shard replicas will be
perpetually inconsistent w/ other shards. Even when all of my writes are
successful and use write_consistency = ALL and replication = SYNC.
A GET by id will return 404/missing for one replica but return the document
for the o
Hmm. What version of python are you using? I am able to use --logfile or
-l interchangeably.
I'm glad you like Curator, and I like KELTIC :) Nice acronym.
On Tuesday, August 5, 2014 11:50:06 AM UTC-6, Brian wrote:
>
> Using the most recent release (1.2.2) of Curator, I noticed that the
> do
I don't know that ES has any intelligence to support varied node sizes, so I
would say yes, they should be the same size. I've not looked into this, so I
may be wrong.
Also, I use multiple EBS volumes in a software RAID to increase
non-provisioned IOPS. Not necessary if you use PIOPS.
Aa
Sorry this never got responded to. Unless your indices are hourly, and in
a format that curator recognizes, it will not delete anything.
What are your index names, or your naming schema?
--Aaron
On Thursday, May 15, 2014 8:49:00 AM UTC-5, Guillaume boufflers wrote:
>
> Hello buds !
ke the other procedures Curator does.
Fortunately, restoring indices is a relatively simple thing to do with the
API (modules-snapshots.html
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html#_restore>
).
--Aaron
On Wednesday, July 30, 2014 12:10
As I understand it, there are some new options in the latest builds of
ElasticSearch that replace the old S3 Gateway.
However, neither the S3 Gateway nor those others are requirements for
setting up ElasticSearch on EC2. They are only disaster recovery options
that will help you to get back u
Is there any reason that match all queries would be impacted significantly
by index size?
It seems that in the absence of any sort, query or other mechanism
requiring scoring it should just be a matter of fetching the first document
from a shard. In practice that does not seem to be the case.
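For reference, the query in question is just a match_all with nothing that requires scoring or sorting, along these lines (1.x-era Python client; the index name is hypothetical):

from elasticsearch import Elasticsearch

es = Elasticsearch()
# Nothing here requires scoring; in principle each shard could
# return its first live document immediately.
resp = es.search(index="my_index", size=1,
                 body={"query": {"match_all": {}}})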
Hi, all.
As we know, in an ElasticSearch cluster different configurations have
different effects, like:
node.master: true and node.data: false ==> it will be a
"coordinator"
node.master: false and node.data: true ==> it will be a
"workhorse", just to store
n the
> field.
>
> A field with "store: yes" means the original content of the field values
> is stored in the Lucene data structure. It does not enable search per se.
>
> Jörg
>
>
>
> On Wed, May 7, 2014 at 6:18 PM, Aaron Troy
> > wrote:
>
>
Hi all,
I'm trying to build a side-set of data for some people by querying the
'sourceName' field of an index. This field (as you can see from the
mapping) is stored and not indexed. I'm also NOT storing the source:
mappings: {
  doc: {
    _source: {
      enabled: f
en't written anything or optimized then maybe? Unfortunately I
> think your best bet is to fix the code. Look at source filtering, which
> will return something more familiar.
>
> Sorry,
>
> Nik
>
>
> On Fri, Mar 7, 2014 at 10:40 AM, Aaron Troy
> > wrote:
Hi all,
We upgraded from a previous version (0.90.7) to 1.0 recently, but need to
rollback to a 0.90 version due to our code not accounting for the field
values being returned as arrays instead of strings. Will data be affected
by the rollback, or should things be fine?
Hi all,
I have a system set up with 12 nodes, 48 indexes, and 12 shards/index (I
know that number is way too high).
Anyways, I'm in the process of indexing all the data onto a new index with
far fewer shards, but I wanted to keep the initial cluster up for existing
users searching the data. Wel
similar tools, mainly that the source index needs to have the
original documents stored in the _source field.
Feedback and criticism is welcome.
Aaron