I have had experience with such but _not_ without data loss. The reality
is that some data loss has already occurred. I am not aware of any ES
solution that will allow you to retrieve what data remains, without further
data loss, and restore the index to green status. I have seen reference to
so
Unless I am mistaken, that is already the job of the scan query, reducing
the load caused by sorting a query. There would be no reduction by
limiting the result set.
As I understand it, the scan query identifies which shards have results, and
then just starts serving the first results first without
:57 AM UTC-6, Yogesh wrote:
>
> Thanks Aaron. Your post was very informative.
> Can you recommend any blogposts, articles etc. where I could read more on
> this topic?
>
> Thanks again for your help.
>
> On Tuesday, March 31, 2015 at 9:57:58 PM UTC+5:30, Aaron Mefford wrote
Can you share what you are trying to accomplish with the now()? There may
be an alternative approach. It may make sense to cron a modification to the
alias, so that the now value is periodically updated.
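A minimal sketch of the cron idea: a small script, run on a schedule, rebuilds
the alias filter with a concrete timestamp in place of now(). The index name
"events", the alias "recent_events", the field "timestamp", and the 24-hour
window are all hypothetical. The script only builds the request body for the
_aliases endpoint; sending it is left to whatever client you use, and you
should confirm on your version that re-adding an alias replaces its filter.

```python
import json
from datetime import datetime, timedelta, timezone


def build_alias_update(index, alias, field, window, now=None):
    """Build an _aliases request body whose filter uses a fixed timestamp."""
    now = now or datetime.now(timezone.utc)
    # Compute the cutoff once, client side, instead of relying on now().
    cutoff = (now - window).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "actions": [
            {
                "add": {
                    "index": index,    # hypothetical index name
                    "alias": alias,    # hypothetical alias name
                    "filter": {"range": {field: {"gte": cutoff}}},
                }
            }
        ]
    }


if __name__ == "__main__":
    body = build_alias_update(
        "events", "recent_events", "timestamp", timedelta(hours=24)
    )
    print(json.dumps(body, indent=2))
```

Run from cron every few minutes, the filter stays close to "now" without the
alias itself needing to evaluate a date expression.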
On Thursday, April 2, 2015 at 1:03:59 PM UTC-6, Stefanie wrote:
>
> Hi,
>
> I have found wh
Noticed this happening on a cluster this week which had reached 85%, the
full disk watermark.
On Thursday, April 2, 2015 at 3:29:18 PM UTC-6, Mark Walkom wrote:
>
> Take a look in your ES logs, it should have something of use.
>
> You can also try dropping the replicas to 0 for the indices that a
You are likely doing string matching because the data was ingested as a
string. Take a look at the following; they should clear things up for you.
http://www.elastic.co/guide/en/elasticsearch/guide/master/mapping-intro.html
http://www.elastic.co/guide/en/elasticsearch/reference/1.x/mapping.html
h
Is there a reason not to control the limit in your code?
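For example, a sketch of capping the result count on the client side: keep
pulling scroll pages and stop once the limit is reached. fetch_pages below is
a hypothetical stand-in for the real scroll calls, not part of any
Elasticsearch client.

```python
from itertools import islice


def fetch_pages(total, page_size):
    """Hypothetical stand-in for a scroll: yields lists of hits, page by page."""
    for start in range(0, total, page_size):
        yield list(range(start, min(start + page_size, total)))


def take_hits(pages, limit):
    """Return an iterator over hits that stops once `limit` hits were produced."""
    hits = (hit for page in pages for hit in page)
    # islice is lazy, so no further pages are requested after the limit.
    return islice(hits, limit)


if __name__ == "__main__":
    # Pretend the query matches 5,000 documents and only 1,000 are wanted.
    wanted = list(take_hits(fetch_pages(5000, 300), 1000))
    print(len(wanted))
```

Because the iterator is lazy, pages past the limit are never fetched, which is
the same effect a server-side limit would have.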
On Thursday, April 2, 2015 at 6:31:07 PM UTC-6, Chen Wang wrote:
>
> I want to, for example, quickly get 1m out of 5m records.
> I am currently using:
>
> SearchResponse scrollResp = this.client
> .prepareSearch(esQuery.indices)
>
> .addField
That's a lot of threads for 1G of memory.
On Friday, April 3, 2015 at 4:55:20 AM UTC-6, cyrilforce wrote:
>
> Hi,
>
> While load testing ES to check its performance, it
> reaches 100% CPU utilization. The hot threads output follows:
>
> ES configuration
> --
Have you tried Countries.Name.Untouched?
On Friday, April 3, 2015 at 9:06:17 AM UTC-6, James Crone wrote:
>
> I have tried an analyzer on a specific index field by creating a multifield. It
> looks like:
> "Countries" : {
> "Properties" : {
> "Name" : {
> "
y awesome.
>
> Thanks
>
> On Tuesday, March 31, 2015 at 00:42:45 UTC+3, Aaron Mefford wrote:
>
>> I understand that if you do not have sufficient storage space, then you
>> cannot manage a replica on every node. However, you are not limited to the
>> size of a "
From what I can see in your graphs I noticed two things. You seem to have
a spike in search requests at that time, a spike in HTTP traffic, and a
cache eviction right at the beginning of it.
Are you certain you don't have an external user with a cron job that runs
at the top of the hour? P
You need to read up a bit on how memory is allocated in Linux.
On an ElasticSearch or database server (this seems to be both), you want
that free column to be 0. All available free memory should be used to
cache files. In your snapshot you have 35GB of file cache listed under the
cached headi
In a single node cluster set the replica count to 0. The shards will clean
themselves up automatically. Don't forget to do regular backups as you
have no redundancy.
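As a sketch, the replica change is a one-field update to the index settings.
The snippet only builds the request body for PUT /_settings (all indices); the
helper name is made up for illustration, and sending the request is left to
your client of choice.

```python
import json


def replica_settings(count):
    """Build the body for an index settings update that sets the replica count."""
    if count < 0:
        raise ValueError("replica count cannot be negative")
    return {"index": {"number_of_replicas": count}}


if __name__ == "__main__":
    # Body to send with: PUT /_settings
    print(json.dumps(replica_settings(0)))
```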
On Friday, March 27, 2015 at 1:15:38 PM UTC-6, avery...@insecure-it.com
wrote:
>
> I have an elasticsearch host (single) that h
> magical ways that ES uses rather than lucene has its own.
>
> On Monday, March 30, 2015 at 18:55:49 UTC+3, Aaron Mefford wrote:
>>
>> "Automagic" routing happens already on hashing the document id. It
>> sounds like you may have a situation where your documen
"Automagic" routing happens already on hashing the document id. It sounds
like you may have a situation where your document id is creating a hot
spot. This being the case what you want is not automagic routing but more
control over the routing or a better document id. There is the ability to
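To illustrate the idea, a rough sketch of hash-based routing: the shard is
chosen as hash(routing value) modulo the number of primary shards, with the
document id as the default routing value. zlib.crc32 is only a stand-in here,
not the hash ElasticSearch actually uses, and the ids are invented.

```python
import zlib
from collections import Counter


def pick_shard(routing_value, num_primary_shards):
    """Map a routing value to a shard: hash(routing) % number of primary shards."""
    digest = zlib.crc32(routing_value.encode("utf-8"))  # stand-in hash (assumption)
    return digest % num_primary_shards


def shard_distribution(routing_values, num_primary_shards):
    """Count how many documents land on each shard."""
    return Counter(pick_shard(value, num_primary_shards) for value in routing_values)


if __name__ == "__main__":
    # Distinct ids are hashed individually and can land on different shards.
    print(shard_distribution((f"doc-{i}" for i in range(1000)), 5))
    # A single shared routing value sends every document to the same shard.
    print(shard_distribution(["customer-42"] * 1000, 5))
```

The second call shows how a hot spot forms: every document that shares a
routing value ends up on one shard, however many shards exist.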
I think part of what you may be missing is the intent that ElasticSearch
be scaled out rather than up. There are other issues that occur when you
scale up instead of out, the first of which is that losing a single node of
your cluster can be disastrous. It is also generally far more expensive
t to understand how to make node client work.
>
>
> Tuesday, March 17, 2015, 11:26 -06:00 from Aaron Mefford >:
>
> This is what I use in my code, not sure how correct it is given the
> abysmal state of the Java API documentation.
>
> import or
This is what I use in my code, not sure how correct it is given the abysmal
state of the Java API documentation.
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.client.Client;
import org.elasticsearch.clien
some script?
>
>
> On Friday, March 13, 2015 at 11:52:59 AM UTC-5, Aaron Mefford wrote:
>>
>> Yes it was m1.smalls that I first noticed the EBS throttling on. Things
>> work well in bursts, but sustained EBS does not work well. It will work
>> substantially better
Well.. I think I may see your issue.
I decoded this string:
L2hvbWUvYWhhcm1vbi90ZXN0L0EgUGx1cyAtIE1lZGlhIFBsYW4gU3VtbWFyeS54bHM=
It is:
/home/aharmon/test/A Plus - Media Plan Summary.xls
Another is:
/home/aharmon/test/A Plus - Summary by Venue.pdf
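You can check this yourself. A quick sketch using Python's standard base64
module shows that the string is an encoded file path, and that encoding the
path text gives the same string back. The helper names are only for
illustration.

```python
import base64

ENCODED = "L2hvbWUvYWhhcm1vbi90ZXN0L0EgUGx1cyAtIE1lZGlhIFBsYW4gU3VtbWFyeS54bHM="


def decode_text(encoded):
    """Decode a base64 string and interpret the bytes as UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_text(text):
    """Base64-encode a text string (not the contents of any file it names)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    path = decode_text(ENCODED)
    print(path)
    # Encoding the path text reproduces the original string.
    print(encode_text(path) == ENCODED)
```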
I think you misunderstand the purpose or how
pposed to use Tika to index the content of documents but it doesn't
> seem to be working correctly. I base64 encode the documents but it comes
> back as null when I decode it.
> On Friday, March 13, 2015 at 11:38:38 AM UTC-5, Aaron Mefford wrote:
>>
>> Not certain wha
ments but it doesn't
> seem to be working correctly. I base64 encode the documents but it comes
> back as null when I decode it.
> On Friday, March 13, 2015 at 11:38:38 AM UTC-5, Aaron Mefford wrote:
>>
>> Not certain what you are referring to so I expect not. I have used t
Weird, that was the post I made yesterday morning that just now hit the list
after vanishing.
On Thu, Mar 12, 2015 at 10:21 AM, wrote:
> I switched to using aliases about a year ago and I love it. I am able to
> rebuild in the background and make a clean cutover once the process
> completes.
>
>
Yes it was m1.smalls that I first noticed the EBS throttling on. Things
work well in bursts, but sustained EBS does not work well. It will work
substantially better in an m3.medium and if you are using the new EBS SSD
volumes.
On Thu, Mar 12, 2015 at 10:30 PM, Andy Nemzek wrote:
> Thank you gu
To second what Costin said, it is trivial to load the data into
ElasticSearch, and then you have the data ready to use the full power of
Kibana. ElasticSearch is very quick to set up and scales well.
Trying to go the route of rolling your own setup to generate graphs with D3
will certainly be harder,
> Do you have experience with the mapper attachment?
>
> On Friday, March 13, 2015 at 11:15:18 AM UTC-5, Aaron Mefford wrote:
>>
>> You're going to have the same issue with SOLR, putting the contents into
>> XML, which is even heavier than JSON.
>>
>> I wish that I
t would be a
> good solution to my problem.
>
> thanks,
> Austin
>
> On Thursday, March 12, 2015 at 4:04:29 PM UTC-5, Aaron Mefford wrote:
>>
>> Take a look at Apache Tika http://tika.apache.org/
Take a look at Apache Tika http://tika.apache.org/. It will allow you to
extract the contents of the documents for indexing; this is outside of the
scope of the ElasticSearch indexing. A good tool to make these files
downloadable is also out of scope, but I'll answer to what is in scope.
You need
I don't know that ES has any intelligence to support varied node sizes so I
would say yes they should be the same size. I've not looked into this so I
may be wrong.
Also, I use multiple EBS volumes in a software RAID to increase
non-provisioned IOPS. Not necessary if you use PIOPS.
Aaron
Sent
There are some new options in the latest builds of ElasticSearch as I
understand it that replace the old S3 Gateway.
However, neither the S3 Gateway nor those others are requirements for
setting up ElasticSearch on EC2. They are only disaster recovery options
that will help you to get back u
Is there any reason that match all queries would be impacted significantly
by index size?
It seems that in the absence of any sort, query or other mechanism
requiring scoring it should just be a matter of fetching the first document
from a shard. In practice that does not seem to be the case.