On Friday I changed the Java version to Oracle Java 7, and during the weekend
Elasticsearch froze on a mapping update again :-/
On Thursday, January 30, 2014 4:44:30 PM UTC+1, InquiringMind wrote:
Not sure if this is your problem, but OpenJDK 6 is very bad. OpenJDK 7 has
been known to work well.
Do not use Java 7u51; the Lucene bug is still not fixed.
https://twitter.com/thetaph1/status/423523708708208640
It also breaks Guava which is used in Elasticsearch.
https://code.google.com/p/guava-libraries/issues/detail?id=1635
Jörg
--
You received this message because you are subscribed to the
I think it makes sense to do it like this.
The only comment I have is that you should use BulkProcessor to send your new
documents.
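As a rough illustration of what such batching looks like on the wire: BulkProcessor belongs to the Java client, but the REST _bulk endpoint accepts the same kind of batch as newline-delimited JSON (an action line followed by a source line per document). The sketch below only builds such a body; the helper name build_bulk_body, the index name "products" and the type name "product" are made up for the example, and nothing is sent to a cluster.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Serialize (doc_id, source) pairs into a newline-delimited _bulk body."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where the following document goes.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # A bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical documents, for illustration only.
    print(build_bulk_body("products", "product",
                          [("1", {"name": "a"}), ("2", {"name": "b"})]))
```

The resulting string would be the request body of a POST to /_bulk; how many documents to put in one batch is a tuning decision this sketch does not address.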
I'm not sure I would use the Update API, because you already have the full _source
in the response hits.
So, updating on a client level could make sense.
If you don't want
Thank you very much, David!
Noted :)
Best regards,
Arinto
On Monday, February 10, 2014 4:38:13 PM UTC+8, David Pilato wrote:
I think it makes sense to do it like this.
The only comment I have is that you should use BulkProcessor to send your
new documents.
Not sure I will use Update API
On Thu, Feb 6, 2014 at 6:52 PM, Jun S. Kang jun.s.k...@gmail.com wrote
I am testing aggregations, and it seems that setting size=0 in terms
aggregations is not working as it should.
(Unless it was added in 1.1.0, as noted in this document.
Dear All,
I am using Elasticsearch in some of my APIs.
I have created the index and documents and have added data to the Elasticsearch
server from a MySQL database.
I am following 3 steps:
1. Delete the index using
curl -X DELETE 'http://localhost:9200/adminvenue/?pretty=true'
2. Create
Hello All
I have two indices.
index1 :
GET /_all/_search?pretty
{
  "facets": {
    "terms": {
      "terms": {
        "field": "prdId",
        "size": 9,
        "order": "count",
        "exclude": []
      },
      "facet_filter": {
        "fquery": {
          "query": {
            "filtered": {
You can use JDBC river plugin to fetch from database directly.
https://github.com/jprante/elasticsearch-river-jdbc
On Monday, February 10, 2014 8:40:31 PM UTC+11, Vallabh Bothre wrote:
Dear All,
I am using Elasticsearch in some of my APIs.
I have created the index and document and have
Hi,
I use Elasticsearch for logfile analysis. I use rolling indices on a daily
basis. I use 2 Elasticsearch servers behind a load balancer. The data is
sent to the load balancer and then inserted on the corresponding server. I use
1 index with 1 shard and 1 replica. So there is one file on both
I'd add another node into the cluster to allow easier quorum and prevent
split brain.
Then split the index into (at least) 3 shards to spread the load. Ideally
you want to try to get one shard per node.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email:
Hello,
Thanks to everyone for the replies.
Apologies if there was ambiguity in my original post, allow me to clarify a
few things:
VMware: VMware vSphere 5.1 (ESXi-5.1.0-20121004001-standard)
It was the entire virtual machine that stopped responding (not VMware or
the physical server ESXi
Hello,
I have a question about the deprecation of the boost field in 1.x. What is
now the proper way (with regard to performance) to achieve similar
functionality?
I've run some tests (using the Java API, version 1.0.0.RC2) comparing
querying using the _all field, MultiMatch and
Can't you add productName in your first documents (first index) in addition or
in place of productId?
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 10 February 2014 at 11:03:27, Nick Chang (nick.ch...@kland.com.tw) wrote:
Hello All
I have two
Hi,
I've created a river plugin for AWS DynamoDB. It can fetch data from
DynamoDB and index into Elasticsearch.
https://github.com/kzwang/elasticsearch-river-dynamodb
Thanks,
Kevin
Hi,
I am trying to match the phrase "repair iphone".
With the help of explain I get a match, but I cannot tell how many times the
whole phrase occurs and in which field.
Hey,
not sure if I made my point clear here:
minimum_master_nodes is an important setting when a master node election
is about to happen (it is not checked for each write of a document).
write_consistency is a check made when a document is about to be stored.
Also note that this check
Hey,
you need to replace the query_string query with a match query, which in
turn supports cutoff_frequency. Also make sure you change the default
operator to AND. If you want to support multiple fields, you will need to
use the multi_match query
see
Hey,
you can easily create custom analyzers, which use the thai analyzer and the
ngram token filter. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
--Alex
On Fri, Feb 7, 2014 at 6:57 AM, Min Cha minslo...@gmail.com wrote:
Hi folks.
Hey,
the date field does not get loaded from the source, but from a memory
structure called fielddata. On certain occasions this fielddata structure
needs to be removed and then recreated. If you fire a query after a removal
and the structure has not yet been recreated, you will get into
Hey Jorge,
you can have two exactly identical inputs. However, they need to have
different outputs in order to appear as different suggestions (as the outputs are
what matters for the data being returned).
--Alex
On Sat, Feb 8, 2014 at 4:02 AM, Jorge Sanchez xsa...@gmail.com wrote:
Hi,
really
Hey,
you may want to use index templates, so that you do not have to
check this inside your application and can simply rely on the
mapping being configured as expected. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html
--Alex
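A minimal sketch of what such a template could look like, assuming the 1.x-era REST format in which the body carries a "template" index pattern next to the mappings. The helper build_index_template, the pattern "logs-*", the type "event" and the field "message" are all hypothetical names chosen for the example; the code only builds the body and does not talk to a cluster.

```python
import json


def build_index_template(pattern, mappings, settings=None):
    """Build the body for PUT /_template/<name> (1.x-style 'template' key)."""
    body = {"template": pattern, "mappings": mappings}
    if settings:
        # Optional index settings applied to every matching new index.
        body["settings"] = settings
    return body


if __name__ == "__main__":
    # Hypothetical pattern, type and field, for illustration only.
    template = build_index_template(
        "logs-*",
        {"event": {"properties": {"message": {"type": "string"}}}},
        settings={"number_of_shards": 1},
    )
    print(json.dumps(template, indent=2))
```

Any index created later whose name matches the pattern would pick up these mappings, which is what lets the application skip its own mapping checks.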
Thank you Kevin for your reply,
but I was looking for a command that updates the data directly,
without deleting and recreating the indexes.
Is there any command like curl -X UPDATE in Elasticsearch
that will not add data that already exists?
On Monday, February 10, 2014
Vallabh,
You can do full document updates/replace by simply using POST
http://server:9200/index/type/id. Just specify the document ID of the
document you want to update and it will replace that document.
You can also do partial updates of an existing document like this:
Hi:
I want to know how ES works and why it needs so many file
descriptors; when I am reindexing it uses a lot. Could someone
explain, or share some links, to help me understand why it opens so many files and
whether this degrades server performance?
Regards
ps: I change on
Jörg
I think that system would work only for a two-node setup. Otherwise, how
can we be sure that, when letting Elasticsearch put a replica wherever it
wants, there is going to be a copy on the nodes tagged as search? Say you
have three indexing nodes and three searching nodes, and you go
Hi,
I'm trying to force ElasticSearch embedded Java node client to use ppp0
interface instead of providing IP address (because it changes), but somehow
it doesn't work. That interface is discovered during the startup but later
ES still uses eth0 IP address.
ppp0display_name [ppp0]
I'm handling about 300,000,000 log records with Elasticsearch and Kibana,
using an AWS instance with 8GB of memory, with the ES heap set to -Xms4g -Xmx4g.
It often fails with an OutOfMemory error (Java heap memory).
If I use Elasticsearch with Hadoop HDFS, like pre- map reducer, will it
help with memory
This might help:
http://lucene.apache.org/core/3_0_3/fileformats.html#Index%20File%20Formats
Keep in mind that each shard in ES is a Lucene index, which is why lots of
file descriptors are needed. In other words, it's normal.
My understanding was that the scope of the JDK/Guava bug was limited and did
not affect Lucene/Solr. Elasticsearch uses Guava for collections and caching,
not sure about reflection.
--
Ivan
joergpra...@gmail.com wrote:
Do not use Java 7u51, Lucene bug still not fixed
I took the latest patch from the Lucene issue and patched the 4.6.0 tag and
started using that jar in an instance (running 0.90.8). It looks like that
instance hit the condition that previously would have caused the infinite
loop. Our logs filled up with exception messages about failing to
Hi guys,
I intend to embed Elasticsearch in my web application.
Is there some kind of optimization for REST calls (for indexing a document, for
example) so that no HTTP traffic is generated when Elasticsearch is running in
the same virtual machine as the service caller? (Instead of generating an
Thanks. The split brain problem aside: Is it faster for elasticsearch to
read a shard than a replica?
On Monday, February 10, 2014 11:44:41 AM UTC+1, Mark Walkom wrote:
I'd add another node into the cluster to allow easier quorum and prevent
split brain.
Then split the index into (at least)
Am unable to save my settings.
First, I found that the existing Dashboard Save is a bit unorthodox and
therefore confusing.
I don't think it's clear that the field populated by default with "Marvel
Overview" is an editable input field; to me it looks like a label.
And therefore, next to the
Hi,
ES is running on 64-bit Ubuntu in a VM environment with the following memory
configuration:
12 gb total memory
5 gb elasticsearch
~3 gb other processes
and 4 gb left for OS.
ES cluster configured with 3 nodes.
In the elasticsearch.yml bootstrap.mlockall set to true. On one of the
nodes I
Are elasticsearch nodes and client on the same version?
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 10 February 2014 at 18:13:24, redrubia (ruby.chil...@gmail.com) wrote:
I just wondered how do you find out the exact version of Java your
How'd you find out the versions? I've looked in status and no luck
On Monday, 10 February 2014 12:18:27 UTC-5, David Pilato wrote:
Are elasticsearch nodes and client on the same version?
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet |
Hello,
I am new to the Elasticsearch Hadoop integration and was attempting to use
an external table to index certain fields. The table creation is
successful. However, when I try to do an insert I get the following error
message:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
An interesting fact is that your stack traces point to a Lucene 3.x field RAM
size estimator method, which might have some as yet unidentified issues.
This indicates to me that you still have an index originating from an early
ES version (0.90?)
If possible, I recommend reindexing the data with 1.0.0.RC2,
How'd you find this out? I looked at node status with no luck!
Rondan,
If you're using REST/HTTP, you can use the Delete Index API to delete an
index easily:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-delete-index.html
James,
Sounds like you want to query the parents and then score/rank them by the
sum of the userScore values of their corresponding children. You can try
something like this:
POST http://localhost:9200/index/parent_search
{
  "query": {
    "has_child": {
      "type": "child",
      "score_type": "sum",
Thanks, yes, we are using this, but with the Java API. Is there any formula
with which I can predict how many open files I will have if my indices grow
every week, in order to configure the correct number of file descriptors?
Because when i delete an index and it will create again all (index and all
You can use an alias to enable switching between two indexes. Therefore,
you can build a new index while allowing searching against the old one.
When the new one is successfully built change the alias to point to the new
one.
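The switch itself can be sketched as a single _aliases request that removes the alias from the old index and adds it to the new one in the same call. This is an illustrative sketch only: the helper build_alias_swap and the names "venues", "venues_v1" and "venues_v2" are assumptions, and the code builds the request body without sending it.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the body for POST /_aliases that repoints an alias in one request."""
    return {
        "actions": [
            # Stop serving searches from the old index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and start serving them from the freshly built one.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical alias and index names, for illustration only.
    print(json.dumps(build_alias_swap("venues", "venues_v1", "venues_v2"), indent=2))
```

Because both actions travel in one request, clients that search through the alias never need to know which physical index is behind it.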
On Mon, Feb 10, 2014 at 1:40 AM, Vallabh Bothre
BTW, I cannot believe 7u51 did not fix the previous Lucene bug since it has
been known for a while. So frustrating. ETA for 7u60 is May. 7u25 will be
almost a year old by then.
--
Ivan
On Mon, Feb 10, 2014 at 8:15 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:
Yes, this time ES is
Jörg,
Thanks for the tip! I just updated my client code and it works great! Will
be a help during future problem determination!
Brian
On Monday, February 10, 2014 1:07:24 PM UTC-5, Jörg Prante wrote:
To find out Java versions, do the following.
On server, execute curl
Thanks for all the answers guys!
One final thought: Theoretically, if every node has only one shard and no
replicas, each node could only search the data it has and no redundant data.
Shouldn't that have a (small) impact on the indexing/searching of the data?
Hey all!
Background: I am using elasticsearch with logstash to do some log analysis.
My use-case is write-heavy, and I have configured ES accordingly. After
experimenting with different setups, I am considering the following
implementation:
*separate log processing from ES cluster*
1x
If you have an index per (some time period) then you can create the new indexes
with more shards when you have more hardware and leave the old ones with their
old number. You can also allocate more shards than nodes in preparation for
getting more nodes. I believe those are the standard tactics
Thanks a lot :)
On Mon, Feb 10, 2014 at 4:06 PM, Binh Ly b...@hibalo.com wrote:
Rondan,
For Java, should be something like this:
DeleteIndexResponse deleteIndexResponse = new
DeleteIndexRequestBuilder(client.admin().indices(),
INDEX_NAME).execute().actionGet();
if
Thank you! This helped a lot. Seems my ES install via Boxen was 0.90.3, but
I was importing 0.90.5 into the project
If you are tired of Java 7, use Java 8 ;)
https://twitter.com/thetaph1/status/410494532816760832
Oracle Java 8 FCS is already out. GA version will be ready in March.
But I know of issues with Java 8 and MVEL, so if you use MVEL scripts, be
aware...
Jörg
On Mon, Feb 10, 2014 at 8:17 PM, Ivan
I have a suggestion response being returned to me. One example is:
{
  "companies-1391214959789" : [ {
    "text" : "wells",
    "offset" : 0,
    "length" : 5,
    "options" : [ {
      "text" : "Wells Fargo",
      "score" : 3.0, "payload" : {"industry_id":100,"company_id":1}
    }, {
      "text" : "Wells Real Estate
In recent versions, it should be compatible but those two versions are too old!
You should update both to 0.90.11
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 10 February 2014 at 21:05:54, redrubia (ruby.chil...@gmail.com) wrote:
Thank you! This
Do not put a single-node cluster (no replicas) in production. Never use
single nodes except for development and demos. Always use replicas and at
least 3 nodes with minimum master nodes = 2 to avoid split brain in
production.
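The "3 nodes, minimum master nodes = 2" pairing follows the usual majority rule: a master may only be elected when more than half of the master-eligible nodes are visible. A small sketch of that arithmetic, where the function name minimum_master_nodes is simply borrowed from the setting for readability:

```python
def minimum_master_nodes(master_eligible_nodes):
    """Return the majority (quorum) size for a number of master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division: strictly more than half of the nodes.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for nodes in (1, 2, 3, 5):
        print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

With two master-eligible nodes the quorum is also 2, so losing either node blocks elections; that is why three is the practical minimum for a setup that should survive one node going away.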
Having 17G RAM on a master-only server is more than enough of a beefy
server
Although others might be able to comment,
IMO you need to provide more information, eg
Virtualization technology
What else is running in each node.
So, for comparison, this is my lab setup. I've been testing with approx.
1GB data that has created up to 15GB of additional metadata. I've been
Yeah, unfortunately work constraints mean versioning must be the same. We
are upgrading soon though :)
On Monday, 10 February 2014 15:21:22 UTC-5, David Pilato wrote:
In recent versions, it should be compatible but those two versions are too
old!
You should update both to 0.90.11
--
Can someone help me out with this, please?
On Thursday, 6 February 2014 20:58:25 UTC+5:30, Vinod wrote:
Right now my ES search thread pool config is:
"transient" : {
  "threadpool.search.queue_size" : 200,
  "threadpool.search.reject_policy" : "abort",
  "threadpool.search.size" : 12
}
I've verified that shards are re-allocating after a cluster restart (again,
I'm using 1.0 RC1).
To test this specifically, I loaded a small dataset (can take a very long
time to verify results on a large dataset).
Easy to verify.
In a 5 node cluster, load some apache data. (I loaded only a
Thanks for the responses! After reading up on the split brain problem, I
am moving to a three-node cluster with one master-only (on logstash
server), one master/data, and one data-only server
How are you seeing this, ie what monitoring are you using?
Also, what exactly are the "~3 gb other processes"?
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 11 February 2014 04:16, Vahid vhasan...@gmail.com
Hi Doru,
Based on what I've seen (but of course may be criticized by others)
I don't know if there is a sufficient reason to not use the REST API and
use something else instead.
What is your concern? Network traffic? AFAIK if localhost is bound to lo or
something similar, that isn't an issue.
Hi,
We are using ElasticSearch for navigating through our product catalog. We
have fairly simple documents like:
{
_index: catalog,
_type: product,
_id: 476,
_score: 1,
_source: {
id: 476,
I recently finished deploying an ES/Logstash cluster for a small
environment. It's a two-node local cluster but I'm getting horrible
performance and frequent crashes. I'll soon be standing up a cluster in
another environment that's roughly 10x the size of this first one so I've
got to figure
We set up our ES server on EC2 using the chef setup described here:
http://www.elasticsearch.org/tutorials/deploying-elasticsearch-with-chef-solo/
We set up auth to it and then tried to use the elasticsearch-js client to
connect to it and thus began our problems.
At first it wasn't clear what was
What's the total amount of indexed data (gb and count)? What about your
heap size, shard count, replica count, ES and java versions?
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 11 February 2014 10:03, Harry
Would it be reasonable to create an issue to request nautical miles (nm
as the abbreviation) for the DistanceUnit enumeration?
This would make it much more natural to adapt Elasticsearch for aircraft
planning / charting applications. Everything in that world is in nautical
miles and knots
Aircraft use nautical miles? You learn something new every day!
--
Ivan
On Mon, Feb 10, 2014 at 3:21 PM, InquiringMind brian.from...@gmail.com wrote:
Would it be reasonable to create an issue to request nautical miles (nm
as the abbreviation) for the DistanceUnit enumeration?
This would
One nautical mile is one minute of arc along a meridian (one minute
of latitude), which makes it very easy to calculate distance on a chart
(independent of the vehicle used :)
http://en.wikipedia.org/wiki/Nautical_mile
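To make the chart arithmetic concrete, here is a small sketch that treats one minute of arc along a meridian as one nautical mile and uses the international definition of 1852 metres per nautical mile. The function names are invented for the example, and the conversion only covers north-south distance along a meridian, not general great-circle distance.

```python
METERS_PER_NAUTICAL_MILE = 1852  # international definition


def latitude_delta_to_nautical_miles(lat1_deg, lat2_deg):
    """North-south distance between two latitudes on the same meridian."""
    # One degree is 60 minutes of arc, and one minute is one nautical mile.
    return abs(lat2_deg - lat1_deg) * 60.0


def nautical_miles_to_km(nautical_miles):
    """Convert nautical miles to kilometres."""
    return nautical_miles * METERS_PER_NAUTICAL_MILE / 1000.0


if __name__ == "__main__":
    nm = latitude_delta_to_nautical_miles(10.0, 11.0)
    print(nm, "nm =", nautical_miles_to_km(nm), "km")
```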
On Mon, Feb 10, 2014 at 3:27 PM, Ivan Brusic i...@brusic.com wrote:
I'm using random_score to perform a search with some random sorting,
something as simple as this:
{
  "fields": [
    "id"
  ],
  "query": {
    "function_score": {
      "random_score": {
        "seed": 773372
      }
    }
  },
  "sort": [
    {
      "_score": "desc"
    }
  ]
}
As soon as I update a doc in
Well, good grief, the answer to the poster's original question is PERHAPS
(not yes!). It depends on the timing between the search request and the
(asynchronous) indexing. There is a short (but finite) amount of time after
the update request that a search will still return the previous version. At
Hi all,
Is QueryBuilder implementation thread-safe?
I plan to use one static QueryBuilder class that is shared by several
threads.
Best regards,
Arinto
Is it possible to use path_match to consider sub fields as criteria in
applying the template?
For example, I have a dynamic template like this, but it isn't really what
I want:
"dynamic_templates": [
  {
    "nested_property_attributes": {
      "match_mapping_type": "object",
None of the builders are thread safe, no. I believe they are all quite
light weight though.
On Mon, Feb 10, 2014 at 9:25 PM, Arinto Murdopo ari...@gmail.com wrote:
Hi all,
Is QueryBuilder implementation thread-safe?
I plan to use one static QueryBuilder class that is shared by several
Okay, noted.
Thanks and best regards,
Arinto
On 02/11/2014 10:42 AM, Nikolas Everett wrote:
None of the builders are thread safe, no. I believe they are all
quite light weight though.
On Mon, Feb 10, 2014 at 9:25 PM, Arinto Murdopo ari...@gmail.com
mailto:ari...@gmail.com wrote:
Hi
Here's my finding on the _timestamp field.
curl -XPOST 'localhost:9200/bam' -d '{
  "mappings" : {
    "_default_" : {
      "_timestamp" : { "enabled" : true }
    }
  }
}'
curl -XPOST 'localhost:9200/bam/bam/' -d '{
  "bam" : "bam"
}'
curl
Is there anyone with an answer at all? I'm feeling hopeless without any
clue.
Hello
The data comes from MongoDB and MySQL.
These are different input sources,
so I can't add productName to the first documents.
Thanks
On Monday, February 10, 2014 7:27:07 PM UTC+8, David Pilato wrote:
Can't you add productName in your first documents (first index) in
addition or in place of productId?
--
David Pilato |
I changed elasticsearch.yml:
index.number_of_shards: 10
Does increasing the number of shards help?
As I understand the ES architecture, a shard corresponds to a Lucene thread.
For now, I'm more worried about memory than about performance and speed.
On Monday, February 10, 2014 11:36:58 PM UTC+9, Binh Ly wrote:
In
I was searching for information about ES with HDFS. From what I see, using ES with
Hadoop does not mean using HDFS as the main storage for ES.
ES writes indexes to HDFS every 10 sec. as a backup. Is that right?
Is there any way that I can use HDFS as the main storage for Elasticsearch?
I'm using AWS and the
Hi ,
how can I highlight the matched words in Elasticsearch if I am firing a query
like
{
  "explain": true,
  "query": {
    "match": {
      "name": "age of kamal"
    }
  }
}
How can I highlight the results that contain "age of kamal"?
Thanks
Navneet Mathpal
Which process pulls your data from MongoDB and MySQL and pushes it to Elasticsearch?
If you can't do that at index time, then I guess you will need to manage that
on a client level.
I mean: using the response you already have, make a second call to Elasticsearch
(multisearch) to get the names.
BTW, if you
Why is that unexpected?
If a field is not stored, you cannot retrieve it. What's wrong with that?
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 11 Feb 2014 at 04:34, Khoa Nguyen huu.khoa.ngu...@gmail.com wrote:
Here's my finding on the _timestamp field.
curl -XPOST
hi,
I had a stable ES cluster on AWS EC2 instances until a week ago. I
don't know what's going on, but my cluster keeps getting into a bad state
every few hours. The error says OOM, but I know that is not the reason:
the instance has enough heap space left. I'm running ES version 0.90.6 and
Dear All,
I am using Elasticsearch for venue search; each venue has a title, latitude and
longitude.
I have about 4 lakh (400,000) records.
My concern is that I want a text search that returns the
relevant results with the nearest distance.
Right now i am using geo_distance
you can see my below
Hi,
I am using a three node cluster with two ip address on each server. I want
http and transport addresses to be separated to avoid network congestion.
http.host: 192.168.1.113 (bound to eth0)
transport.host: 192.168.10.116 (bound to eth1)
and similar for all other nodes.
The nodes are
Hi all,
Is there any best practice in generating document ID in ElasticSearch?
Let's say we want to evenly distribute the data in the cluster and be able
to update the document fast.
Let's say my document is a user information with this JSON format, and I
index all the fields.
#2. It's a hash, so you'll be fine, and get is always faster than search. A lot.
Sent from my iPhone
On Feb 10, 2014, at 10:26 PM, Arinto Murdopo ari...@gmail.com wrote:
Hi all,
Is there any best practice in generating document ID in ElasticSearch? Let's
say we want to evenly distribute
That is a misconception. To avoid split-brain, it is a good idea to set up
master-only (data-less) nodes so they cannot face heap problems. But that
is not all, more important is to have at least three master eligible nodes,
in case you get network disruptions. Additionally, you should set up at