Is it technically feasible to drill down and see the disk usage incurred at
the field level?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-stats.html
provides the storage cost at the index level ('store').
I would like to know the storage cost incurred by each of the
Hi,
I read it here
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html)
that the _source field needs to be enabled for Update API to work.
Does it mean that, from the Java or REST API, I cannot update any field
defined in the type mapping unless _source is
There is no limit in ES.
Each type uses a certain amount of heap for caching ids and the mapping.
You can create types / mappings until heap explodes. Each modification of a
mapping is propagated through the cluster, which is not a cheap operation.
You have to test for yourself whether your design
No you can't, as behind the scenes the full document is removed and inserted
with the new values (as a new version).
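For reference, a partial update through the Update API (which requires _source to be enabled) looks like this; the index, type, and field names here are made up for illustration. Internally, ES still rebuilds the whole document from _source:

```json
POST /myindex/mytype/1/_update
{
  "doc" : { "status" : "updated" }
}
```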
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 30 Jul 2014 at 08:52, 'Sandeep Ramesh Khanzode' via elasticsearch
elasticsearch@googlegroups.com wrote:
Hi,
I
Hi,
Does anyone have a clue about this? Maybe additional logs are needed to
nail this one?
thanks.
On Tuesday, July 29, 2014 10:14:42 AM UTC+3, Idan wrote:
I have status red on the Marvel dashboard. If I check the 'Shard
allocation' tab on the overview I see this error:
Oops!
Hello,
Right now I have two nodes in my ES (part of ELK stack) cluster and 1 shard
for each index. I would like to change number of shards to two for future
indices. Can I do this by changing config file and restarting logstash?
Will it change number of shards for indices created after
Typo in my previous message, here's corrected post:
Hello,
Right now I have two nodes in my ES (part of ELK stack) cluster and 1 shard
for each index. I would like to change number of shards to two for future
indices. Can I do this by changing config file and restarting ES? Will it
change
It doesn't change existing indices, only new ones.
You can either do the setting change via the API or in the config, if you
choose the latter you will need a restart.
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html
Regards,
Mark Walkom
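As a sketch of those two options (the index name below is illustrative), future indices can get two shards either per create-index request:

```json
PUT /newindex
{
  "settings" : {
    "index" : { "number_of_shards" : 2 }
  }
}
```

or via `index.number_of_shards: 2` in elasticsearch.yml, which only applies to indices created after the restart.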
Would you be able to re-run your query and post the stack trace from the
Elasticsearch server logs? This might help to work out what's going on.
Thanks
Colin
On Tuesday, 29 July 2014 12:29:00 UTC+1, Valentin wrote:
Ok. I think I found the problem. As soon as I try to sort on the script
Also, your shard_size parameter should always be greater than the size
parameter. So if you are asking for size of 10 then I would try setting
shard_size to 20 or 30.
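For example, a terms aggregation following that rule of thumb might look like this (index and field names are illustrative):

```json
POST /myindex/_search?search_type=count
{
  "aggs" : {
    "top_terms" : {
      "terms" : { "field" : "category", "size" : 10, "shard_size" : 30 }
    }
  }
}
```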
On Wednesday, 30 July 2014 09:22:16 UTC+1, Colin Goodheart-Smithe wrote:
Would you be able to re-run your query and post the
For my test case it's the same every time. In the real query it will
change every time, but I planned to not cache this filter and have a less
granular date filter in the bool filter that would be cached. However while
debugging I noticed slowness with the date range filters even while testing
Hello,
I am trying to create an index using the CSV River plugin for Elasticsearch
(https://github.com/AgileWorksOrg/elasticsearch-river-csv); my CSV file
contains *String*, *long* and *date* values.
My problem is:
- Elasticsearch always considers *long* values as *string* (with the default
mapping)
Hello,
I have a project using Play framework version 1.2.7 which used ES 1.1.1.
I wanted to update it to the latest and greatest (1.3.1), but encountered
the following exception when running the unit tests within the play
framework:
An unexpected error occured caused by exception
Hi David,
I tried, as you suggested, to activate dynamic scripting and to force
groovy as a default_lang but the results stay unchanged.
And yeah, no other node on the cluster.. Here's the test's output logs:
TestClient: Loading config files...
TestClient: Creating local node...
juil. 30,
I think you are doing something wrong.
If you defined a mapping it should not be overwritten by the CSV river as far
as I know.
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 30 July 2014 at 10:31:07, Amirah (beldjilal...@gmail.com) wrote:
Hello,
Maybe a stupid question: why did you put that filter inside a query and
not within the same filter you have at the end?
For my test case it's the same every time. In the real query it will
change every time, but I planned to not cache this filter and have a less
granular date filter in the
Thanks for the answer,
Am creating and defining my mapping ( and index) as following :
PUT /newindex/
PUT /newindex/_mapping
{
  "newindex" : {
    "properties" : {
      "MyStringValue" : { "type" : "string" },
      "MyLongValue" : { "type" : "long" },
      "MyDateValue" : { "type" : "date" }
    }
  }
}
Is there a way to have a native java script accessible in integration
tests? In my integration tests I am creating a test node in the /tmp
folder.
I've tried copying the script to /tmp/plugins/scripts but that was quite
hopeful and unfortunately does not work.
Desperate for help.
Thanks
--
I might be wrong but I think that scripts should be located in config/scripts,
right?
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 30 July 2014 at 11:31:10, Nick T (nttod...@gmail.com) wrote:
Is there a way to have a native java script
You should try to add groovy jar to your classpath. It is not in the
dependencies in Maven's pom.xml.
Example:
<dependency>
  <groupId>org.codehaus.groovy</groupId>
  <artifactId>groovy-all</artifactId>
  <version>2.3.5</version>
</dependency>
This is a dependency problem. Check your classpath if you have clean
dependencies to ES 1.3.1 code only.
Jörg
On Wed, Jul 30, 2014 at 10:41 AM, gregorymaertens via elasticsearch
elasticsearch@googlegroups.com wrote:
Hello,
I have a project using Play framework version 1.2.7 which used ES
Hi Colin,
I tried increasing it up to 40 but nothing changed. I would post the stack
trace but I don't know where to find it.
Thanks
Valentin
On Wednesday, July 30, 2014 10:24:09 AM UTC+2, Colin Goodheart-Smithe wrote:
Also, your shard_size parameter should always be greater than the size
Nice catch Jörg, that indeed did the trick.
@David Shouldn't groovy be bundled in the ES jar if it's the new default ?
Will it be provided by ES when i run on a live cluster ?
Thanks!
On Wednesday, July 30, 2014 11:41:23 AM UTC+2, Jörg Prante wrote:
You should try to add groovy jar to your
The ES team decided to postpone Groovy as the default until Elasticsearch
1.4.
In 1.3, MVEL is still the default, so authors have some time to rewrite
their scripts if they prefer to. So I think it is ok not to include the
Groovy jar by default, and make it optional for those who want to switch
Hi,
I wanted to ask whether the version of cloud-aws plugin is 2.1.1 for
elasticsearch 1.3.1, by looking at the github page:
https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.3
How come the plugin version for 1.3.1 of elasticsearch goes backwards? For
elasticsearch 1.2.x the
Hi,
I have tried the same approach and it worked for me, meaning to copy the
script I want to perform an integration test and run my IT.
I do the following steps
1) Setup the required paths for elasticsearch
final Settings settings
= settingsBuilder()
Ha! Right! Thanks Jörg!
I forgot that I ran into the same issue recently. I should add more memory to my
brain cluster :)
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 30 July 2014 at 12:08:58, joergpra...@gmail.com (joergpra...@gmail.com)
wrote:
I have noticed that you mention a native Java script, so have you implemented
it as a plugin?
If so, try the following in your settings:
final Settings settings
    = settingsBuilder()
        ...
        .put("plugin.types", YourPlugin.class.getName())
Thomas
On Wednesday, 30
This looks strange to me
PUT /newindex/_mapping
{
  "newindex" : {
    "properties" : {
      "MyStringValue" : { "type" : "string" },
      "MyLongValue" : { "type" : "long" },
      "MyDateValue" : { "type" : "date" }
    }
  }
}
}
What is your type name?
--
David Pilato | Technical Advocate |
Ok well, anyway I think you may want to update the docs about this, because I
think I won't be the only one facing this :)
Thanks again to both of you.
On Wednesday, July 30, 2014 12:30:09 PM UTC+2, David Pilato wrote:
Ha! Right! Thanks Jörg!
I forgot that I ran into the same issue recently. I
Don't use the `and` filter - use the `bool` filter instead. They have
different execution modes and the `bool` filter works best with bitset
filters (but also knows how to handle non-bitset filters like geo etc).
Just remove the `and`, `or` and `not` filters from your DSL vocabulary.
Also, not
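A minimal sketch of the suggested rewrite, assuming a query that previously wrapped two filters in an `and` (the field names below are invented for illustration):

```json
POST /myindex/_search
{
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : [
            { "term" : { "status" : "active" } },
            { "range" : { "timestamp" : { "gte" : "2014-07-01" } } }
          ]
        }
      }
    }
  }
}
```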
I am also interested in this question.
I have found a fairly old code snippet [1] to calculate the cosine
similarity in lucene, but I was wondering if elasticsearch provided an
easier API to access this information.
[1]
There is a missing part (copy-paste error): /_river/
So, yes, I use this
PUT /_river/newindex/_mapping
{
  "newindex" : {
    "properties" : {
      "MyStringValue" : { "type" : "string" },
      "MyLongValue" : { "type" : "long" },
      "MyDateValue" : { "type" : "date" }
    }
  }
}
}
to create the mapping, my
That's the problem.
A river creates documents in an index other than _river.
If I look at the river documentation, you can set it using:
"index" : {
  "index" : "my_csv_data",
  "type" : "csv_type",
  "bulk_size" : 100,
  "bulk_threshold" : 10
}
So basically, you need to define
Hi - any tips on how I should configure the logging.yml file to give me
more verbose output, including the source IP address if possible, when an
index is created?
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe
On Tuesday, July 29, 2014 3:27:13 PM UTC-4, Ivan Brusic wrote:
Have you changed your gateway settings?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after
It still remains a bit of black magic to me. Sometimes it works, sometimes
it
Thanks for the explanation! I'll switch over for the next time I need to
reindex.
On Tue, Jul 29, 2014 at 6:35 PM, Michael McCandless m...@elasticsearch.com
wrote:
Disabling refresh (-1) is a good choice if you are fully maximizing your
cluster's CPU/IO resources (using enough bulk client
Thanks for your response.
I am using the NEST dll (.NET) to index the data in ES (running on Windows as
a service). How do I add the geo_point to my index columns?
Regards
Madhavan.TR
On Tuesday, July 29, 2014 3:35:56 PM UTC-5, David Pilato wrote:
No you can't out of the box. If you want to use built
I don't really see the problem, I selected my newindex (it exists in my
mapping with my types)
PUT /newindex/
PUT /_river/newindex/_mapping
{
  "newindex" : {
    "properties" : {
      "marques" : { "type" : "string" },
      "ventes" : { "type" : "long" },
      "mois" : { "type" : "date" }
    }
  }
}
PUT
Hello,
Is there any way to specify a parameter value on the Java command line
behind the JDBC river?
I am thinking of -Duser.timezone=Europe/Istanbul, for example.
When I try to create a JDBC river for an Oracle database (with the jprante
plugin) I get this error.
Thanks
You applied a mapping to index _river and type newindex.
This is not what I said. You need to apply your mapping to newindex index and
newindex type.
Basically something like:
PUT /newindex/
PUT /newindex/newindex/_mapping
{
  "newindex" : {
    "properties" : {
      "marques" : { "type" : "string" },
Ah, yes, I didn't specify the type. Thank you so much for your help!
On 30 July 2014 16:03, David Pilato da...@pilato.fr wrote:
You applied a mapping to index _river and type newindex.
This is not what I said. You need to apply your mapping to newindex index
and newindex type.
Basically
Hi
I have updated the mapping for my index and added a column with geo_point:
location : geo_point
When I search without a geo filter, I can see the location information.
{
  "_index" : "offlocations_geo",
  "_type" : "officelocations",
  "_id" : "21",
Just checking if anybody knows the answer...
On Monday, July 28, 2014 4:14:59 PM UTC-4, Arkadiy Rudin wrote:
Looks like the percolator queries are not getting recorded in any of
existing slow query logs.
Is it something that I am missing in the configuration, or is logging for
the percolator not
Hi !
Use a query.
Ex:
{
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_distance" : {
          "distance" : "50km",
          "city.location" : {
            "lat" : 43.4,
            "lon" : 5.4
I am using Java client API to get aggregations back. Following is the
structure which I am dealing with.
aggregations
  top_models
    buckets
      key : "BMW"
      doc_count : 3
      top_models
        buckets
          key : "X5"
          doc_count :
Hello,
We wish to set up an entire ELK system with the following features:
- Input from Logstash shippers located on 400 Linux VMs. Only a handful
of log sources on each VM.
- Data retention for 30 days, which is roughly 2TB of data in indexed ES
JSON form (not including replica
The client has addTransportAddress(), so I can add all cluster nodes. Is
that the intended way? Or what considerations must be taken into
account when adding hosts?
Thanks for the detailed reply.
I am a bit confused about and vs bool filter execution. I read this post
http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/ on
the elasticsearch blog. From that, I thought the bool filter would work by
basically creating a bitset for the
You should add as many nodes as possible. If you enable client.transport.sniff,
then the transport client will ask the nodes it does connect to about the
other nodes in the cluster, which means you potentially only need to
specify a single node (not ideal in case that node is down).
--
Ivan
Nope, it did not work. I got an exception:
QueryParsingException[[offlocations_geo] failed to find geo_point field
[city.location
Regards
Madhavan.TR
On Wednesday, July 30, 2014 10:08:45 AM UTC-5, Joffrey Hercule wrote:
Hi !
Use query.
ex :
{
query : {
filtered : {
query :
Just FYI, if anyone else runs into the same troubles, Groovy seems to be
provided on a real cluster and it's in version 2.3.2.
On Wednesday, July 30, 2014 1:19:17 PM UTC+2, Laurent T. wrote:
Ok well, anyway i think you may want to update the docs about this cause i
think i won't be the only
On Wednesday, July 30, 2014 8:13:28 PM UTC+4, Ivan Brusic wrote:
You should add as many nodes as possible. If you
enable client.transport.sniff, then the transport client will ask the nodes
it does connect to about the other nodes in the cluster, which means you
can potentially only need to
I've been implementing an ELK stack for the past year or so. I had thought
that we would have plenty of space, but recently added a log source that
increased the number of log entries a day by around 30x. That prompted me
to start looking into ways of managing ES's data storage in order to keep
The logging.yml file will only control which logging statements get
output, not the amount of information they may contain.
The log line in question does not have the source ip, which is long gone by
the time the service gets the request.
I've got an environment set up on Dev that should keep a log of every query
run, but it's not writing anything. I'm using the slow-log feature for
it...
These are my thresholds on the elasticsearch.yml:
http://pastebin.com/raw.php?i=qfwnruhD
And this is my whole logging.yml:
The idea is that shard movement should be delayed when a cluster rebalance
occurs, but even with these settings, I often find that shards are moved
immediately.
Are you using the default store throttling settings? I found them to be
quite low.
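For reference, store throttling can be adjusted at runtime through the cluster settings API; the value below is purely illustrative:

```json
PUT /_cluster/settings
{
  "transient" : {
    "indices.store.throttle.max_bytes_per_sec" : "100mb"
  }
}
```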
Cheers,
Ivan
On Wed, Jul 30, 2014 at 6:02 AM,
This is ES related, but, what Oracle JDBC version is this and what Oracle
Database Server version?
Jörg
On Wed, Jul 30, 2014 at 3:59 PM, George DRAGU george.gd.dr...@gmail.com
wrote:
Hello,
Is there any way to specify a parameter value on the Java command line
behind the JDBC river?
I
It is!
The issue you ran into is just a Java dependency issue. Clients don't need,
for example, to have Groovy. That's the reason it's marked as an optional
dependency.
Best.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 30 Jul 2014 at 17:55, Laurent T.
You did specify the type, but you sent the put mapping request to the wrong
index.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 30 Jul 2014 at 16:08, Amira BELDJILALI beldjilal...@gmail.com wrote:
ah, yes, i didn't specify the type, thank you so much for your
I have a setup with multiple servers.
The file tree for each is like the following:
/data/
  configs/
    elastic-1.yml
    logging-1.yml
  scripts/
    (empty)
  elastic-core/ (from distribution)
    bin/...
    config/...
    lib/...
    logs/...
  elastic-1/
    bin --
Oops, I mean of course, this is *not* ES related ...
Jörg
On Wed, Jul 30, 2014 at 7:59 PM, joergpra...@gmail.com
joergpra...@gmail.com wrote:
This is ES related, but, what Oracle JDBC version is this and what Oracle
Database Server version?
Jörg
On Wed, Jul 30, 2014 at 3:59 PM, George
I did more experiments. If I use a real scripts directory, instead of a
symbolic link,
then there is no error message. But does this mean that I will have to drop
the same script
into every server's config/scripts directory? It would be nice to use
symbolic links
for this.
Any suggestions?
On
I've got my rivers working and my parent/child mapping done.
I've written some has_child queries, but I'm a noob to ES. Is there any
way to join the data, i.e. aggs and bucketing the children?
If so, does anyone have an example?
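One sketch, assuming ES 1.3 (which predates the dedicated `children` aggregation): query the child type directly, constrain it with `has_parent`, and aggregate on the child fields. All index, type, and field names below are invented for illustration:

```json
POST /myindex/childtype/_search?search_type=count
{
  "query" : {
    "has_parent" : {
      "parent_type" : "parenttype",
      "query" : { "match_all" : {} }
    }
  },
  "aggs" : {
    "children_by_field" : {
      "terms" : { "field" : "somefield" }
    }
  }
}
```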
Hi all,
Would it be possible to get unique items from an elastic search database
that reference 2 fields for uniqueness, all while using only elastic search
or a plug in?
*I.E.*
*Initial Data:*
{
provider: tumblr
text: I need to get this.
}
{
provider: twitter
text: I need to get this.
}
{
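One possible sketch for two-field uniqueness: a terms aggregation over a script that concatenates both fields. This assumes dynamic scripting is enabled, and note that an analyzed `text` field would yield per-token values, so in practice a not_analyzed sub-field would be needed:

```json
POST /myindex/_search?search_type=count
{
  "aggs" : {
    "unique_pairs" : {
      "terms" : {
        "script" : "doc['provider'].value + '|' + doc['text'].value"
      }
    }
  }
}
```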
Hi all,
I'm trying to use the ip type in ES, but my IPs also have ports. That
doesn't seem to be supported, which was a bit of a surprise!
Does anyone know of a way to do this? Or does it sound like a good feature
to add support for to this type?
Thanks!
Chris
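One common workaround (a sketch; the type and field names are invented) is to split the value at index time and map the two parts separately:

```json
PUT /myindex/mytype/_mapping
{
  "mytype" : {
    "properties" : {
      "client_ip" : { "type" : "ip" },
      "client_port" : { "type" : "integer" }
    }
  }
}
```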
Hi David,
Backing up indices to a repository is a great way to conserve space in your
cluster.
Curator provides a helper script called es_repo_mgr that will aid in
creation of a repository. There is more information about snapshot
creation here: modules-snapshots.html
Can you give an example what you mean by IP ports?
Transport protocols like TCP have ports, but IP (Internet addresses) is used
to address hosts on a network.
Jörg
On Wed, Jul 30, 2014 at 11:02 PM, Chris Neal chris.n...@derbysoft.net
wrote:
Hi all,
I'm trying to use the ip type in ES, but
Sorry this never got responded to. Unless your indices are hourly, and in
a format that curator recognizes, it will not delete anything.
What are your index names, or your naming schema?
--Aaron
On Thursday, May 15, 2014 8:49:00 AM UTC-5, Guillaume boufflers wrote:
Hello buds !
I've
I have a cluster with six nodes. The nodes are in different data centers,
but I don't think that matters, as the connectivity is beefy and thick. I
have turned multicast off and unicast on. Each node knows about all the
others explicitly. When I bring up a visualization of the cluster using the
I am going to use ES to implement gmail-like app, distributed over multiple
cluster nodes. Questions are:
1) Messages will be compressed, so I need to store binaries in index, not
plain text. Is it possible to store binary data with minimal overhead,
without base64 encoding ?
2) I need to
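On point 1, a `binary` field is the closest fit (the mapping below is a sketch with invented names); note that over the JSON REST API the value itself still has to travel base64-encoded, since JSON cannot carry raw bytes:

```json
PUT /messages/message/_mapping
{
  "message" : {
    "properties" : {
      "body" : { "type" : "binary" }
    }
  }
}
```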
About the HTTP API: I wonder, if I want to remotely access a cluster on an
SSH server, what should I include in my HTTP REST command?
Example for a mapping:
curl -XGET 'http://localhost:9200/index/_mapping/type'
I tried something like the below but it failed:
curl -XGET -u user_name:
You need to use SSH directly for it, curl won't work.
ssh user@host -i ~/.ssh/id_rsa
Assuming you have a public key on the server.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 31 July 2014 08:47, Chia-Eng
The standard response to this is that ES is not built for multi-DC
clustering, but as long as you are aware of that then it's fine.
Have you looked at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html
?
Regards,
Mark Walkom
Infrastructure
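From the page linked above, allocation awareness is the usual lever here. A sketch, where the attribute name `datacenter` is an assumption and must match a `node.datacenter` value set on every node:

```json
PUT /_cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.awareness.attributes" : "datacenter"
  }
}
```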
I've seen this as well Ivan, and have also had a few people on IRC comment
on the same thing - shards that are local are not simply being initialised,
but being reallocated elsewhere.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web:
I ran a query:
curl -XGET
"$url/ease/RadiologyResult/90642/_mlt?routing=07009409&mlt_fields=Observation.Value&min_term_freq=1&min_doc_freq=1&pretty"
It worked and returned several documents. But if I ran this:
curl -XGET "$url/ease/RadiologyResult/_search?routing=07009409&pretty" -d '
{
query
You may want to look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search.html
If you are just learning ES, then check out
http://exploringelasticsearch.com/
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web:
Help! Elasticsearch was working fine, but now it's using up all its heap
space in a matter of minutes. I uninstalled the river and am
performing no queries. How do I diagnose the problem? 2-3 minutes after
starting, it runs out of heap space, and I'm not sure how to find out why.
Here
What java version? How much heap have you allocated and how much RAM on the
server?
Basically you have too much data for the heap size, so increasing it will
help.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 31
JDK 1.7.0_51
It has 512MB of heap, which was enough -- I've been running it like that
for the past few months, and I only have two indexes and around 300-400
documents. This is a development instance I'm running on my local machine.
This only happened when I started it today.
-tom
On
Up that to 1GB and see if it starts.
512MB is pretty tiny; you're better off starting at 1-2GB if you can.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 31 July 2014 10:28, Tom Wilson twilson...@gmail.com wrote:
Thank you for the links. Yeah, I am new to ES (and HTTP REST).
What I understand is that if I want to get the index documents on my SSH
server, I can log in to the server via SSH,
and then do an HTTP GET against localhost:9200.
Could you explain more about using SSH directly for it?
I think what I want to
Upping to 1GB, memory usage seems to level off at 750MB, but there's a
problem in there somewhere. I'm getting a failure message, and the marvel
dashboard isn't able to fetch.
C:\elasticsearch-1.1.1\bin>elasticsearch
Picked up _JAVA_OPTIONS: -Djava.net.preferIPv4Stack=true
[2014-07-30
Unless you are attached to the stats you have in the marvel index for today
it might be easier to delete them than try to recover the unavailable
shards.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 31 July 2014
You can also curl from your local machine to the server, without having to
SSH to it - curl -XGET http://IPADDRESS:9200/
You don't need to provide SSH credentials for that transport client example.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email:
Thanks for the information
2014-07-30 14:55 GMT+08:00 joergpra...@gmail.com joergpra...@gmail.com:
There is no limit in ES.
Each type uses a certain amount of heap for caching ids and the mapping.
You can create types / mappings until heap explodes. Each modification of a
mapping is
Hello community,
I'm having a problem understanding how the analyzer should work. The result
is different from what I expect. :(
I have created a custom analyzer to index phone numbers as below:
"analysis" : {
  "analyzer" : {
    "phone" : {
It's probably easier to use a char filter to remove all non-digits. On the
other hand, if you want to normalize numbers that sometimes contain area and
country codes, you'll probably want to do that outside of
elasticsearch or with a plugin. That gets difficult when you need to handle
non
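A sketch of that char-filter approach (the index, analyzer, and filter names are illustrative): a `pattern_replace` char filter strips every non-digit before a `keyword` tokenizer:

```json
PUT /myindex
{
  "settings" : {
    "analysis" : {
      "char_filter" : {
        "digits_only" : {
          "type" : "pattern_replace",
          "pattern" : "\\D+",
          "replacement" : ""
        }
      },
      "analyzer" : {
        "phone" : {
          "type" : "custom",
          "char_filter" : ["digits_only"],
          "tokenizer" : "keyword"
        }
      }
    }
  }
}
```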
First, put some sample data:
curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
{
  "title" : "jumping tom",
  "val" : 101
}'
curl -XPUT 'localhost:9200/testindex/action2/1?pretty' -d '
{
  "title" : "jumping jerry",
  "val" : "test"
}'
As you can see, and the mapping is:
{
  "action1" : {
Hello Peter ,
You have set these variables for the API and not the query; that is why it's
working: min_term_freq=1, min_doc_freq=1
Thanks
Vineeth
On Thu, Jul 31, 2014 at 5:02 AM, Peter Li jenli.pe...@gmail.com wrote:
I ran a query:
curl -XGET
Hi Nikolas
Thank you very much for your feedback. I was hoping to be able to search
against the phone number field in normalized, original, and number-parts
formats.
If I modify the input into normalized format, then searching using the
original/number-parts forms will not return the desired results...
Or am I
As far as I understand, a Java client instance is stateless, and its methods
are pure functions (I mean operating methods rather than those related to
initial configuration just after instantiation). As a result, it is
sufficient to have only one client per cluster per JVM. Is that
true?
1 - Looks ok, but why two replicas? You're chewing up disk for what reason?
Extra comments below.
2 - It's personal preference really and depends on how your end points send
to redis.
3 - 4GB for redis will cache quite a lot of data if you're only doing 50
events p/s (ie hours or even days based