I see there are no answers yet. Any help appreciated.
But just to repeat the question:
The MVEL script fails to match documents.
I have a list in Elasticsearch containing longs.
I submit one element in an MVEL script like this:
if (document.contains(element)) { ... }
But it never matches.
If I
I have an MVEL script (the Groovy version looks the same) as follows:
if (!ctx._source.list.contains(document)) { ctx._source.list += document; } else { ctx.op = "none" };
I have a Java map:
map.put(id, 100L);
When I try to match the above document to an existing ES document it will
fail because it's a
Hi
Client JavaAPI
Version 1.2.1
Requests: bulk
(no parent child mapping)
It does use a nested structure.
Using scan and scroll to populate thousands (maybe millions) of delete
requests. It works like this:
Scan Search (for all documents with field f = 'the value')
Scroll in batches of say 1000:
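The per-batch flow can be sketched in Python (a sketch, not the poster's actual code: the index, type, and field names are placeholders, and the commented wiring assumes the elasticsearch-py helpers):

```python
def delete_actions(hits, index, doc_type):
    """Turn scan/scroll hits into bulk delete actions."""
    for hit in hits:
        yield {
            "_op_type": "delete",
            "_index": index,
            "_type": doc_type,
            "_id": hit["_id"],
        }

# With elasticsearch-py this generator would feed helpers.bulk, e.g.:
#   hits = helpers.scan(es, index="myindex",
#                       query={"query": {"term": {"f": "the value"}}})
#   helpers.bulk(es, delete_actions(hits, "myindex", "mytype"), chunk_size=1000)

# Local demonstration with fake scan hits:
fake_hits = [{"_id": str(i)} for i in range(3)]
actions = list(delete_actions(fake_hits, "myindex", "mytype"))
```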
Jorg,
Thanks for the quick turnaround on putting in the fix.
What I found when I tested is that it works for test -> testcopy,
but when I try with myindex -> myindexcopy it doesn't work.
I noticed in the logs, when I was trying myindex, that it was looking for
an index named test, which was a bit odd.
So I
Hey Jorg,
Correct. Whew!
If I run just
curl -XPOST 'localhost:9200/_push?map={myindex:myindexcopy}'
it works fine.
By the way: is there any way to make this work in Sense, e.g.
POST /_push?map={myindex:myindexcopy}
or
POST /_push
{
  "map": {
    "myindex": "myindexcopy"
  }
}
The second one will
Jorg,
Not sure what you mean. There is a flag, createIndex=false, which means:
if the index already exists, do not try to create it, i.e. it is pre-created.
Import will handle this. Will _push also ?
I have another question which affects me:
I was hoping that _push would write to the index
So just to explain what I want:
- I want to be able to push an existing index to another index which
has new mappings
Is this possible?
Preferably it wouldn't go through an intermediate file-system file: that
would be expensive, and there might not be enough disk available.
Thanks.
On
Okay, when I try that I get this error.
It's always at byte 48.
Thanks in advance
Caused by: java.lang.IndexOutOfBoundsException: Readable byte limit
exceeded: 48
at
org.elasticsearch.common.netty.buffer.AbstractChannelBuffer.readByte(AbstractChannelBuffer.java:236)
at
By the way:
ES version 1.3.4
Knapsack version built with 1.3.4
Regards.
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email
to
OK, I can try that.
But is there an option in _push to have a pre-created index?
I know it's possible with import createIndex=false.
Would export/import be just as good?
Jorg,
That is exactly the kind of thing I'm looking for.
I'm having a little bit of difficulty getting it to do what I want.
I want to push an index to another index and change the mapping.
I can import / export okay but the push is having difficulty picking up the
new mappings.
The syntax
Is the way to update the mapping of a large index as follows:
Create an empty index with the new mapping
Copy the old data into the new index
Alias the new index to the previous name
If so, what are the recommended tools?
Ideally there would be a user interface for IT people to use?
Thanks
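For the record, the three steps above look roughly like this in Sense (a sketch: the names myindex, myindex_v2, myalias and the mapping body are hypothetical, and the copy step itself needs an external scan/scroll tool since ES 1.x has no built-in reindex):

```
PUT /myindex_v2
{
  "mappings": {
    "mytype": {
      "properties": {
        "f": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}

(then copy the old data across with a scan/scroll tool such as knapsack, stream2es, etc.)

POST /_aliases
{
  "actions": [
    { "add":    { "index": "myindex_v2", "alias": "myalias" } },
    { "remove": { "index": "myindex",    "alias": "myalias" } }
  ]
}
```

Searches that go through the alias then switch to the new index atomically.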
Hi
I can see there are lots of utilities to copy the contents of an index, such as
elasticdump
reindexer
stream2es
etc.
And they mostly use scan/scroll.
Is there a single curl command to copy an index to a new index?
Without too much investigation, it looks like scan/scroll requires repeated
I should have mentioned:
the point is to copy the data only,
and then to change the mappings.
A snapshot is no use, sorry, because that brings the mappings along.
Okay, so here's the answer.
Maybe my question wasn't clear;
I find the documentation isn't clear.
Anyway to create a node in one thread :
Node node = nodeBuilder().local(true).node();
Client client = node.client();
But then to connect to it in a separate thread from code that knows nothing
I'm trying to connect to the junit embedded node in my code under test.
Normally the code under test makes a getNodeClient to 9300 to join the
cluster.
But I guess that's not an option with the embedded local(true) node.
So what is the answer?
One option is to write the original code (under
Hi
I can append to a list as follows:
PUT twitter/tweet/1
{
  "list": [
    {
      "tweet_id": 1,
      "a": "b"
    },
    {
      "tweet_id": 123,
      "a": "f"
    }
  ]
}
POST twitter/tweet/1/_update
{
  "script":
if (!ctx._source.list.contains(newfileinfo)) { ctx._source.list += newfileinfo },
Any takers?
David Pilato?
On Friday, August 29, 2014 10:33:23 AM UTC+1, eune...@gmail.com wrote:
Hi
Say I have a list of elements like this:
PUT twitter/twit/1
{
  "list": [
    {
      "a": "b",
      "c": "d",
      "e": "f"
    },
    {
Hey
I figured it out, but it leads me on to another question.
So the way to execute a script over a list of elements is as follows:
PUT twitter/twit/1
{
  "list": [
    {
      "tweet_id": 1,
      "a": "b"
    },
    {
      "tweet_id": 123,
      "a": "f"
    }
  ]
}
POST /twitter/twit/1/_update
{
Thanks David
On Sunday, August 31, 2014 7:48:38 PM UTC+1, David Pilato wrote:
You can't do it with elasticsearch but you could try this plugin:
https://github.com/yakaz/elasticsearch-action-updatebyquery
HTH
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Le 31
Hi
Say I have a list of elements like this:
PUT twitter/twit/1
{
  "list": [
    {
      "a": "b",
      "c": "d",
      "e": "f"
    },
    {
      "1": 2,
      "3": 4
    }
  ]
}
And I want to change the value of e (currently f) to
Is the clusterName set correctly?
What you're describing is the upsert functionality in MVEL scripting.
The upsert will create and populate the document when the key doesn't exist,
and the update API will add to the document if it does already exist.
I see what you mean about MVEL.
I guess the same functionality is available in the replacement, Groovy.
I will need to investigate switching to Groovy.
I think you're on the right track. If you just run it again you should get the
update.
Does the document appear correct?
Note the new option in 1.4, scripted_upsert: true, which
allows the document to be sent only once, for efficiency.
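If it helps, the 1.4 option looks roughly like this (a sketch: the script body and params are my assumption, adapted from the list-append script earlier in this thread):

```
POST twitter/tweet/1/_update
{
  "scripted_upsert": true,
  "script": "if (!ctx._source.list.contains(newdoc)) { ctx._source.list += newdoc } else { ctx.op = \"none\" }",
  "params": {
    "newdoc": { "tweet_id": 1, "a": "b" }
  },
  "upsert": {}
}
```

With scripted_upsert: true the script also runs when the document doesn't exist yet, so the same request handles both the insert and the update case.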
Hi
I don't see the answer from the second post ?
I have the same problem :
PUT tweet
PUT tweet/tweet/_mapping
{
  "tweet": {
    "_timestamp": {
      "enabled": true,
      "path": "post_date",
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss"
    }
  }
}
PUT
Thanks David
So I tried to define the post_date field of type date
and then put the document with the post_date, hoping it would store the
timestamp.
PUT tweet/tweet/_mapping
{
  "tweet": {
    "_timestamp": {
      "enabled": true,
      "path": "post_date"
    },
Hey, it works: I just defined type date, with no special formatting.
I couldn't get the _timestamp to work,
but when I just used post_date of type date and inserted a valid date, it worked.
Thanks!
On Friday, August 8, 2014 1:16:14 AM UTC+1, eune...@gmail.com wrote:
Thanks David
So I tried to define the
Hi,
This is something I have discovered is happening... if I have a long type,
I can still store numbers such as 2, 4 and 1.2.
Then when I retrieve them, the type is not long but string and float :(
In other words, in Java when I examine my values I can have 2 (type long),
4 (type string), 1.2 (type
Okay, so when I don't have a mapping and I do a GET _mapping it returns this:
{
  "test1": {
    "mappings": {
      "type2": {
        "properties": {
          "my_int": {
            "type": "long"
          }
        }
Is that not the same as having a mapping?
Then to check the theory I
The problem happens with the Java and Python APIs when the type checking throws
exceptions.
For example, when we use Java to read the values mentioned:
test2/lll/1 my_int = 1 class java.lang.String
test2/lll/2 my_int = 2 class java.lang.Integer
test2/lll/3 my_int = 1.2 class java.lang.Double
*So you
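A plausible explanation (my assumption, not confirmed in the thread): _source is stored verbatim, so a GET returns exactly the JSON types that were indexed, while the long mapping only affects the indexed field, not the returned source. A local sketch of that round trip:

```python
import json

# Elasticsearch stores _source verbatim, so a GET returns exactly the JSON
# types that were indexed, independent of the mapped field type (long).
stored_sources = ['{"my_int": "1"}', '{"my_int": 2}', '{"my_int": 1.2}']
retrieved_types = [type(json.loads(s)["my_int"]).__name__ for s in stored_sources]
print(retrieved_types)  # ['str', 'int', 'float']
```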
Hi,
The Marvel overview shows 20 indices by default (in the third panel).
I guess there is some way to configure this 20, to say 40?
But how do I do it?
Your help appreciated.
Regards.
Actually it is possible!
It's an MVEL thing.
In MVEL it's possible to do:
_source.doc.list.remove(elem)
The options in the Elasticsearch documentation do not go very deep.
Maybe there should be another discussion group for MVEL scripting questions?
Hi
I want to delete just one element from a list
Is that possible inside a script?
From the docs it only appears possible to delete the whole document if it
contains an element.
The solution then, I guess, is to read the entire document,
delete the relevant piece in memory,
then write back the
Hello,
can someone confirm whether a script can delete an element of an array?
I assume it's not possible.
Great answer! Thanks
What I am asking is:
do different design decisions apply in Elasticsearch compared to relational databases?
Is denormalized better for Elasticsearch?
Hey,
What is the best way to design indexes in Elasticsearch?
I mean in terms of normalization vs denormalization.
So am I right in thinking that, because Elasticsearch is a document database,
we don't worry about having a denormalized model?
So let's say I'm in the fruit domain and I have a
You could index the data twice:
once analyzed, once not analyzed.
Then query the not-analyzed field,
and if there are results, display those;
otherwise query the analyzed field and display those?
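In ES 1.x the "index twice" idea is usually done with a multi-field rather than two separate fields (a sketch; the index, type, and field names are hypothetical):

```
PUT /myindex/_mapping/mytype
{
  "mytype": {
    "properties": {
      "title": {
        "type": "string",
        "fields": {
          "raw": { "type": "string", "index": "not_analyzed" }
        }
      }
    }
  }
}
```

You then query title.raw for the exact (not-analyzed) match first, and fall back to querying title for the analyzed match.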
Hi
when I use the terms panel my results are not found, because I have nested
documents.
So when I look for a.b.fieldX it doesn't find any.
In DSL what I would do is:
GET /indexK/_search
{
  "facets": {
    "size": {
      "terms": { "field": "fieldX" },
      "nested": "a.b"
    }
  }
}
How can I tell Kibana
Hey...
No responses... does that mean it's not possible?
On Monday, May 19, 2014 2:15:06 PM UTC+1, eune...@gmail.com wrote:
Say I have documents with
{
  "primary": "sport",
  "secondary": "swimming"
}
{
  "primary": "walking",
  "secondary": "reading"
}
Is there a way in ElasticSearch that I can
Hey,
Say I have the list of properties below;
what is the facet query that will give me the totals of where each field
occurs?
In the example below, I want to see f1: 4, etc.
index/type/1
{
  "my_doc": {
    "list": [
      {
        "f1": "aaa",
Say I have documents with
{
  "primary": "sport",
  "secondary": "swimming"
}
{
  "primary": "walking",
  "secondary": "reading"
}
Is there a way in Elasticsearch that I can query
where primary = secondary?
(without specifying any literal value for the primary or secondary fields?)
Thanks,
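One way that should work (hedged: this assumes both fields are single-valued and not_analyzed, so the stored terms compare exactly; the index name is a placeholder) is a script filter:

```
GET /myindex/_search
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "doc['primary'].value == doc['secondary'].value"
        }
      }
    }
  }
}
```

Note a script filter compares the indexed terms, so analyzed fields may "match" on tokens rather than the original values, and it runs per document, which can be slow on big indices.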
Hi
I am trying to format my _timestamp to be in a human readable format
DELETE /twitter
PUT /twitter
PUT /twitter/tweet/_mapping
{
  "tweet": {
    "_timestamp": {
      "enabled": true,
      "store": true,
      "format":
    }
  }
}
PUT /twitter/tweet/1
{
  "msg": "this is the tweet"
}
GET
Folks
Apologies about the multiple posts yesterday; a problem with the mobile
device I borrowed (not Android).
Anyway, what I was getting at is: let's say you have rows in a database.
Then what I want to do is:
select col1, count(col2) group by col1
Is this an option in Elasticsearch?
If I
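That SQL group-by maps onto a terms aggregation with a sub-aggregation (a sketch: col1/col2 come from the SQL above, the index name is a placeholder, and col1 is assumed to be not_analyzed so the buckets aren't split into tokens):

```
GET /myindex/_search?search_type=count
{
  "aggs": {
    "group_by_col1": {
      "terms": { "field": "col1" },
      "aggs": {
        "col2_count": {
          "value_count": { "field": "col2" }
        }
      }
    }
  }
}
```

search_type=count just suppresses the hits, so the response is only the buckets.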
Me again;)
This looks like what I need but can't get it to work...
Here's the query:
GET /articles/_search
{
  "aggs": {
    "group_by_id": {
      "terms": {
        "field": "id",
        "size": 0
      },
      "aggs": {
        "sum_count": {
          "value_count": {
Hey: I figured out a way to do what I want:
GET /articles/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "script": "doc['tags'].values.length",
      "type": "number",
      "order": "desc"
    }
  }
}
On Tuesday, April 29, 2014
Hi
If I have a list in each document and I want to get the document id plus
the size of the list eg
POST /articles/article
{ "title": "One", "tags": ["foo"] }
POST /articles/article
{ "title": "Two", "tags": ["foo", "bar"] }
POST /articles/article
{ "title": "Three", "tags": ["foo", "bar", "baz"] }
So One will have a
Thanks David, but
the list is being appended during bulk indexing;
calculating the size on each update would slow things down.
Is there no equivalent to
select id, length(tags) order by length(tags) desc;
?
Thanks.
After installing you need to restart elasticsearch.
Hey Seif,
I had this problem myself, and I think it catches a lot of people.
The problem is that the term query is not analyzed.
So you need something like query_string or a match query to make your query
analyzed.
Cheers.
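To illustrate (the index name, field name, and the assumption that the field uses the standard analyzer are all mine):

```
# term query: the query text is NOT analyzed, so "Quick" is compared
# against the lowercased indexed token "quick" and never matches
GET /myindex/_search
{ "query": { "term": { "title": "Quick" } } }

# match query: the query text IS analyzed with the field's analyzer,
# so "Quick" becomes "quick" and matches
GET /myindex/_search
{ "query": { "match": { "title": "Quick" } } }
```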
On Saturday, April 12, 2014 8:31:37 PM UTC+1, Seif Lotfy wrote:
I am
Thanks guys,
yes, I already have the refresh interval at -1.
What I'm suggesting is that to support multiple client threads, say 50,
it seems that 50 shards is a big help,
i.e. more shards equals more concurrency.
Thanks.
!! Hey filter/ moderator, please do not delete my post!!
Seif,
What is happening is that you need to be consistent between indexing and
searching in terms of analysis.
The problem happens because your search "Id: Id-9829" is case sensitive
in the "Id-9829" bit.
One fix is to search on id-9829
Hi,
I'm testing on a single node.
I find I can get better bulk indexing performance when the index has more
shards. Does that make sense?
My own theory is that when I have multiple bulk clients, increasing the
shard count lets the server achieve better concurrency (?)
So if I increase the
Guys,
I appreciate the suggestions,
but shouldn't actionGet() block?
So there should only be 20 threads (maybe another 20 for ES).
I mean, are we saying client threads are created for each bulk request?
How does it work for other applications?
I notice search has the options single thread / no thread.
Is
By the way, I can successfully run 16 Python processes with no problem,
so the server can handle concurrent bulk requests.
The problem is with my Java code, as it somehow keeps starting threads indefinitely.
Hi,
When running the bulk indexing with Python everything works fine: good,
solid throughput for the full indexing run.
When doing the same with the Java API, what happens is that thousands
of client threads are created (7000),
the server stops indexing, and then the client just
If it's any help, this is the error when the threads start to hang:
2014-03-28 13:34:39,845
[elasticsearch[Cerberus][transport_client_worker][T#16]{New I/O worker
#2832}] (Log4jESLogger.java:129) WARN
org.elasticsearch.netty.channel.socket.nio.AbstractNioSelector - Unexpected
exception in
You could be right: I can't test right now, but this is my code
(there may be 20 worker threads).
As you can see, as each thread submits work, the thread does a
client.prepareBulk()... is that sufficient to clear out the documents?
workerThread() {
Client client =
Thanks!
I still can't seem to find these settings.
Apologies in advance if I am just missing them...
indices.memory.index_buffer_size
indices.memory.min_shard_index_buffer_size
indices.memory.min_index_buffer_size
And when I run _cluster/settings, all I get is:
{
  "persistent": {},
Hi
I can index 70 million small (1 KB) records in 40 minutes.
Would that performance be good or bad?
The configuration is 6 Elasticsearch nodes, each with 16 GB of dedicated memory.
Each node is an 8-processor Intel Linux server.
There are 6 clients running locally on each node (localhost), each running
Hi
Is there an example of how to construct an elasticsearch-py ConnectionPool?
I.e., do I create a list of connections and pass that to the ConnectionPool?
:arg connections: list of tuples containing the
:class:`~elasticsearch.Connection` instance and its options
I have what I think is an obvious question. If I tweak some settings such
as:
index.translog.flush_threshold_period
or
index.merge.policy.use_compound_file
or
index.refresh_interval
or
indices.memory.index_buffer_size
or
index.cache.field.type
or
index.gateway.snapshot_interval
Is it
Hi,
I have a similar question to the OP: what is the best way to get 1M or 30M
records indexed?
I mean, I can send client.bulk batches of records, but while the request is
being indexed the client is waiting: valuable seconds.
Also, I have tried Python elasticsearch-py and there is a
Hey, thanks.
So is there a convenient way to call bulk asynchronously (helpers.bulk
or helpers.streaming_bulk), in a way that means the client isn't waiting for
the request to complete?
On Sunday, March 2, 2014 5:51:15 PM UTC, Honza Král wrote:
Hi,
the streaming_bulk function in
So excuse me, Honza... am I correct in thinking there is no point, from a
performance perspective, in calling helpers.bulk, because it will just be
sliced into chunks by streaming_bulk anyway?
It would make more sense to call helpers.streaming_bulk directly, to reduce
the client-side activity?
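The slicing being discussed can be sketched like this (a simplified sketch of the chunking idea, not elasticsearch-py's actual implementation; 500 is, if I recall correctly, the helpers' default chunk size):

```python
from itertools import islice

def chunked(actions, chunk_size=500):
    """Slice a stream of actions into chunks, roughly the way streaming_bulk
    does internally (helpers.bulk is essentially a wrapper around it)."""
    it = iter(actions)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

# 1200 actions become batches of 500, 500, 200:
sizes = [len(c) for c in chunked(range(1200), 500)]
print(sizes)  # [500, 500, 200]
```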