I've created a simple script to do the same. The GitHub URL is below:
https://github.com/ajaydivakaran/es-index-cloner
On Friday, November 21, 2014 9:23:31 PM UTC+5:30, Ori P wrote:
I'm looking for a way to clone an entire index, with all its documents,
into a new index with a different name.
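For reference, the technique a cloner like this has to use on pre-1.x versions (there is no _reindex API yet) is a scan/scroll search over the source index fed back into the bulk API on the target. A rough console sketch; the index names, type, and scroll id are placeholders:

```
# 1. Open a scan search over the source index
POST /source_index/_search?search_type=scan&scroll=5m
{ "query": { "match_all": {} }, "size": 500 }

# 2. Page through the hits with the returned _scroll_id
GET /_search/scroll?scroll=5m&scroll_id=<scroll_id>

# 3. Re-submit each page to the new index through the bulk API
POST /target_index/_bulk
{ "index" : { "_type" : "<type>", "_id" : "<id>" } }
{ ... document _source ... }
```

Repeat steps 2-3 until the scroll returns no more hits.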
Hi all,
I have around 20K JSON documents, each with around 350 fields, to be pushed
into ES (0.90.13). The mapping consists entirely of multi-valued string fields.
Currently I have a single shard and 0 replicas.
My question is whether ES (Lucene) can handle documents this large? Or does it
Hi all,
I'm using ES version 0.90.7.
My requirement is to fetch all the fields of the JSON documents that I
search/filter for. Is there a performance difference between specifying all the
field names in the search request and not specifying the 'fields' parameter?
By not specifying the fields
Then you have an error in your program.
I tried it and it works. There is no special character handling, apart from the
lowercase filter.
See:
https://gist.github.com/jprante/4e5cd1ac2220ef8c39ca
Jörg
On Sun, Nov 23, 2014 at 4:27 AM, prachi...@gmail.com wrote:
I am using Java Transport client. Where do we
This is not large. I have 5500 fields in ~100m docs, most fields are not
analyzed / not indexed. Sure, this works perfectly.
Jörg
On Sun, Nov 23, 2014 at 10:19 AM, Ajay Divakaran ajay.divakara...@gmail.com
wrote:
Each field named in a search request adds to the cost of the search, because
every such field must be searched, weighted, scored, and ranked.
I recommend using the _all field, or combining fields into fewer fields with
copy_to, and using as few fields as possible for search.
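A sketch of the copy_to variant (mapping syntax as in ES 1.x; the index, type, and field names here are made up for illustration):

```
PUT /myindex
{
  "mappings": {
    "doc": {
      "properties": {
        "first_name": { "type": "string", "copy_to": "full_name" },
        "last_name":  { "type": "string", "copy_to": "full_name" },
        "full_name":  { "type": "string" }
      }
    }
  }
}
```

Queries then name only full_name instead of every source field.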
Jörg
On Sun, Nov 23, 2014 at 10:56
Jorg
Thanks for your reply.
Just to clarify: the 'fields' I was referring to were for retrieval, not fields
to search against.
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send
I have a web service, where every time a button is pressed from UI, it
connects to elastic search and fires a query. This is the code which is
executed every time. The issue is that after a while, intermittently, the
UI hangs.
private static final String CONFIG_CLUSTER_NAME = "cluster.name";
You do not instantiate the TransportClient as a singleton. It is not entirely
clear from the code snippet, but it appears that you recreate the
TransportClient on every query, which is not correct.
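A minimal sketch of the fix: build the client once per JVM, e.g. with the initialization-on-demand holder idiom. SearchClient below is only a stand-in for the real TransportClient (which needs a reachable cluster to construct); in real code you would build the TransportClient once inside the holder and close() it on shutdown.

```java
// Sketch: hold exactly one search client per JVM.
// "SearchClient" stands in for org.elasticsearch.client.transport.TransportClient;
// the holder idiom itself is the point here.
final class ClientHolder {

    static final class SearchClient {
        final String clusterName;
        SearchClient(String clusterName) {
            // In real code: build the TransportClient with its settings and
            // addTransportAddress(...) exactly once, right here.
            this.clusterName = clusterName;
        }
    }

    private ClientHolder() {}

    // The JVM initializes Holder lazily and exactly once, so get()
    // is thread-safe without explicit synchronization.
    private static final class Holder {
        static final SearchClient INSTANCE = new SearchClient("mycluster");
    }

    static SearchClient get() {
        return Holder.INSTANCE;
    }
}
```

Every request handler then calls ClientHolder.get() instead of constructing a new client.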
Jörg
On Sun, Nov 23, 2014 at 4:36 PM, prachi...@gmail.com wrote:
If you have a single-node cluster, the replica shards are unassigned.
Jörg
On Sat, Nov 22, 2014 at 8:38 PM, Jingzhao Ou jingzhao...@gmail.com wrote:
Hi,
I checked the index stats by visiting
http://localhost:9200/raw/_stats?pretty
{
  "_shards" : {
    "total" : 10,
    "successful" : 5,
    "failed" : 0
Jorg,
Thanks for your reply.
Does memory spike when you retrieve these documents in bulk as the result of a
query/filter?
On Sunday, November 23, 2014 7:43:49 PM UTC+5:30, Jörg Prante wrote:
Hi, Jörg,
doc_values is an awesome feature even for my normal use! I have lots of
numeric data that does not need a string analyzer; I can move that data to
be managed by the OS file system.
For my memcached case, I have _source: { enabled: false } set for the whole
index and enable doc_values
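For reference, per-field doc_values in a 1.x mapping look roughly like this (index, type, and field names assumed; early 1.x releases spell it "fielddata": { "format": "doc_values" } instead of the boolean flag):

```
PUT /myindex/_mapping/num_type
{
  "num_type": {
    "properties": {
      "price": { "type": "double", "doc_values": true }
    }
  }
}
```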
Hi, Jörg,
Yes, I got that now. On my development PC, I changed things in
elasticsearch.yml to
index.number_of_shards: 1
index.number_of_replicas: 0
The stats output becomes:
"_shards" : {
  "total" : 1,
  "successful" : 1,
  "failed" : 0
},
Thanks a lot for your help!
Jingzhao
On
The 5 that were not assigned: is that because you set a limit on the max
shards per node?
On Sunday, November 23, 2014 11:24:34 PM UTC+5:30, Jingzhao Ou wrote:
Hi, Jörg,
Yes, I got that now. On my development PC, I changed things in
elasticsearch.yml to
index.number_of_shards: 1
Hi, Ajay,
I don't see any limit on the max shards per node. I did a Google search and
cannot find such a parameter; I can only see memory limits. Mind sharing
more details?
Thanks a lot for your
See the section on Total shards per node.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html
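The setting that page describes is per-index; as a sketch (index name assumed):

```
PUT /myindex/_settings
{ "index.routing.allocation.total_shards_per_node": 5 }
```

A cap like this set lower than primaries plus replicas would also leave shards unassigned, which is why it is worth ruling out.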
On Monday, November 24, 2014 12:15:54 AM UTC+5:30, Jingzhao Ou wrote:
Advice to increase indices.recovery.concurrent_streams sounds suspiciously
specific to me :-) What made you so confident that it is the bottleneck for
recovery in most cases? And how should
cluster.routing.allocation.node_concurrent_recoveries
be set?
On Sunday, November 23, 2014 6:27:40 AM
Hi!
I have a document like this:
{
  "type": "film",
  "countries": ["US", "ES"]
}
And I insert it into Elasticsearch; then I run the following search:
GET _search?search_type=dfs_query_and_fetch
{
  "query": {
    "filtered": {
      "query": {
        "term": { "type": "film" }
      },
      "filter": {
        "query": {
No.
Use a filtered query to avoid the memory spikes caused by a post filter.
Jörg
On Sun, Nov 23, 2014 at 6:22 PM, Ajay Divakaran ajay.divakara...@gmail.com
wrote:
Can you rephrase your question? I do not understand.
Jörg
On Sun, Nov 23, 2014 at 6:43 PM, Jingzhao Ou jingzhao...@gmail.com wrote:
My remaining question is: how can I verify the field data is needed on
disk?
The default indices recovery performance is limited by 3 concurrent streams
and 20MB/sec. This is very slow on my machines. YMMV.
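Those knobs live in elasticsearch.yml. The first two values below are the defaults Jörg cites; the node_concurrent_recoveries default shown is my recollection, so verify it for your version:

```yaml
indices.recovery.concurrent_streams: 3        # parallel streams per recovery
indices.recovery.max_bytes_per_sec: 20mb      # recovery throttle per node
cluster.routing.allocation.node_concurrent_recoveries: 2
```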
Jörg
On Sun, Nov 23, 2014 at 9:01 PM, Konstantin Erman kon...@gmail.com wrote:
Hi all,
I have a tests index with 43 million documents. There is a string value for
each document (about 5-10 characters per document).
The mapping is:
{
  "myindex" : {
    "mappings" : {
      "num_type" : {
        "_type" : {
          "store" : true
        },
Sorry about that.
I meant to ask: with doc_values set to true, how can I verify that the field
data is stored in the file system cache instead of the JVM heap?
Thanks,
Jingzhao
On Sunday, November 23, 2014 2:39:32 PM UTC-8, Jörg Prante wrote:
A term query will not analyze the search terms, so if your countries field
uses the default analyzer, there will be no match, since the standard
analyzer lowercases the terms. Either map your field as not_analyzed or
use another query, such as match.
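Sketches of both fixes, reusing the countries field from the example (index and type names assumed; note that the not_analyzed mapping must be applied before indexing, or via a reindex, since an existing field's analysis cannot be changed in place):

```
PUT /myindex/_mapping/film
{
  "film": {
    "properties": {
      "countries": { "type": "string", "index": "not_analyzed" }
    }
  }
}
```

or, leaving the mapping as is, query with match so the search terms go through the same analyzer as the indexed values:

```
GET /_search
{ "query": { "match": { "countries": "US" } } }
```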
--
Ivan
On Sat, Nov 22, 2014 at 4:35
Nobody?
No ideas why bulk upserts slow down over time?
Loading 9 million documents starts off at 2000+ per second and, by hour
three, is down to 300 per second. The whole job takes the better part of 8
hours, with this linear slowdown.
Nobody has an idea? I'm drawing a blank, myself!
--
You
FYI it is the weekend still for parts of the world, and we all enjoy our
time off :)
How many nodes do you have? What is your heap size? Are you monitoring your
system and ES, if so what does it tell you? Have you tried increasing the
bulk count?
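One knob worth adding to that checklist (not from this thread, just a common bulk-loading pattern): turn off refresh, and optionally replicas, for the duration of the load, then restore them:

```
PUT /myindex/_settings
{ "index": { "refresh_interval": "-1", "number_of_replicas": 0 } }

# ... run the bulk load ...

PUT /myindex/_settings
{ "index": { "refresh_interval": "1s", "number_of_replicas": 1 } }
```

This does not by itself explain a linear slowdown, but it removes two variables while you investigate.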
On 24 November 2014 at 16:48, Christopher Ambler