You should try posting this on https://discuss.elastic.co/ . This email
list has been deprecated in favor of using that. There are settings that
make it function almost the same way the mailing list functioned.
On Thu, Aug 27, 2015 at 3:39 AM, Ron Sher ron.s...@gmail.com wrote:
Hi,
I'm using
Try asking this at discuss.elastic.co.
On Mon, Aug 24, 2015 at 12:47 AM, shoebalig shoeba...@gmail.com wrote:
Hi Members,
I have a cluster of N nodes. To scale up my cluster I want to add a few
more nodes at runtime without any downtime. Is there any way to add nodes to
the cluster using Cluster
From a Lucene context, doc types are indexed together and filtered. So it is
just like having one big index. If two doc types share the same field name
then that field's IDF will be computed across both.
You should test it but it's often OK.
On Jun 1, 2015 7:06 AM, Avinash Pandey avinashpandey.i...@gmail.com
GitHub. Stack Overflow too, but their search wasn't that nice the last time I
checked.
On May 30, 2015 2:53 PM, Flavio dep...@gmail.com wrote:
Can someone point me to great live websites using ElasticSearch?
Preferably with complex search scenarios using aggregations and many
advanced features.
What you just described should work fine. exclude._ip will move the shards
off of the nodes you exclude but queries and updates can proceed while this
is happening because the data is still on the old nodes. The updates will
make their way to the new copies via a transaction log replay mechanism.
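As a rough sketch of the exclusion described above, the body for a cluster settings update (PUT /_cluster/settings) could be built like this. The IP addresses and the helper name are made up for illustration, and using the transient scope rather than persistent is an assumption:

```python
import json


def exclude_ips_body(ips, persistent=False):
    """Build a cluster-settings body that moves shards off the given IPs.

    `ips` is a list of node IP addresses (hypothetical values below).
    """
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.allocation.exclude._ip": ",".join(ips)}}


# Hypothetical addresses of the old machines being drained.
body = exclude_ips_body(["10.0.0.1", "10.0.0.2"])
print(json.dumps(body))
```

Once the excluded nodes hold no shards they can be shut down; clearing the setting afterwards keeps it from surprising anyone later.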
to that old machines and
new machines will not know each other (I mean the unicast setting in
elasticsearch.yml: the old machines will know just the old ones). Do you think
it can cause a problem?
Thanks
On Friday, May 29, 2015 at 14:45:40 UTC+2, Nikolas Everett wrote:
What you just described
Dedicated master nodes are super convenient if you have the IT
infrastructure to host them on shared machines because they are very low
load and it's useful to be able to restart the master nodes quickly. We
don't have that kind of infrastructure and our cluster is pretty large and
not having it
I get the sense that this is a good start though I haven't watched it
myself:
https://www.elastic.co/elasticon/2015/sf/elasticsearch-data-journey-life-of-a-document-in-elasticsearch
On May 25, 2015 5:45 AM, Jason Wee peich...@gmail.com wrote:
If you have that basic knowledge, perhaps the next
On Thu, May 21, 2015 at 12:49 PM, Swaraj Banerjee swa...@expectlabs.com
wrote:
When multiple nodes need to communicate to handle a query, what protocol do
they use?
If I issue a search request to an index that lives on multiple shards
(that are on separate nodes), I send the request to ES
It merges segments in response to indexing and updates, so an index that
doesn't change will not have merges. You can manually optimize the index
once, when it is mostly done with updates. Once the index is optimized,
further calls to optimize with the same parameters are no-ops.
You can't really ads
I'm RAID 0 all the way. The striping is much more complete than ES's
path.data and operations is more used to the tooling around it. Software RAID
in Linux is fine for this. We only do two disks in RAID 0 though because we
don't like the increased failure chance. So 10 in RAID 0 is a bit much. 10
I suspect at that point they'll pop out as Longs. It's just my suspicion. I
haven't read that bit of the code.
On May 11, 2015 10:08 AM, euneve...@gmail.com wrote:
I have a mvel script (groovy looks the same) as follows:
if (!ctx._source.list.contains(document)) {ctx._source.list +=
On Mon, May 4, 2015 at 12:12 PM, leslie.hawthorn leslie.hawth...@elastic.co
wrote:
Hello everyone,
We took in feedback on moving to a Discourse based forum for about a
month, and it sounds like most of the folks who thought it might not be
optimal were people who preferred to interact with
I suspect it's read-only while they sort out resourcing issues. Cache hit
rate is likely quite high while read-only.
On May 4, 2015 12:38 PM, Jürgen Wagner (DVT) juergen.wag...@devoteam.com
wrote:
The site is read-only. No signups possible. Hmm...
Good luck!
--Jürgen
--
You received this
On Wed, Apr 29, 2015 at 2:53 PM, Loren lo...@siebert.org wrote:
The docs
http://www.elastic.co/guide/en/elasticsearch/guide/current/common-terms.html
mention that One of the benefits of cutoff_frequency is that you get
domain-specific stopwords for free.
It seems like the index-per-user
Yup - still looks like a bug to me. I think the right thing to do is file
it on github.
On Wed, Apr 29, 2015 at 3:20 AM, Zaid Amir redserpe...@gmail.com wrote:
Sorry for the delay, I was a bit occupied making sure everything worked as
expected.
So here, I created a gist of the issue and hope it
On Tue, Apr 28, 2015 at 12:43 PM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
I'm deploying ElasticSearch on a cluster with different node sizes, some
have 32GB memory, and some have 16GB. I hope more shards will be allocated
on nodes with bigger memory.
I googled a bit, there're some settings
If it's not in the issues it's unlikely that it's planned. If it isn't planned
I think filing an issue is a good thing - just be super clear what you want
to do with examples in curl/gist form. If it is planned maybe add your
proposed usage to the issue.
Nik
On Tue, Apr 28, 2015 at 11:26 AM, Ian
You may want to write your question in JSON form. Like with a little arrow
saying this value is the one I want.
On Wed, Apr 22, 2015 at 9:04 AM, Kevin Reilly kmreilly...@gmail.com wrote:
Bump.
On Monday, April 20, 2015 at 2:48:51 PM UTC-4, Kevin Reilly wrote:
Hi. Are query boost values
On Wed, Apr 22, 2015 at 2:41 PM, Ed Kim edki...@gmail.com wrote:
Hi, I have a dynamic query built via java api that assembles a filtered
query depending on the parameter input. I have about a dozen filters
(mostly term filters) that may or may not be used, and had a couple
questions:
1. Is
at the individual filter level, as they will be bundled
differently depending on the params. Thanks for the clarification!
On Wed, Apr 22, 2015 at 11:53 AM, Nikolas Everett nik9...@gmail.com
wrote:
On Wed, Apr 22, 2015 at 2:41 PM, Ed Kim edki...@gmail.com wrote:
Hi, I have a dynamic query
Have you profiled it and seen that reading the source is actually the slow
part? hot_threads can lie here so I'd go with a profiler or just sigquit or
something.
I've got some reasonably big documents and generally don't see that as a
problem even under decent load.
I could see an argument for a
On Thu, Apr 16, 2015 at 10:21 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:
The time required for update depends on the peculiarities of the update
operations, the massive scripting overhead, the refresh operation, and the
segment merge activities that are related.
The number of
On Thu, Apr 16, 2015 at 10:54 AM, Mitch Kuchenberg mi...@getambassador.com
wrote:
Hey Nik, you'll have to forgive me if any of my answers don't make sense.
I've only been familiar with Elasticsearch for about a week.
1. Here's a template for my documents:
On Thu, Apr 16, 2015 at 9:40 AM, Mitch Kuchenberg mi...@getambassador.com
wrote:
I'm currently working on implementing ElasticSearch on a Django-based REST
API. I hope to be able to search through roughly 5 million documents, but
I've struggled to find an answer to a question I've had from
Yes _but_ it's generally better to do those transforms in the source
application. The idea is that you'll often want to return multiple things
from the source so loading the whole thing is usually better than loading a
bunch of stored fields.
If you're looking for the minimal possible amount of
I want to expand on this a bit - both copy_to and transform only modify the
_indexed_ document, not the source document. The thinking is that you can
modify the source document yourself in the source application but the
source application _can't_ modify the indexed document without modifying
the
Using inline highlighters doesn't help highlighting. No. For the most
part you should stay away from inline analyzers and use a mapping instead.
On Tue, Mar 31, 2015 at 12:02 PM, Viacheslav Shalamov sslavian...@gmail.com
wrote:
Hi all, could you help me with little problem regarding
I believe elasticsearch loads the whole indexed document into ram before
indexing. It certainly loads the whole document in ram for things like
source filtering. Lucene doesn't require this, but elasticsearch does it
because for the typical use case it's fine.
On Mar 27, 2015 2:59 PM, Hao
My documents range from a couple of kilobytes to tens of megabytes and most
things work fine. Beware the plain highlighter on long string fields but
otherwise you are probably OK. It's certainly less efficient to store huge
documents because when you want to return portions of them (other than
query_string is a bit of a trap - if you write an invalid query it just
crashes. So you find yourself working around it with tons of escaping.
It's also really, really powerful and shouldn't be exposed directly to end
users unless you want them to be sneaky.
For the most part I'd suggest using the
Try escaping the hash tag. It has a special meaning in the Lucene Dialect
of Regular Expression
https://lucene.apache.org/core/4_1_0/core/org/apache/lucene/util/automaton/RegExp.html?is-external=true
.
On Thu, Mar 19, 2015 at 11:44 AM, Mahesh Kommareddi
mahesh.kommare...@gmail.com wrote:
Hi,
On Tue, Mar 17, 2015 at 8:56 AM, Vlad Zaitsev vest...@gmail.com wrote:
But it seems that the highlighter ignores operator: “and” and highlights any
term from the query.
It's much more than that. For the most part highlighters reduce the query
to a list of terms blindly. Some do phrases. They don't
I imagine the right way to do this is with a plugin but I'm not 100% sure.
On Tue, Mar 17, 2015 at 11:47 AM, Devaraja Swami devarajasw...@gmail.com
wrote:
I plan to store floats in the payload and boost the score
(multiplicatively) based on the average value of the payloads over the
Have a look at what curator does. I believe it optimizes but I'm not sure
how.
On Mar 6, 2015 10:22 PM, Kadaan jbaran...@gmail.com wrote:
Is there a recommended process for optimizing indices which have
transitioned to a readonly state? For instance should we optimize indices
to a single
I just released version 1.4.1 of the experimental highlighter. It fixes a
single issue that made the highlighter not work when highlighting *:*
https://github.com/wikimedia/search-highlighter/issues/9
It might take sonatype an hour or so to sync it to central.
Nik
On Tue, Mar 3, 2015 at 1:02 PM, Sagar Shah sagarshah1...@gmail.com wrote:
Hello everyone,
I am working on defining a mapping in Elasticsearch, which can have a few
fields on the fly. I can define the types index using dynamic templates,
but I would like to know the difference between
I have 30GB shards and the biggest problem I have is that they take a long
time to replicate to other machines. I believe there are memory issues for
very large shards as well but I don't know them that well.
Nik
On Feb 20, 2015 7:31 PM, Prasanth R prasanth.sunr...@gmail.com wrote:
Could you
You might want to try hitting hot threads while putting your load on it and
seeing what you see. Or posting it.
Nik
On Thu, Feb 12, 2015 at 4:44 PM, Jay Danielian jay.daniel...@circleback.com
wrote:
Mark,
Thanks for the initial reply. Yes, your assumption about these things
being very
There are known big installations using OpenJDK and Oracle JDK. I don't
know any using IBM. I imagine you're more likely to find something on that
JDK than others but you'll probably do OK. Certainly be sure to add the
config parameter mentioned on that page and expect to have to fiddle with
the
You are likely observing how the Java heap works. Use a tool like jstat to
check how much of the heap is in use to see real usage. Nutshell: Java never
returns memory to the OS. You tell it a min it can use and it allocates
that on startup. You tell it a max and it won't allocate more.
Memory mapping
The current default scripting language, groovy, is sandboxed. If you still
don't want to use it your only option is the get update put sequence.
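A minimal sketch of the get, update, put sequence, assuming the document has already been fetched. The `fetched` dict, the `tags` field and the helper name are made up for illustration; only the middle step runs here, and the comments mark where the real GET and index calls would go:

```python
def add_to_list(source, field, new_values):
    """Return a copy of `source` with `new_values` appended to `field`,
    skipping values that are already present."""
    updated = dict(source)
    current = list(updated.get(field, []))
    for value in new_values:
        if value not in current:
            current.append(value)
    updated[field] = current
    return updated


# Step 1 (get): pretend this came back from a GET on the document.
fetched = {"_version": 3, "_source": {"tags": ["a", "b"]}}

# Step 2 (update): change the source in the application.
new_source = add_to_list(fetched["_source"], "tags", ["b", "c"])

# Step 3 (put): index new_source back under the same ID. Passing the fetched
# version along with the write lets a concurrent change be rejected rather
# than silently overwritten.
print(new_source)
```

The price of doing this in the client is an extra round trip and having to retry when the version check fails.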
On Jan 19, 2015 1:29 PM, Jason Lee pump.min...@gmail.com wrote:
I'm trying to add new values to an existing array field in a document.
I've noticed
Highlighting is complex and more hacky than you'd imagine at first glance.
Each highlighter is different and we can't tell which one you are using
without seeing your mapping. For the plain highlighter the cost is roughly
proportional to the length of the highlighted field. So in your case it's
the
Yes. You can change the Dir scanned for plugins. Look at the init script
for the name of the parameter.
Or symlinks. Always your friend.
On Jan 16, 2015 7:11 PM, Jinyuan Zhou zhou.jiny...@gmail.com wrote:
Thanks,
What about explain?
On Wed, Jan 14, 2015 at 3:24 PM, Ed Kim edki...@gmail.com wrote:
Just a friendly bump to see if anyone has any feedback. :)
On Saturday, January 10, 2015 at 10:38:34 PM UTC-8, Ed Kim wrote:
Hello all, I was wondering if anyone could offer some feedback on whether
there
On Tue, Jan 13, 2015 at 7:32 AM, Daniel Jansson daniel.jans...@dn.se
wrote:
Hi
We are performing a rolling upgrade from 1.3.2 to 1.4.2.
We have turned off reallocation.
After upgrading 2 of 3 nodes we are receiving lots of warnings/errors in
the log file:
in node running 1.4.2:
Most clients will take a list and retry on connection failure. That is
what you want.
Nik
On Tue, Jan 13, 2015 at 9:52 AM, Vasu Thota vasu@gmail.com wrote:
Thanks David.
Now, which HTTP URL of elastic-search i need to configure from my client
application which is communicating with ES
Here are the javascript dependencies:
https://github.com/elasticsearch/kibana/blob/master/bower.json
I assume it's one of those.
On Mon, Jan 12, 2015 at 11:20 AM, Mauro Julián Fernández
mauroj.fernan...@gmail.com wrote:
I used Kibana for a couple of tasks at work and I like the charts it
On Thu, Jan 8, 2015 at 9:09 PM, Jeff Steinmetz jeffrey.steinm...@gmail.com
wrote:
Is there a better way to do this?
Please see this gist (or even better yet, run the script locally see the
issue).
https://gist.github.com/jeffsteinmetz/2ea8329c667386c80fae
You must have scripting enabled in
, Nikolas Everett wrote:
On Thu, Jan 8, 2015 at 9:09 PM, Jeff Steinmetz jeffrey@gmail.com
wrote:
Is there a better way to do this?
Please see this gist (or even better yet, run the script locally see the
issue).
https://gist.github.com/jeffsteinmetz/2ea8329c667386c80fae
You must have
}
}
}'
On Thursday, January 8, 2015 at 9:15:28 PM UTC-8, Nikolas Everett wrote:
Source is going to be pretty slow, yeah. If it's a one-off then it's
probably fine but if you do it a lot it's probably best to index the count.
On Jan 9, 2015 12:04 AM, Jeff Steinmetz jeffrey@gmail.com wrote:
Thank you
There are two ways to perform regex matching with Elasticsearch and both
require multi-fields
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/mapping-multi-field-type.html
.
The first way is to create a not_analyzed subfield like on the link above
and query it like
That is a ton of data to keep open. Can you squish it somehow?
On Tue, Jan 6, 2015 at 3:24 PM, Mark Walkom markwal...@gmail.com wrote:
The best way is to add more nodes.
There isn't much you can do with that amount of data!
On 7 January 2015 at 06:09, David Mavashev crypti...@gmail.com
The max length restriction is per token so it's unlikely you'll see it
unless you use not_analyzed fields. You can work around it by setting the
ignore_above option on the string type. That'll just throw away the token.
Nik
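For illustration, a mapping fragment with ignore_above set might look like the following. The field name `raw_value` and the limit of 256 are arbitrary assumptions, and the syntax is the string type of the 1.x era this thread dates from:

```python
import json


def not_analyzed_string(ignore_above):
    """Mapping for a not_analyzed string field that drops over-long values
    from the index instead of failing on them."""
    return {"type": "string", "index": "not_analyzed", "ignore_above": ignore_above}


# Hypothetical field name and limit, chosen only for the example.
mapping = {"properties": {"raw_value": not_analyzed_string(256)}}
print(json.dumps(mapping, indent=2))
```

The over-long value still lives in the source document; it just isn't searchable through that field.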
How does this MAX_LENGTH restriction impact on a custom_all field where we
may
Logging.yml is a funky wrapper around log4j.properties style log4j
configuration so that is why you don't see as much documentation on it.
Do you see log lines smashed together and cut apart randomly? That'd be a
bug.
It's customary for logs to be single lines except for stack traces which
://manning.com/synhershko/
On Wed, Dec 31, 2014 at 5:38 PM, Nikolas Everett nik9...@gmail.com
wrote:
Highlighting isn't a nice pretty thing - it's kind of hacky. There are
three highlighters built in
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request
Simplest way might be to push an update to the old versions of the
documents to mark them as old and do aggregations filtering those out.
There isn't a great way to deduplicate, really.
On Thu, Jan 1, 2015 at 11:50 PM, Kshitij Gupta kshi...@vnera.com wrote:
Hi,
I am working on a system where
On Wed, Dec 31, 2014 at 8:37 AM, N Bijalwan ahcir...@gmail.com wrote:
I am trying to update a document to capture page visit or hitcount which
has id containing http:// say
http://shashankp254.wordpress.com/about/feed/
That is probably a bad idea. Partial updates don't exist at the level of
Highlighting isn't a nice pretty thing - it's kind of hacky. There are
three highlighters built in
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html
to Elasticsearch and they all work differently. You should try all of them
and see if they do
, Nikolas Everett wrote:
On Wed, Dec 31, 2014 at 8:37 AM, N Bijalwan ahci...@gmail.com wrote:
I am trying to update a document to capture page visit or hitcount which
has id containing http:// say
http://shashankp254.wordpress.com/about/feed/
That is probably a bad idea. Partial updates don't
Use the analyze API to get a view into how your analysis chain (tokenizer
and filters) affect text.
The index itself is all jumbled together with all the documents and there
isn't a good way to dig the data for a single document out of it.
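A small sketch of building an analyze request URL. The query parameters (`text`, `analyzer`, `field`) follow the 1.x-era analyze API as I remember it, so treat them as an assumption and check the docs for your version; the host and sample text are made up:

```python
from urllib.parse import urlencode


def analyze_url(base, text, analyzer=None, index=None, field=None):
    """Build a URL for the analyze API, optionally scoped to an index so
    that a field's configured analyzer can be used."""
    params = {"text": text}
    if analyzer:
        params["analyzer"] = analyzer
    if field:
        params["field"] = field
    path = f"/{index}/_analyze" if index else "/_analyze"
    return f"{base}{path}?{urlencode(params)}"


# Hypothetical host; the response lists the tokens the analyzer produces.
url = analyze_url("http://localhost:9200", "Quick Brown", analyzer="standard")
print(url)
```

Running the same text through the analyzer used at index time and the one used at search time is usually the quickest way to see why a query does or doesn't match.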
On Dec 31, 2014 10:36 PM, Bruno Kamiche
Your best bet is to look at github issues and pull requests tagged for the
next release.
Elasticsearch the company has a roadmap for elasticsearch the open source
project but it isn't public.
Nik
On Dec 29, 2014 6:57 AM, PrasathRajan prasanth.sunr...@gmail.com wrote:
Hi All,
Does
No it doesn't. Highlighting is way weirder to implement than it probably
should be so concepts like score don't carry over too well. They do weigh
segments but that weight isn't the same beast as a document score. It's much
more heuristic-y. None of them support a minimum weight cutoff.
You could
IcedTea isn't a JVM version. Give us `java -version`. It looks like that
version of IcedTea could be OpenJDK 7u71 which is generally fine (we use it
under plenty of load). It could also be jamvm or cacao or zero/shark. Those
probably won't work. Lots of folks suggest oraclejdk so you may as well
Setting index.load_fixed_bitset_filters_eagerly to false fixed
everything for now.
I could argue that not running Gentoo in production is crazy, but it
really depends on your personal preferences :)
On Saturday, December 27, 2014 4:34:02 PM UTC+3, Nikolas Everett wrote:
IcedTea isn't a JVM
Transform doesn't change the source, just how it is indexed. I made it that
way because I figured if you want to change the source you can do it in
the application feeding elasticsearch. Transform is a way to index stuff
but leave it out of the source. It's copy_to on steroids.
Another reason
If you need an example CirrusSearch is the name of the plugin that uses
elasticsearch for MediaWiki. I can't attest to the code quality but it
certainly gets the job done.
Nik
On Dec 25, 2014 2:54 AM, Jason Zhang moc...@gmail.com wrote:
Here's the official Elasticsearch PHP client
I think the key part of the question here is about filters? Filters are
always up to date modulo refresh interval. It's pretty efficient because
Lucene's segments are immutable so once a filter has been applied to a
segment you can cache its results and merge it with the deletes list to
have for
On Wed, Dec 24, 2014 at 2:03 PM, Mark Walkom markwal...@gmail.com wrote:
You should drop your heap to 31GB, over that and you lose some performance
and actual heap stack due to uncompressed pointers.
I believe the magic number is 32GB:
General rule:
- Use source filtering unless you can't. Source filtering works if the
field is in the document you indexed. Fields is required if you want to
load a stored field. You only _need_ to store fields if they are synthetic
like from word count or from transform.
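To make the distinction concrete, here is a sketch of the two request shapes. The field names (`title`, `word_count`) and the match_all query are assumptions for the example, and the `fields` key is the 1.x-era name for loading stored fields:

```python
def source_filtered_search(query, include):
    """Search body that returns only the listed parts of _source."""
    return {"query": query, "_source": include}


def stored_fields_search(query, fields):
    """Search body that loads explicitly stored fields instead of _source."""
    return {"query": query, "fields": fields}


match_all = {"match_all": {}}
# Field that was part of the indexed document: source filtering is enough.
by_source = source_filtered_search(match_all, ["title"])
# Synthetic field that is not in the source: it has to be stored and loaded.
by_field = stored_fields_search(match_all, ["word_count"])
print(by_source)
print(by_field)
```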
Advanced thing I've never
Does source fall back? I remember trying and getting nothing.
On Dec 22, 2014 7:33 AM, Itamar Syn-Hershko ita...@code972.com wrote:
Fields are used to pull data from stored fields whereas source filtering
is targeting _source. At the moment both fall back on each other, so the
difference is in
I'd add a new field and check for it. Or do a search that won't find
anything unless it took effect. The document is stored untransformed so
just fetching the document won't show you anything.
On Dec 22, 2014 12:35 PM, Nick Wood nwood...@gmail.com wrote:
Hello,
I'm trying to implement a
I think you need type: custom inside analyzer: {default:{}}.
On Dec 21, 2014 5:08 PM, Ilya Kantor ilia...@gmail.com wrote:
Please let me know what I'm doing wrong or where to look/debug.
1. I git cloned https://github.com/asyncee/elasticsearch-russian-config/
2. Downloaded elasticsearch-1.4.2
Check what curator is doing with your index. It's probably fiddling with
index.routing.allocation.include and index.routing.allocation.exclude.
When you create the new index just set it to pick up the ssd tag. You'll have
to make sure that curator knows how to strip that tag when the time comes
to
On Fri, Dec 19, 2014 at 12:51 PM, Gill Singh parmvirgil...@gmail.com
wrote:
Hi, I am new here, just joined this group!
We are looking for a new Search Engine for our Intranet site. Can
ElasticSearch be used for Crawling, Indexing and Searching Intranet type
sites? We will need to crawl/index
You have to reenable allocation after the node comes back and wait for the
shards to initialize there.
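As a sketch of that sequence, these are the settings bodies a rolling restart would send before and after bouncing a node. The setting name `cluster.routing.allocation.enable` is the 1.x-era one as I recall it, and the transient scope is an assumption:

```python
def allocation_body(enabled):
    """Cluster-settings body that turns shard allocation on or off."""
    value = "all" if enabled else "none"
    return {"transient": {"cluster.routing.allocation.enable": value}}


# Order of operations for one node; the restart itself happens outside this code.
steps = [
    ("disable allocation", allocation_body(False)),
    ("restart the node and wait for it to rejoin", None),
    ("re-enable allocation", allocation_body(True)),
    ("wait for shards to initialize before touching the next node", None),
]
for description, body in steps:
    print(description, body)
```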
On Fri, Dec 19, 2014 at 3:23 PM, iskren.cher...@gmail.com wrote:
I'm maintaining a small cluster of 9 nodes, and was trying to perform
rolling restart as outlined here:
I believe so.
On Fri, Dec 19, 2014 at 3:39 PM, iskren.cher...@gmail.com wrote:
On Friday, December 19, 2014 12:31:33 PM UTC-8, Nikolas Everett wrote:
You have to reenable allocation after the node comes back and wait for
the shards to initialize there.
So this means the tutorial
I think aggregating 32 shards on one node is a bit degenerate. I imagine
it's more typical to aggregate across one or two shards per node. Don't get
me wrong, you can totally have nodes store and query ~100 shards each
without much trouble. If aggregating across a bunch of shards per node
were a
On Wed, Dec 17, 2014 at 6:03 PM, Ye D y...@volarvideo.com wrote:
cluster.routing.allocation.exclude._ip: ip1, ip2
I use this one and I'm pretty sure it's worked for me in the past.
Nik
On Dec 17, 2014 11:20 PM, Swaraj Banerjee swaraj...@gmail.com wrote:
Hi all, I have an ES cluster hosted on Amazon with ~ 7000 indexes (most
of which are sparsely populated 100 docs). Up till today, creating or
deleting an index in the cluster took ~3 seconds. All of a sudden, creating
or
Search consumes O(offset + size) memory and O((offset + size) * ln(offset +
size)) CPU. Scan scroll has higher overhead but is O(size) the
whole time. I don't know the break even point.
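A toy model of why deep paging gets expensive. This is not Elasticsearch code, just an illustration with plain integer scores: every shard has to produce its top offset + size hits, and the merge works on all of them before the page is sliced out:

```python
import heapq


def paged_search(shards, offset, size):
    """Merge per-shard results and return one page.

    `shards` is a list of score lists. Each shard contributes its best
    (offset + size) scores, so the work grows with the offset even though
    only `size` hits are returned.
    """
    window = offset + size
    candidates = []
    for shard_scores in shards:
        candidates.extend(heapq.nlargest(window, shard_scores))
    merged = heapq.nlargest(window, candidates)
    return merged[offset:offset + size]


shards = [[9, 7, 5, 3, 1], [10, 8, 6, 4, 2]]
page = paged_search(shards, offset=4, size=2)
print(page)
```

A scroll avoids re-sorting that growing window on every page, which is where the O(size) per request comes from.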
The other thing is that scroll provides a consistent snapshot. That means
it consumes resources you shouldn't let
Look at multifields. They let you send the field once and analyze it
multiple times. You also might want to use the keyword analyzer and a lowercase
filter rather than not_analyzed. Folks are used to case insensitivity.
Nik
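A sketch of what that could look like as an index body, in 1.x-era syntax. The type name `doc`, the field name `title`, the sub-field name `exact` and the analyzer name `lowercase_keyword` are all made up for the example:

```python
import json

index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Whole value as one token, lowercased: exact but case-insensitive.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "string",  # analyzed normally for full text search
                    "fields": {
                        "exact": {"type": "string", "analyzer": "lowercase_keyword"}
                    },
                }
            }
        }
    },
}
print(json.dumps(index_body, indent=2))
```

Full text queries would then target `title` and exact matches `title.exact`, with the value sent only once at index time.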
Is there a way to do exact and full text searches without having to create
We solve problems like this in two ways:
Adding queueing or concurrent request limits.
Queueing buys retries for free and can absorb temporary shocks. You can
also get things like priority, backlog monitoring, and manual backlog
grooming. I think logstash already supports this, but I don't know
Striping raid is viable for 2 or 3 disks because of the redundancy.
Software raid works fine for me. Hardware raid enables battery backed write
behind but I don't know how important that is with ssds. Either way, we go
2xSSDs per server with os in mirrored raid and data striped.
Depending on your
Best way to do it is on the client side I believe. You could probably
abuse transforms to just blow up when you see something you don't like. I
don't _think_ they have the ability to manipulate the operation (to make it
noop) though. If they do there certainly aren't any tests to make sure
that
The only thing to keep in mind is that if the node is down you should just
retry on another one. The client might handle that for you, I dunno. It's
important though because you don't want to lose 1/4 of your traffic when
you restart a node.
Nik
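If the client doesn't do it for you, the retry can be sketched roughly like this. The host names and the `fake_send` stand-in are made up so the example runs without a cluster; a real `send` would issue the HTTP request:

```python
def send_with_failover(hosts, send):
    """Try each host in turn and return the first successful response.

    `send` is any callable that takes a host and raises ConnectionError
    when the node is unreachable.
    """
    last_error = None
    for host in hosts:
        try:
            return send(host)
        except ConnectionError as err:
            last_error = err  # remember the failure and move on to the next node
    raise ConnectionError(f"all hosts failed: {hosts}") from last_error


def fake_send(host):
    # Stand-in for a real request: the first node is "restarting".
    if host == "es1:9200":
        raise ConnectionError("node is restarting")
    return f"ok from {host}"


result = send_with_failover(["es1:9200", "es2:9200"], fake_send)
print(result)
```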
On Thu, Dec 11, 2014 at 3:11 PM, Nick Canzoneri
Yes. If you want noop script updates you have to do something else. There
are docs on the script page.
On Dec 11, 2014 3:45 PM, Loren lo...@siebert.org wrote:
The documentation
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html#docs-update
for detect_noop
It's never been a problem for me.
Normally for time series data you handle this by creating a new index every
day. For non-time series data I basically do this:
http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/
It has the advantage of letting me change the mapping and
I just finished releasing the wikimedia extra
https://github.com/wikimedia/search-extra Elasticsearch plugin (versions
1.3.0 and 1.4.0). This release adds two things:
1. Elasticsearch 1.4.0 support (in the 1.4.0 version)
2. A new ```safer``` query (in 1.4.0 and 1.3.0 versions). This query
A small change costs as much as a large one. Your best bet is to batch
multiple updates for the same document together if possible. Also make sure
that your updates actually change something. Sending the exact same
document with the same ID still does an update.
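One way to sketch that batching on the client side, assuming the application can see the current field values. The document IDs and fields are made up, and this is plain Python rather than anything Elasticsearch provides:

```python
def coalesce_updates(updates, current_docs):
    """Merge queued partial updates per document and drop the ones that
    would not change anything.

    `updates` is a list of (doc_id, partial_doc) pairs in arrival order;
    `current_docs` maps doc_id to the fields the application knows about.
    """
    merged = {}
    for doc_id, partial in updates:
        merged.setdefault(doc_id, {}).update(partial)  # later values win

    changed = {}
    for doc_id, partial in merged.items():
        existing = current_docs.get(doc_id, {})
        if any(k not in existing or existing[k] != v for k, v in partial.items()):
            changed[doc_id] = partial
    return changed


queued = [("1", {"views": 1}), ("1", {"views": 2}), ("2", {"title": "same"})]
known = {"1": {"views": 0}, "2": {"title": "same"}}
to_send = coalesce_updates(queued, known)
print(to_send)
```

Three queued updates collapse into a single write here: the two for document 1 are merged and the one for document 2 changes nothing.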
On Dec 12, 2014 12:24 AM, Jinal
What are you looking to measure? The indexes don't really have a per
document size because they, well, are indexes. The documents do take up
some space on disk but they are compressed.
On Dec 10, 2014 6:02 AM, Jojo Juju tv.in.con...@gmail.com wrote:
Hi,
I'm fairly new to ES and I wonder if
:
Compressed size of a document on disk would be enough. We use store level
compression not the per document.
Would this be then actually possible?
Thanks
On Wednesday, December 10, 2014 1:27:49 PM UTC+1, Nikolas Everett wrote:
What are you looking to measure? The indexes don't really have
On Mon, Dec 8, 2014 at 9:11 AM, Sushmitha Chakka
sushmi...@sigmoidanalytics.com
Hi,
I have an index with 6 crore (60 million) records. My use case is to read the
entire index and check whether each record is present in a new index or
not. If not, I have to index it into the new index. I used scan and scroll
I'm not sure what is up but remember that post_ids in the script is a list
not a set. You might be growing it without bounds.
On Dec 8, 2014 2:49 PM, Christophe Verbinnen djp...@gmail.com wrote:
Hello,
We have a small cluster with 3 nodes running 1.3.6.
I have an index setup with only two
Or you can always transform in your client application. The advantage of
transform is that it is done _post_ source like copy_to. Meaning if you
like the original format for disk space and highlighting purposes you
should use transform. If you don't, transform in your app.
Nik
On Sat, Dec 6,
Also, it's usually better to use a match query if you want to analyze the
query rather than query_string. Query string exposes a huge array of
syntax which is both useful and terribly dangerous. Users can write
regexes and huge range queries and fuzzy queries that use much much more
cpu and ram
I've never found myself wanting shard splitting. I always have an analysis
update I want to apply when I want to reshard data anyway so I just scan
from one index into one with new settings.
I do find the FAQ a bit odd though. Elasticsearch allows you to do lots
of inefficient things and that
That works ok if you are inserting but updates and deletes become more
complex. Scoring can get a bit funky too because your shards don't have
roughly equal frequencies. All in all I'd argue that adding more indices
behind an alias is only sometimes a solution to the problem.
Nik
On Fri,
On Fri, Dec 5, 2014 at 11:49 AM, Michele Palmia micpal...@gmail.com wrote:
Hi all,
I need to set up a system that provides spellchecking functionality on
user searches, similar to what Google does with its well known *did-you-mean*
suggestions.
The *term suggester* works very well for
On Fri, Dec 5, 2014 at 12:43 PM, Nikolas Everett nik9...@gmail.com wrote:
On Fri, Dec 5, 2014 at 11:49 AM, Michele Palmia micpal...@gmail.com
wrote:
Hi all,
I need to set up a system that provides spellchecking functionality on
user searches, similar to what Google does with its well