whole purpose is to keep stopwords from appearing in term facets
On Wednesday, 15 January 2014 19:10:49 UTC+11, Khoa Nguyen wrote:
*Environment setup*
Mac OSX
ES 0.90.7 installed via homebrew
*Steps*
update config
#
Yep, this seems definitely related to the linked issue. I am looking
forward to testing your fix, hopefully in 0.90.11 :)
Regards,
-- JB.L
2014/1/14 Martijn v Groningen martijn.v.gronin...@gmail.com
So this is related to:
https://github.com/elasticsearch/elasticsearch/issues/4703
On 14
Thanks :) The workaround for now, I guess, is to use the has_parent query
instead of the has_parent filter.
On 15 January 2014 09:17, Jean-Baptiste Lièvremont
jean-baptiste.lievrem...@sonarsource.com wrote:
Yep, this seems definitely related to the linked issue. I am looking
forward to
Jörg,
Thanks for your response. I knew the bulk request was an option; however, is
there any performance impact in using that method to send one-off updates? I
just don't want to call the bulk request for single item additions/updates
if it is going to adversely affect performance. Any thoughts
Hi Eugene,
Thanks for your comments - I'll do my best to explain where I am coming
from, and to address some of the issues you have raised.
Firstly, where I'm coming from: the data I'm holding and searching against
needs to be 100% backed up because it needs to be audited in the future.
For
In ES, a bulk request is the same as a single request; the operations are
just concatenated so a bulk can be sent from a client over the network in a
single step, saving a lot of roundtrips.
So if you send a single item, it does not matter to ES whether you choose
the single operation or the bulk operation.
Jörg
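To illustrate the point about concatenation, here is a minimal sketch of how a bulk body is put together: one JSON action line, followed by one JSON source line, each terminated by a newline. The index, type, and field names below are made up for illustration, and bulk_body is a hypothetical helper, not part of any client library.

```python
import json

def bulk_body(operations):
    """Serialize (action, source) pairs into the newline-delimited bulk format."""
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))
        if source is not None:  # e.g. delete actions carry no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the body must end with a newline

# A "bulk" of one item: the same index operation, just wrapped in an action line.
# Index/type/id/field names are hypothetical.
single = bulk_body([
    ({"index": {"_index": "myindex", "_type": "doc", "_id": "1"}},
     {"title": "hello"}),
])
print(single, end="")
```

The resulting string would be sent as the body of a POST to /_bulk; with a single item it carries the same document a plain index request would.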
Hi,
This is a floating point rounding error. There are many numbers that
floating points cannot represent exactly, and if you assign 1.9 to a
float, the value that will be stored is actually closer to 1.89999998.
Since the range query internally works on doubles, which have better
precision
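The rounding described above can be reproduced outside Elasticsearch. This is a small sketch using only the Python standard library; as_float32 is a hypothetical helper name, and it simply rounds a double to the nearest 32-bit float and widens it back, which is roughly what happens when a float field value is later compared as a double.

```python
import struct

def as_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float, then widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]

stored = as_float32(1.9)
print(stored)         # slightly below 1.9
print(stored < 1.9)   # the widened float compares as smaller than the double 1.9
print(stored == 1.9)  # so an exact comparison against the double 1.9 fails
```

Under that assumption, a bound of exactly 1.9 evaluated on doubles would see the stored value as just under 1.9.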
Hi,
Elasticsearch supports having several indices stored in the same cluster.
These indices can even have different numbers of shards/replicas. However,
having large numbers of indices can trigger issues as well; I recommend
that you watch
I'm using ElasticHQ to monitor our cluster and I am noticing that our Search
Fetch times are about 5x more than query times, which seems unreasonable
(20ms vs ~4ms)
I'm also noticing 1.44 MB of swap usage on one of the nodes (the other node
is at 0)
We're using a 2 node cluster, 2 shards 1
What is the query type you are using?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html
--
Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer Consultant
Author of RavenDB in Action
It is easy to query ElasticSearch, but exchanging queries with others is
not as easy as I would like it to be.
I would like to have the possibility to save queries to ElasticSearch
in a way similar to the percolator API:
curl -XPUT localhost:9200/_query/myqueryname -d '{
"query" : {
"term"
We are using the query_then_fetch search type
On Wednesday, January 15, 2014 4:48:58 PM UTC+2, Itamar Syn-Hershko wrote:
What is the query type you are using?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html
--
Itamar Syn-Hershko
I believe this makes sense then - search is an entirely in-memory operation
(after a certain warm up), and fetch involves disk IO. Try to keep your docs
as small as possible and don't disable _source, but other than that you're
bound by the speed of your disks.
--
Itamar Syn-Hershko
Suppose we have a type Recipe with a field ingredients that stores a JSON
string array. A couple of Recipe docs' ingredients values may therefore be:
1) [ "apples", "oranges" ]
2) [ "apples" ]
What query would return docs whose ingredients contain solely apples
(thus returning only #2 from the above set)?
joergpra...@gmail.com joergpra...@gmail.com writes:
4. You have to write a program that traverses your folders, picks up each
document, and extracts fields from the document to get them indexed.
Or you might use es-nozzle [1], which traverses your folders and indexes
documents into
I'm curious about the Elasticsearch cluster architecture and I didn't find
any documentation about it.
In particular, I'm interested in how replica nodes work: does a replica node
receive the operation log from the master and perform the same operations
(like in a MongoDB replica set), or does the replica copy the
Hi,
I need to override Lucene's DefaultSimilarity class, which is used by
Elasticsearch for indexing and searching. Searching the net, I found some
implementations that do similar things. My difficulty is
that I have no idea how to actually implement this in my code. I found
If such latency worries you, use in-memory structures to pull the original
documents based on the IDs of the results. Your dataset is small enough to
do that, really. But I suspect this is premature optimization that you are
trying to do.
--
Itamar Syn-Hershko
http://code972.com | @synhershko
One thought occurred to me. Perhaps:
1. Build the token count into your ingredients field. Here's how:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#token_count
2a. Pre-analyze your query arguments and remove the duplicates. For
example,
I have 5 nodes in my cluster (I have set replicas to 0 for now), and the
status is not moving from red:
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
"cluster_name" : "my-cluster",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 5,
It depends.
When no replica is allocated yet (the default with only one node), the replica
is first copied over the network and then the transaction log is replayed for
the remaining operations.
Once the replica is allocated, each operation (transaction log) is replayed on
each replica.
About
Is it possible to index/search documents within a group of shards?
While indexing, if I provide the routing parameter, the document should be
indexed on a specific shard, but on which one? What happens if we continue
to index with the same routing param? Does that shard just keep getting bigger?
Is
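For what it's worth, the target shard is derived by hashing the routing value and taking the result modulo the number of primary shards, so the same routing value always lands on the same shard (and that shard does keep growing). A rough sketch of the idea follows; crc32 is only a stand-in and is not the hash Elasticsearch actually uses, and pick_shard is a hypothetical name.

```python
import zlib

def pick_shard(routing, number_of_primary_shards):
    """Map a routing value to a shard number.

    Assumption: crc32 stands in for the real hash function; only the
    "hash modulo number of primary shards" shape is the point here.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards

# The same routing value always resolves to the same shard.
first = pick_shard("customer-42", 5)
second = pick_shard("customer-42", 5)
print(first == second)
```

Because the result depends only on the routing value and the shard count, a search that passes the same routing value only needs to hit that one shard.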
The plugin is outdated.
You can override the similarity with standard ES 0.90+ settings:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-similarity.html
Jörg
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe
Does anyone know how to query all documents that have nested documents
matching a certain query?
Say you have a blog post with nested comments, and you want to find
all blog posts where a comment contains "help" and a following comment
after that contains "thank you" in insertion
I have read about that module, but the thing is I have to set idf to 1.
Actually, I want common terms to carry more weight in my search. When I
googled it, the advice was that overriding the default similarity class is
the only way to achieve it. That's the reason why I'm looking for these plugins.
Unicast is a nightmare for large ES deployments, with provisioning and
failures all the time. I'm used to DHCP/TFTP/PXE in my DC thanks to RedHat
so why should I waste time setting up hostnames or count hosts for ES?
Jörg
On Wed, Jan 15, 2014 at 5:17 PM, InquiringMind
Hey
The release candidate 1 of the 1.0.0 series is now available, featuring
many scale and scalability as well as field data improvements and experimental
federated search.
Release info: http://www.elasticsearch.org/downloads/1-0-0-RC1
Blog: http://www.elasticsearch.org/blog/1-0-0-rc1-released/
IMHO I think ES can still be smart enough to calculate that formula
dynamically since it knows when servers are being added to the cluster,
correct? As for the node crash, it's still a crash, and if the user wants to
decommission the node, then the better way would be to explicitly run a
decommission
Jörg
I avoided multicast and preferred unicast based on many discussions in the
newsgroups and other sites. In particular, the ElasticSearch Preflight
Checklist (http://asquera.de/opensource/2012/11/25/elasticsearch-pre-flight-checklist/).
Within this checklist, the sections entitled DISCOVERY
I just upgraded from 0.90.3 (using the tar.gz file) to 0.90.10 using the rpm.
My 0.90.3 was stored in /opt/elasticsearch, and /opt is my largest
partition, so I need to change this back if I can. Also, now Kibana is not
looking at the indexes from /opt/elasticsearch, but I assume once I move
Why don't you use function score if you only want to tweak parameters?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
Changing similarity is a massive change, it's exchanging the whole
algorithm. Not recommended unless you can't get what
What does total.store.size_in_bytes from the indices stats API
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-stats.html) measure?
Disk space for the Lucene index?
Disk space for the _source data?
Disk space for logs and other metadata?
Does it count shards?
Does it
I was focused on how to use a custom similarity and not how to actually
tackle the problem. In this respect, I agree with Jörg. Also, I believe
today's release (1.0.0.RC1) has additional scripting options:
Wow! And many, many thanks for the detailed Breaking Changes list. That
really helps with planning ahead!
Brian
On Wednesday, January 15, 2014 12:35:49 PM UTC-5, Alexander Reelsen wrote:
Hey
The release candidate 1 of the 1.0.0 series is now available, featuring
many scale and scalability
Is it possible to perform the query as an embedded type and retrieve the
result as a nested type? I'll explain. I have the following mapping.
{
"page" : {
"properties" : {
"number" : {"type" : "integer", "store" : "yes", "index" : "analyzed"},
"file_name" : {"type" : "string", "store" : "yes", "index" : "no"},
"line" : {
"type" : "object",
If the calculation includes _source data, does it use the uncompressed or
compressed size?
The preflight checklist is misleading in several statements:
- you cannot accidentally join clusters because of multicast. It happens
when using the default cluster name
- "a lot of chatter with no use" is just pure ignorance of network
technology. Multicast was designed for zero config, and I
Hi All,
I have terabytes of data across 5 data centers and I am looking at setting
up a separate Elasticsearch cluster in each data center, so I will have 5
clusters storing all of my data. I need to have an interface that can query
across all these clusters and show me the aggregated result. I
Thanks, token_filter worked. Also, w/o modifying the mapping, using a script
filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-script-filter.html) query
worked (query below). However, in both cases, ES seems to treat
multi-element array and array element with multiple
Thank you very much, Jörg. Your explanation is clear and concise and
greatly improves my understanding!
Brian
Hello,
So it appears a few APIs have changed in 1.0.0RC1 (which is bizarre
considering I tested on master just last week and everything worked, but
whatever...) Any ideas when the documentation will be updated to reflect
the changes? I'm having a hard time mapping old API calls to the new
3) [ "apples and oranges" ] // length = 3, even with filter query (why not 1)?
...not an issue in my particular use case since all ingredients are
1-word, but I would like to understand how to address case #3 above.
From my experience with querying, ES slurps all values of an array into one
Use the master docs
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/index.html
Looks like your call should be:
/_nodes/4oL7COyQTNiQPa4xZ76Pfg/stats?all=true&plugin=true
Thanks,
Matt Weber
On Wed, Jan 15, 2014 at 12:31 PM, Roy Russo royru...@gmail.com wrote:
Hello,
The data directory pathname is stored within the elasticsearch.yml
configuration file.
The path name to the configuration file may be specified by passing the
following option to the startup script, where *path-to-config* is the name
of the directory into which you store your customized
Just getting started with elasticsearch. Once we set it up on two
machines, they automatically detected each other, formed a cluster, and
began replication (though it didn't complete). Since then, they never
connect.
So far I haven't been able to find any successful fixes or resources to
Yep. Thanks!
I discovered the new URL and the updated docs shortly after posting - they
looked so similar, I thought the docs hadn't changed.
Looks like the new paths actually make more sense, ala REST standards.
On Wednesday, January 15, 2014 3:41:07 PM UTC-5, Matt Weber wrote:
Use the
IMHO, I think there is no perfect solution for any of the complex
network issues; however, if we know of a config that reduces the risk
significantly, then I think we should adopt that as a default config.
On Wed, Jan 15, 2014 at 11:17 AM, joergpra...@gmail.com
joergpra...@gmail.com wrote:
The
I guess the usual questions need to be answered: Are the Java versions
exactly the same on all systems?
And just to be sure, are the Elasticsearch versions exactly the same on all
systems?
*Does anyone know the issue or any resources that walk through how to
troubleshoot these issues?
Using quorum consensus (another name for the 'minimum_master_nodes'
approach) as default is not possible, since the quorum count is only known
by the admin.
There are perfect solutions for consensus but they are not easy to
implement, see Byzantine fault tolerance
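For reference, the quorum count the admin has to supply is usually taken as a strict majority of the master-eligible nodes. A minimal sketch, assuming that majority rule; the function name is made up for illustration, and the resulting number would be what goes into the discovery.zen.minimum_master_nodes setting.

```python
def quorum(master_eligible_nodes):
    """Strict majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1

# Quorum for a few cluster sizes.
for n in (1, 2, 3, 5):
    print(n, quorum(n))
```

Note that for a 2-node cluster the majority is 2, so losing either node leaves the cluster without a quorum; that is the usual argument for an odd number of master-eligible nodes.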
Hi Itamar,
Thank you for the reply. So, in Kibana, I should point to the tribe node IP to
do a multi-cluster search? Also, do I need to use the latest codebase for this?
On Wednesday, January 15, 2014 11:20:19 AM UTC-8, Itamar Syn-Hershko wrote:
Take a look at the tribe node feature - was first released
Hi Jason,
Did you end up implementing this using range filters or was there another
solution? I've just started looking at ElasticSearch and am equally stumped
at how bounding_box shouldn't be much faster.
On Monday, May 13, 2013 6:43:15 PM UTC-5, Jason wrote:
The server is almost certainly
Hi Ivan,
Can you please tell me how I can add the above-mentioned classes to my
Elasticsearch classpath?
Thanks
On Wed, Jan 15, 2014 at 11:22 PM, Ivan Brusic i...@brusic.com wrote:
I have looked into that similarity plugin in the past and as far as I can
tell, it is partially wrong.
Hi Jörg,
Let's say I want to modify my _score in such a way that it doesn't take
into consideration idf while calculating score. How can I achieve this
using function score ? Can you give me some example usage of function score
which takes only tf into consideration.
Thanks
On Thu, Jan 16,
This? http://www.elasticsearch.org/blog/client-for-node-js-and-the-browser/
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Le 16 janv. 2014 à 06:18, ZenMaster80 sabdall...@gmail.com a écrit :
I am not very clear on how to do this, I have the following scenario:
My data/docs
Thanks a lot Adrien for the info. That will be helpful.
Regards
Prasad
On Wed, Jan 15, 2014 at 8:07 PM, Adrien Grand
adrien.gr...@elasticsearch.com wrote:
Hi,
Elasticsearch supports having several indices stored into the same
cluster. These indices can even have different number of
Great, I also found this helpful, by simply making Ajax calls.
http://www.elasticsearch.org/tutorials/javascript-web-applications-and-elasticsearch/
On Thursday, January 16, 2014 1:00:44 AM UTC-5, David Pilato wrote:
This?
http://www.elasticsearch.org/blog/client-for-node-js-and-the-browser/