Hi
Is there an example of how to construct an elasticsearch-py ConnectionPool?
i.e. do I create a list of connections and pass that to the ConnectionPool?
:arg connections: list of tuples containing the
:class:`~elasticsearch.Connection` instance and its options
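For what it's worth, the tuple-list idea can be sketched in plain Python. This is an illustration of the shape (a pool built from (connection, opts) tuples, handing out connections round-robin), not the library's actual class — check the elasticsearch-py source for the real constructor:

```python
import itertools

class RoundRobinPool:
    """Toy pool built from (connection, opts) tuples, round-robin selection."""
    def __init__(self, connections):
        # connections: list of (connection, opts) tuples
        self.connections = [conn for conn, opts in connections]
        self._cycle = itertools.cycle(range(len(self.connections)))

    def get_connection(self):
        return self.connections[next(self._cycle)]

pool = RoundRobinPool([("conn-a", {"timeout": 10}), ("conn-b", {"timeout": 10})])
print(pool.get_connection())  # -> conn-a
print(pool.get_connection())  # -> conn-b
```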
One more observation on the above scenario.
1) The records returned for "mobile" number 287, so if I set the size (in the query) to 287 it works fine, but in the case of =287 it throws the exception.
2) The records returned for "samsung" number 191, so if I set the size to 191 it works fine also in
Heya,
We are pleased to announce the release of the Elasticsearch Thrift transport
plugin, version 2.0.0.
The Thrift transport plugin allows you to use the REST interface over Thrift on top of HTTP.
https://github.com/elasticsearch/elasticsearch-transport-thrift/
Release Notes -
Hey,
wow. I only read what I wanted to read and skipped the first paragraph.
Sorry about that.
The clean-up thread basically sleeps for the specified interval; when the interval is updated, the new setting is only applied once the old sleep interval has finished. This means you might need to
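That behaviour can be sketched like this (a plain-Python illustration with made-up names, not Elasticsearch's actual clean-up thread): the thread snapshots the interval at the top of each round, so an update only takes effect after the ongoing sleep finishes.

```python
import threading
import time

class CleanupThread(threading.Thread):
    """Sleeps the current interval; picks up interval changes next round."""
    def __init__(self, interval):
        super().__init__(daemon=True)
        self.interval = interval            # may be updated from outside
        self.intervals_used = []
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            current = self.interval          # snapshot: updates apply next round
            self.intervals_used.append(current)
            time.sleep(current)

    def stop(self):
        self._stop_event.set()

t = CleanupThread(interval=0.05)
t.start()
time.sleep(0.01)
t.interval = 0.01     # takes effect only after the current 0.05s sleep ends
time.sleep(0.12)
t.stop()
t.join(timeout=1)
print(t.intervals_used[:2])  # first round still used the old interval
```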
Hi,
we try to keep the documentation for our rivers updated, but I noticed that the general river documentation page [1] is probably no longer available on the Elasticsearch.org site?
Is this intentional? I can see twitter river is still pointing to it as
well [2]
[1]
Hi,
I'm trying to play with scripting (in Java) and it's not obvious to me.
1/ The smallest example doesn't compile, due to doc().field not being found:
public class Ex2ScriptFactory implements NativeScriptFactory {
    @Override
    public ExecutableScript newScript(@Nullable Map<String, Object> params) {
OK, I can see the doc page is still in sources:
https://github.com/elasticsearch/elasticsearch/blob/master/docs/river/index.asciidoc
and available at
http://www.elasticsearch.org/guide/en/elasticsearch/rivers/current/index.html
But for some reason it is not reachable via guide search.
On Tue,
This has been a problem with other documentation in the past, not that it
explains what the problem is.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 4 March 2014 20:36, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi all,
As there is no support for regex in index templates as of now, if I apply the simple matching pattern *, which will match all indices, a mapping will be created for indices which might not have that field in the future. Will there be any performance issue in
Hello, could you please explain what is wrong?
http://pastebin.com/dMZudCsj
my config:
...
analysis:
  analyzer:
    default:
      type: custom
      tokenizer: standard
      filter: [mam_ngram, lowercase]
  filter:
    mam_ngram:
Quick correction. I remembered precomputing prior to population of the
index wouldn't work for me in this case because there wouldn't be the term
frequency data for the full corpus.
On Tuesday, March 4, 2014 11:56:04 AM UTC+2, Kevin B wrote:
As background I have some Lucene based code which
This is a bit of an odd one, although not really for those of us who work
in the eCommerce world I suppose.
I work for one of the largest children's retailers in the UK. We're
currently knocking together a demonstration of our eCommerce platform with
everything running off Elasticsearch rather
Hi,
I have been trying to configure Elasticsearch to query/search files in the file system on a Windows 7 operating system. I have installed Elasticsearch and it's up and running. I have been trying to install the fsriver plugin by
following instructions mentioned in link
Hi Alex,
The Scan API indeed works based on the order of documents in the lucene
segments. This is the most efficient way to get bulks of data which is the
intended use case of this API.
Once 1.1 is released you'd be able to efficiently scroll while maintaining
order (see this
issue:
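The scroll loop this implies can be sketched with the transport stubbed out (all names here are illustrative; in real use each fetch would be an HTTP call to _search?search_type=scan&scroll=... and then to _search/scroll, repeated until an empty page comes back):

```python
def scan_all(fetch_page):
    """Drain a scroll. fetch_page(scroll_id) -> (next_scroll_id, list_of_hits)."""
    scroll_id, docs = None, []
    while True:
        scroll_id, hits = fetch_page(scroll_id)
        if not hits:              # an empty page means the scroll is exhausted
            return docs
        docs.extend(hits)

# Stubbed server: three pages in segment order, then an empty page.
pages = iter([("s1", [1, 2]), ("s2", [3, 4]), ("s3", [5]), ("s3", [])])
def fake_fetch(scroll_id):
    return next(pages)

print(scan_all(fake_fetch))  # -> [1, 2, 3, 4, 5]
```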
Hello,
I'm having problems starting elasticsearch after an error. I was bulk
indexing a new type when the disk got full and server returned this error:
PHP Fatal error: Uncaught exception
'Elastica\Exception\Bulk\ResponseException' with message 'Error in one or
more bulk request actions:
I have a query similar to the following SQL:
select sum(count + displayCount) as total from day_inc_count group by video_id
My problem is how to implement the above using aggregations in the REST or Java API.
I don't understand aggregations in Elasticsearch 1.0 very well. Hope somebody can help.
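One way that SQL could map onto the aggregations DSL is a terms aggregation on video_id with a sum sub-aggregation over a script that adds the two fields. The body below is a sketch; the script string is an assumption (default script language as in ES 1.0), so verify it against your version:

```python
import json

body = {
    "size": 0,  # we only want the aggregation, not hits
    "aggs": {
        "per_video": {
            "terms": {"field": "video_id"},   # the GROUP BY
            "aggs": {
                "total": {                     # the SUM(count + displayCount)
                    "sum": {"script": "doc['count'].value + doc['displayCount'].value"}
                }
            }
        }
    }
}
print(json.dumps(body, indent=2))
```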
The message is pretty obvious. Your node is running out of heap memory...
Increase it.
On Tuesday, March 4, 2014 1:36:51 PM UTC, Prashy wrote:
Hi ES users,
I am getting the following exception while indexing a huge amount of data (say ~5GB) to an ES node.
Exception: array index out of bounds (java.lang.OutOfMemoryError: Java heap space)
I tried increasing the heap value to 2GB as well, via ES_MAX_MEM: 2g, but it gave the same error.
--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Error-array-index-out-of-bounds-java-lang-OutOfMemoryError-Java-heap-space-tp4050914p4050916.html
Sent from the
Adding a bit more to my rather short answer. Both exceptions essentially mean the same thing. I would follow the basic heap allocation advice: allocate 50% of your system RAM to ES, as catastrophic things happen when ES runs out of RAM, and leave the other 50% to the system. So if you have a server
You also don't give us much information on how you're trying to index this 3GB of information. Are you using the bulk API? Are you refreshing after every index action? etc...
On Tuesday, March 4, 2014 1:40:58 PM UTC, Prashy wrote:
I tried increasing the heap value by 2GB as well by
Just wanted to note that I was getting the error when using 1GB as the heap size. So I increased it to 2GB (heap; the system has 4GB) and in that scenario I also got the error at the same point.
If I increased the memory from 1GB to 2GB, it should at least have processed one more record compared to
Hi,
I am trying to implement middle-word search. As the Completion Suggester doc
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
says,
preserve_position_increments: if disabled and using a stopword analyzer, you could get a string starting with
*Tweet Index definition : *
tweet: {
  tweet: {
    index: {
      number_of_shards: 2,
      number_of_replicas: 1
    }
  },
  mappings: {
    blog: {
      _all: {
        enabled: False
Thx.
Is a segment a single file with multiple documents? Or is it multiple files that together form a segment? In other words, I don't fully understand why the notion of a segment exists.
Does the fact that I have a high number in the throttling KPI mean that I
have a problem in performance and
Hello,
I've been working on a POC for Logstash/Elasticsearch/Kibana for about 2 months now and everything has worked out pretty well, and we are ready to
move it to production. Before building out the infrastructure, I want to
make sure my shard/node/index setup is correct as that is the main
Isaac, this
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html)
gives a good explanation on what segments actually are. It also gives you
the list of settings to do with merges and maybe you can find some
optimisations in there somewhere.
On
Based on my experience, I think you may have an issue with OOM trying to
keep a month of logs with ~10gb ram / server.
Say, for instance, 5 indexes a day for 30 days = 150 indexes. How many
shards per index/replicas?
I ran some tests with 8GB assigned to my 20x ES data nodes, and after a ~7
Hello,
I'm facing a problem bulk indexing 5k documents into 24 different indices (i18n). I'm using elasticsearch 1.0.1 with all default settings. I've read that a thread per index is used; that would mean I'm using 24 bulk threads at one time. Am I right? If so, why am I getting this rejection
Zach,
Thanks for the information. With my POC, I have 2 10 gig VMs and I'm
keeping 7 days of logs with no issues but that is a fairly large jump and I
could see where it may pose an issue.
As far as the 150 indexes, I'm not sure on the shards per index/replicas.
That is the part that I'm the
On Tue, Mar 4, 2014 at 4:01 PM, isaac hazan isaac.yann.ha...@gmail.com wrote:
Thx.
Is a segment a single file with multiple documents? Or is it multiple
files that together form a segment? In other terms I don't fully understand
why the notion of segment exists?
The simple answer is that a
Well, I found the answer : using dynamic template.
person: {
  dynamic_templates: [
    {
      nested_template: {
        match: "fr|en",
        match_pattern: "regex",
        mapping: {
          type: "nested"
        }
      }
    },
    {
      template_fr: {
        path_match: "fr.*",
        match_mapping_type: "string",
        mapping: {
          type: "multi_field",
          fields: {
            {name}:
I think you can print a QueryBuilder or a SearchRequestBuilder but not a
SearchRequest.
QueryBuilder qb = QueryBuilders.queryString("my text");
logger.info("Your query is: {}", qb);
// this works as well
On toString() I get the typical unimplemented-toString class reference:
org.elasticsearch.action.search.SearchRequest@142daa2e
I am using elasticsearch 1.0.0; I wonder if something has changed from 0.90, or in 1.0.1?
Thanks for the response
On Monday, March 3, 2014 8:46:25 PM UTC-6, amit.soni
Threads are pooled; they are not used per index.
The queue length of 50 works in almost any case. 50 is also safe for protecting a node from being overwhelmed by too many documents. If not, think about the bulk request size, and whether your cluster is powerful enough for processing the transmitted
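For reference, the queue length mentioned here is a node-level thread pool setting; in elasticsearch.yml it looks roughly like this (setting names as in ES 1.x — verify against your version):

```yaml
# elasticsearch.yml — bulk thread pool
threadpool:
  bulk:
    queue_size: 50   # the default; raising it trades memory for fewer rejections
```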
Hopefully you can find a way to make things work with less code. It would
be great if payloads were more of a first class citizen in Elasticsearch,
but it is up to the Lucene layer to handle analysis. I really need to play
around with the new text scoring abilities.
--
Ivan
On Mon, Mar 3, 2014
I understand, but I must be missing something.
5k documents * 24 indices = 120k requests. How am I supposed to bulk index them? I've tried setting index.refresh_interval to -1 to speed up the process, but still the same result. Splitting the bulk into different sizes didn't work either.
On 4 March 2014
Great! Thanks David!
On Tuesday, March 4, 2014 10:53:24 AM UTC-6, David Pilato wrote:
I think you can print a QueryBuilder or a SearchRequestBuilder but not a
SearchRequest.
QueryBuilder qb = QueryBuilders.queryString("my text");
logger.info("Your query is: {}", qb);
// this works as well
Without being able to look at source code, it is difficult, if not impossible, to find issues.
index.refresh_interval: -1 must be set on the respective index, preferably using the update settings API (the conf file or index creation settings are also possible, but not a good place for temporary settings).
If you only need the filter part, you can use a constant_score query that only contains a filter; in that case you wouldn't need the match_all query anymore.
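As a sketch, the shape being described looks like this (field and value are made-up examples):

```python
# A constant_score query wrapping only a filter — no match_all needed.
query = {
    "query": {
        "constant_score": {
            "filter": {
                "term": {"status": "active"}   # illustrative field/value
            }
        }
    }
}
print(query)
```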
On Friday, February 28, 2014 10:04:35 AM UTC+1, xzer LR wrote:
AFAIK, There are three types search I can perform on elasticsearch:
1.
This is what I'm doing:
for each of the 24 locales:
  POST /current_contenidos_[LOCALE]        # creates index
  PUT /current_contenidos_es_es/_settings  # sets refresh_interval = -1
POST /_bulk HTTP/1.1 200 2386284 (12 times [mixed requests for the 24 indices])
for each of the 24 locales:
  PUT
I'm using Python; I just sent the log so you could figure out what I'm doing.
I'm not evaluating the bulk responses, but logging them, so I can see the 'rejected execution' error in most of them.
On 4 March 2014 19:13, joergpra...@gmail.com joergpra...@gmail.com wrote:
So you use plain HTTP API?
Logging is not enough; you should keep track of the number of active requests sent and the bulk request responses that come back. That way you can control the number of concurrent bulk requests that are active at a time, and if you do so, you can cap this number before exceeding the bulk queue size of 50.
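That control loop can be sketched with a semaphore (transport stubbed out; all names here are illustrative):

```python
import threading

MAX_IN_FLIGHT = 4                     # cap on concurrent bulk requests
slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
responses = []
lock = threading.Lock()

def send_bulk(batch):
    """Stub standing in for a real HTTP _bulk call."""
    return {"errors": False, "items": len(batch)}

def worker(batch):
    with slots:                        # blocks while MAX_IN_FLIGHT are active
        resp = send_bulk(batch)
    with lock:
        responses.append(resp)         # inspect responses, don't just log them

threads = [threading.Thread(target=worker, args=([i] * 10,)) for i in range(12)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(responses))  # -> 12
```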
Ok, I'm going to play with that, but it seems too complicated just for bulk indexing, taking into account the low number of documents.
Thanks
El 04/03/2014 19:53, joergpra...@gmail.com joergpra...@gmail.com
escribió:
Logging is not enough, you should care for the number of active requests
sent and
Judging by the output, the genre field is analyzed using the default
analyzer. Others can help debug if you provide your mapping. It is best to
use the get mapping API [1] since it shows what you actually have instead
of what you supplied at index creation.
Depending on your use case, you might
Thanks Ivan, I've fixed the issue. It was just a wrong mapping.
On Tuesday, March 4, 2014 8:29:00 PM UTC+1, Ivan Brusic wrote:
Judging by the output, the genre field is analyzed using the default
analyzer. Others can help debug if you provide your mapping. It is best to
use the get mapping API
Sorry to have taken so long to reply. I went ahead and followed your link. I'd been there before, but decided to give it a deeper look. I found, however, that bigdesk told me how many of the max open files the process was using, and from there I was able to determine that my settings in
My initial suggestion would be to set your templates to 3 shards, 1 replica. With three data nodes, you'd have two shards per index per node; at 5 indexes/day, that's 10 shards per day per node. 3 nodes x 10 shards per day x 30 days is 900 shards. I don't know any 'cutoff' per se, but 900 may
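Spelled out, the arithmetic is:

```python
# 3 primary shards + 1 replica per template, 3 data nodes,
# 5 indexes/day, 30 days of retention.
shards_per_index = 3 * (1 + 1)               # primaries x (1 + replicas) = 6
per_node_per_index = shards_per_index // 3   # spread over 3 nodes = 2
per_node_per_day = per_node_per_index * 5    # 5 indexes/day = 10
total = per_node_per_day * 30 * 3            # 30 days, 3 nodes
print(total)  # -> 900
```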
Looks like your index mapping is invalid, you might have to delete the
index and then reload your data.
Regards,
Mark Walkom
On 4 March 2014 23:01, Andrés acanadac...@gmail.com wrote:
Probably you should check why your cluster is yellow - is that a single
node only?
Bulk indexing with a green cluster should work flawlessly.
Jörg
On Tue, Mar 4, 2014 at 8:02 PM, Jose Gargallo jgarga...@gmail.com wrote:
Ok, I'm gonna play with that, but it seems too complicated just for bulk
I have what I think is an obvious question. If I tweak some settings such
as:
index.translog.flush_threshold_period
or
index.merge.policy.use_compound_file
or
index.refresh_interval
or
indices.memory.index_buffer_size
or
index.cache.field.type
or
index.gateway.snapshot_interval
Is it
My response looks like:
{error:SearchPhaseExecutionException[Failed to execute phase [query],
all shards failed; shardFailures {[pb-DdzK0S9uOU6yh9kg3zQ][trial][6]:
RemoteTransportException[[IP_POC_NODE1][inet[/10.226.22.13:9700]][search/phase/query]];
nested: SearchParseException[[trial][6]:
Ok, I thought maybe an old JVM version was causing this, but this one is pretty recent.
I took a better look at indexing percolator queries and there is indeed a substantial difference in execution time compared to indexing a regular document. When I disabled the size calculation (in the code)
On Wed, Feb 26, 2014 at 12:21:26AM +0100, joergpra...@gmail.com wrote:
Jörg,
sorry for the late answer.
Maybe you can set up an example of your sort as a demo, so that the error
can be reproduced?
It turned out that this behaviour was caused by me, since documents contained
the wrong
Unfortunately, there is no single place to get all settings. However, you can usually get them as follows:
1) Cluster settings: curl localhost:9200/_cluster/settings?pretty
2) Index settings: curl localhost:9200/foo/_settings?pretty
3) Node settings: curl localhost:9200/_nodes?pretty
So for
I'd probably just aggregate all the fields you are interested in into one field and then match/AND on that one field. You can probably use copy_to to accomplish the aggregation into a single field:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#copy-to
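A sketch of such a mapping (field names here are made up):

```python
# Several fields copied into one catch-all field that queries can target.
mapping = {
    "product": {
        "properties": {
            "title":      {"type": "string", "copy_to": "everything"},
            "brand":      {"type": "string", "copy_to": "everything"},
            "everything": {"type": "string"}   # the aggregated target field
        }
    }
}
print(mapping)
```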
You'd have to turn dynamic back to true. Dynamic = false means ignore the
new field and don't index it. If a field is not indexed, then you cannot
search for it.
Are you using the BulkRequestBuilder? If so, create a new one for each bulk operation (and let the de-referenced old one be garbage collected); otherwise you'll keep filling it up and performance will drop, as seen. At least, that's what I do, and it runs like the blazes for the entire 97M-document
Thank you! I haven't tried it yet, but reading the documentation makes me
understand that it will solve my needs.
Is it possible to keep the boosting applied to the individual fields when
computing the score ?
Should I keep the original query and add the match/AND you're talking
about, as a
I've found a strange behavior where if I have an extra field in the
percolator named type then range filters on my percolator don't work
properly. See the following test case:
https://gist.github.com/julesbravo/9357887
If I change the type field on lines 26 and 74 to 'atype', OR change the type
Ivan,
Yes, ES stores all strings in UTF-8 encoding.
Referring to your 3 POST commands: the first two succeeded because in the first one you presented the data in UTF-8 encoding and it was accepted, and in the second one you presented the same name but using the \u notation, which is
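The equivalence is easy to check with any JSON parser — raw UTF-8 text and the \u escape decode to the same string (the name below is just an example):

```python
import json

raw = json.loads('"José"')            # literal UTF-8 in the JSON source
escaped = json.loads('"Jos\\u00e9"')  # same name via \u notation
print(raw == escaped)  # -> True
```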
Jörg,
Just to clarify: The links below point to OpenJDK, not to the Oracle JDK? I
only ask because the version and build numbers seem to track those in
Oracle's JDK.
For what it's worth, I am currently running ES 1.0.0 GA with the latest
Oracle JDK 7u51 on Mac Mavericks and on Linux, and I
What version of ES are you using? I seem to recall reading about ambiguity
surrounding the use of type as a field name in one of the version's
release notes, but I cannot find it quickly now.
However, ES seems to overuse the word type, referring to both the
document type (roughly analogous to
Brian,
Using 1.0.0. You're probably right but I don't want to go back and change
code in 1000 places if I can avoid it.
On Tuesday, March 4, 2014 4:07:26 PM UTC-8, InquiringMind wrote:
What version of ES are you using? I seem to recall reading about ambiguity
surrounding the use of type as
The vectorization issue is not constrained to OpenJDK, and is still present
in 7u45: https://twitter.com/thetaph1/status/423523708708208640
On Tue, Mar 4, 2014 at 3:52 PM, InquiringMind brian.from...@gmail.com wrote:
Jörg,
Just to clarify: The links below point to OpenJDK, not to the Oracle
Guys, I'm running the following query to test out one of the use cases:
curl -X GET '0:9200/segmentation/animal/_search?pretty' -d '{
  "query": {
    "has_child": {
      "type": "visit",
      "query": {
        "bool": {
          "must": [
            { "term": { "_parent": "119000148-5661691" } },
            { "range": {
Hello,
I've got an entity document which looks like this:
{
  id: 'id',
  type: 'person',
  tuples: [
    {
      key: 'nameFirst',
      value: 'john',
      type: 'string'
    },
    {
      key: 'age',
      value: '38',
      type: 'int'
    },
I forgot to mention: I need the ability for the user to specify that they only care about keys where entity.type === 'person' (or any type, for that matter).
On Tuesday, March 4, 2014 11:13:27 PM UTC-5, Corey Nolet wrote:
Hello,
I've got an entity document which looks like this:
{
id:
Am Dienstag, 4. März 2014 12:27:33 UTC+1 schrieb Nitesh Earkara:
Hi,
I have been trying to configure Elasticsearch to query/search files in the file system on a Windows 7 operating system. I have installed Elasticsearch and it's up and running. I have been trying to install the fsriver plugin by
Hi Roland,
Thanks for the suggestion. I am getting an error while trying to install it the way you suggested.
Below is the error message:
C:\Users\ner\Downloads\elasticsearch-1.0.1(2)\elasticsearch-1.0.1\bin> plugin
-install fr.pilato.elasticsearch.river/fsriver/0.4.0 -url
I installed ES (at /usr/lib/elasticsearch/) on our gateway server and I am able to run some basic curl commands like -XPUT and -XGET to create some indices and retrieve the data in them.
I am able to supply a single-line JSON record, but I am unable to supply a JSON file as input to curl -XPUT.
On Tuesday, March 4, 2014 9:32:55 PM UTC-8, siva mannem wrote:
I installed ES(at the location /usr/lib/elasticsearch/) on our gateway
server and i am able to run some basic curl commands like XPUT and XGET to
create some indices and retrieve the data in them.
i am able to give single line
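For what it's worth, curl can read the request body from a file with the @ syntax (-d @doc.json, or --data-binary @doc.json to preserve newlines, which matters for the bulk API). A small sketch that validates the file first and then assembles such a command (index, type, id, and paths are illustrative):

```python
import json
import tempfile

def index_file_command(index, doc_type, doc_id, path):
    """Build a curl command that PUTs a JSON file as the request body."""
    with open(path) as f:
        json.load(f)              # fail early if the file isn't valid JSON
    return ("curl -XPUT 'localhost:9200/%s/%s/%s' --data-binary @%s"
            % (index, doc_type, doc_id, path))

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"user": "example"}, f)
print(index_file_command("test", "doc", "1", f.name))
```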
Hi gkwelding,
I am using the bulk API for indexing the data, and the refresh parameter is not set.
So what could be the cause of that exception?
Let me know if you require any other input.
The zip download available on github is not what you want. The format
required for plugins is different from the source download found on github.
Since it appears that you do not have download access, as Roland mentioned,
your last option is to clone the project in git and build it yourself with
Am Donnerstag, 20. Februar 2014 04:37:34 UTC+1 schrieb Roland Pirklbauer:
Am Samstag, 5. Januar 2013 12:16:39 UTC+1 schrieb Jörg Prante:
Later while searching, you can direct your search client to the index
library, and all searches to the identifier field will be mapped
correctly to
I think you have to reindex.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Le 5 mars 2014 à 07:36, Michael Huang opensource5...@gmail.com a écrit :
I have elasticsearch 1.0beta1 and I want to upgrade to elasticsearch 1.0.1. What are the steps to upgrade? Is it simply
When Elasticsearch gives you the answer (actionGet()), Elasticsearch has your doc, whatever happens after that. It does not mean that your doc is searchable yet; that will happen about 1 second later. So querying immediately won't give you the result you are expecting.
Bulk API is
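That near-real-time behaviour can be sketched with a toy model (an illustration of the refresh semantics, not Elasticsearch internals): an indexed doc is acknowledged immediately but only becomes searchable after a refresh, which Elasticsearch runs about once per second by default.

```python
class NearRealTimeIndex:
    """Toy model: docs are acknowledged at once, visible only after refresh."""
    def __init__(self):
        self._pending, self._searchable = [], []

    def index(self, doc):
        self._pending.append(doc)     # acknowledged, but not yet visible
        return {"ok": True}

    def refresh(self):                # what the periodic refresh does
        self._searchable.extend(self._pending)
        self._pending.clear()

    def search(self):
        return list(self._searchable)

idx = NearRealTimeIndex()
idx.index({"id": 1})
print(idx.search())   # -> []  (acknowledged, not yet searchable)
idx.refresh()
print(idx.search())   # -> [{'id': 1}]
```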
That said, I'm wondering, if you set your id as the document id, whether it would be more efficient to use the multi-get API in that case.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Le 5 mars 2014 à 06:49, Roland Pirklbauer powidl1...@gmail.com a écrit :
Am Donnerstag, 20. Februar 2014