Awesome! Getting this on my priority to-do list to try out!
On Tuesday, August 5, 2014 2:50:47 PM UTC-7, Jörg Prante wrote:
Just released - stored procedures are available in JDBC plugin 1.3.0.4
https://github.com/jprante/elasticsearch-river-jdbc/
Jörg
--
You received this message
I would prefer having a script file which does everything you need rather than
storing mappings in config/
I find scripts more flexible. You can create index with specific settings, add
mapping, inject some data...
My 2 cents.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Le 5
Hi Jörg,
Thanks, really appreciate the response and the link. I will do a small PoC
with the approach given therein.
Since we are pulling data from an index, I am assuming we will be limited
the first time by disk speed.
In the cache, if the data for the field that is cached has some updates
Hi,
For the below scenario:
Assume that I am allocating identical indices (with different names)
to different Elasticsearch nodes. Every index can have multiple shards.
At some time, I add another node to the existing cluster. Now, I use the
index template to create the same mapping
Hey,
CORS is not about securing anything from a data point of view, but
about telling the browser how to behave. It does not have any impact on the
Elasticsearch side. See http://www.html5rocks.com/en/tutorials/cors/
--Alex
On Mon, Aug 4, 2014 at 7:01 PM, joergpra...@gmail.com
Hi,
I think the second aggs should use sum instead of terms, in likes_sum.
2014-08-06 14:32 GMT+09:00 Tihomir Lichev shot...@gmail.com:
You can use aggregations:
{
  "aggs": {
    "user_likes": {
      "terms": {
        "field": "user_id"
      },
      "aggs": {
        "likes_sum": {
          "terms": {
Hi, thanks very much. I resolved it using:
curl -XPUT http://localhost:9200/_template/not_analyzed_template -d'
{
  "template": "test*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "template_1": {
            "mapping": {
Hey,
It looks like you have a slow network connection or the .org was not reachable
when you tried it. Can you try to download the RPM directly and see if it
works? Use
http://packages.elasticsearch.org/elasticsearch/1.3/centos/elasticsearch-1.3.1.noarch.rpm
- which works fine for me at the moment
Hey Alex,
Thanks for getting back.
I am able to download the RPM through the direct link. It's failing only when I
try using the yum repository. The yum repository is able to download up to 99%;
after that it fails with slow connections.
Rgrds,
Ram.
On Wednesday, August 6, 2014 12:59:45 PM UTC+5:30,
Hi,
I want to disable _source, and at the same time I want to use dynamic
mapping. So can I set a default mapping for field store so that fields are
stored?
Hoping for your reply, thanks
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
Have a look here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_dynamic_templates
You can set default properties for your fields
On Wednesday, August 6, 2014 10:51:23 AM UTC+3, huangs...@gmail.com wrote:
hi
i want to disable _source ,at the same
Hi,
If I have three different indices with the same schema mapping for a type,
can I use the SearchRequestBuilder (or any other class) to simultaneously
query all three indices and have ElasticSearch perform aggregations/sorts
on the results from all three?
Thanks,
Sandeep
There is no one size fits all, no strict measure for RAM, CPU cores,
shard/node. This all depends on your testing results and your requirements.
Do not trust other test results more than your own.
You can index 2G with Elasticsearch in a few minutes, using commodity
hardware. Do not expect
I'm getting the exact same problem... ES version 1.2.1... If I use
something like this:
{
  "from": 0,
  "size": 10,
  "query": {
    "match_all": {}
  }
}
All fine... However, if I have a sort on it, e.g.:
{
  "from": 0,
  "size": 10,
  "query": {
    "match_all": {}
  },
  "sort": [ {
Hello,
I want to use Elasticsearch for only indexing and searching e-mails. We want
to store the meta-info within Elasticsearch, keeping the content/body of
every mail in a MySQL database. So Elasticsearch shall have a reference to
the mail body.
Is that possible and how?
Regards
Michael
--
Have a look at the JDBC plugin. With that plugin, you can push metadata
from MySQL to Elasticsearch.
https://github.com/jprante/elasticsearch-river-jdbc
Jörg
On Wed, Aug 6, 2014 at 1:21 PM, aseknop...@gmail.com wrote:
Hello,
I want to use Elasticsearch for only indexing and searching
I followed the link below and was able to set up the ELK stack on my test
environment, but it requires more modifications.
How do I add filters like extension, geoip, and many more on Windows
machines? Also, I need a better logstash.conf for parsing IIS logs,
event logs, all types
++Link:
http://community.ulyaoth.net/threads/how-to-install-logstash-on-a-windows-server-with-kibana-in-iis.17/
On Wednesday, August 6, 2014 6:25:50 PM UTC+5:30, Dinesh Bandaru wrote:
I followed below link and I was able to setup ELK stack on my test
environment, but below link requires
Hi All,
I followed the link below and was able to set up the ELK stack on my test
environment, but it requires more modifications. ++Link:
http://community.ulyaoth.net/threads/how-to-install-logstash-on-a-windows-server-with-kibana-in-iis.17/
How to add filters like extension,geoip and many
Using this plugin would lead to a migration from mysql data into
Elasticsearch.
So let me reformulate my question:
My infrastructure is like this:
clientElasticsearch
|
|
mySQL
So I have a client which generates an index and some metadata for a
mail(header and body).
What I don't understand is why you generate an index and want to store it
in Elasticsearch. You could use the plugin as Jörg suggested, transfer your
data to Elasticsearch, set index:true for the fields you want and set
store:false in the mapping. This way you get an index built by
So how can you distinguish the first email in a thread?
Do you have some additional parameter?
On Wednesday, August 6, 2014 4:56:10 PM UTC+3, Mark Fletcher wrote:
Thanks for your response. If I do as you suggested, a subject match will
return all the messages in that thread (because they all
Each thread has a unique integer id (so, every message in a given thread
has a particular thread id). And each email has a unique integer id as
well.
On Wednesday, August 6, 2014 6:59:36 AM UTC-7, Tihomir Lichev wrote:
So how you can distinguish the first email from any thread ?
Do you have
So you should be able to use an aggregation to get the first email from each
thread.
Something like:
{
  "aggs": {
    "threads": {
      "terms": {
        "field": "thread_id"
      },
      "aggs": {
        "first_email": {
          "min": {
            "field": "email_id"
          }
        }
      }
    }
  }
}
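To make the suggestion above concrete, here is a minimal Python sketch. It writes the same request body as valid JSON and mimics locally what the nested min aggregation would compute. The field names thread_id and email_id come from the message; the sample documents and the helper name first_email_per_thread are hypothetical. Note that a min aggregation yields the smallest email_id value per thread, not the email document itself.

```python
import json

# Request body from the message above, expressed as a Python dict.
request_body = {
    "aggs": {
        "threads": {
            "terms": {"field": "thread_id"},
            "aggs": {"first_email": {"min": {"field": "email_id"}}},
        }
    }
}


def first_email_per_thread(docs):
    """Local stand-in for the aggregation: smallest email_id per thread_id."""
    smallest = {}
    for doc in docs:
        tid, eid = doc["thread_id"], doc["email_id"]
        if tid not in smallest or eid < smallest[tid]:
            smallest[tid] = eid
    return smallest


# Hypothetical sample data: two threads with a few emails each.
sample_docs = [
    {"thread_id": 1, "email_id": 12},
    {"thread_id": 1, "email_id": 10},
    {"thread_id": 2, "email_id": 20},
    {"thread_id": 2, "email_id": 25},
]

print(json.dumps(request_body, indent=2))
print(first_email_per_thread(sample_docs))
```

If email ids are assigned in arrival order, the smallest id in each thread identifies its first message; that ordering is an assumption, not something stated in the thread.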
On August 6, 2014,
Hi folks,
The question is, what client are you using out there?
Here at company X we have many different Java applications using
Elasticsearch, and they use the transport client. This decision was made for
developers, given the ease of
use that the
Hi,
Could someone clarify the following for me:
When I have an ES cluster consisting of 2 machines, how should I send
the bulk index requests to them?
1. Do I understand correctly that I can send everything to any node I have,
and it will then be spread across the cluster for indexing automatically?
2.
Thanks again for your response. I don't have much experience with
aggregations, but wouldn't that just give me a set of thread id's ordered
by how many messages are in each thread? In my results, it's possible to
have a match on a message body be ranked higher than a match on a subject.
Using this
I haven't tested such an aggregation, but I think the way I wrote it should
give you the oldest email that matches the request from each thread. Not sure
how they will be sorted ...
On Wednesday, August 6, 2014 5:31:56 PM UTC+3, Mark Fletcher wrote:
Thanks again for your response. I don't have much
Aaron,
Well, now I feel a little foolish. Perhaps it was from my initial attempt
to put --logfile at the end of the command instead of before the action:
$ curator delete --older-than 8 --logfile /tmp/curator.log
usage: curator [-h] [-v] [--host HOST] [--url_prefix URL_PREFIX] [--port
PORT]
Here is my experience. Yours may vary.
I also use the TransportClient. And then I wrap our business rules behind
another server that offers an HTTP REST API but talks to Elasticsearch on
the back end via the TransportClient. This server uses Netty and the LMAX
Disruptor to provide low-resource
JDBC plugin is not for migration.
It can be configured to select the data from the RDBMS you want. You can
fetch the metadata fields and index them into Elasticsearch with a simple
SQL select statement.
Jörg
On Wed, Aug 6, 2014 at 3:48 PM, Andrej Rosenheinrich
andrej.rosenheinr...@unister.de
This worked perfectly! Thank you for your help.
On Wednesday, August 6, 2014 3:49:57 AM UTC-4, Tihomir Lichev wrote:
Thanks! You're absolutely right. Copy/paste error :)
{
aggs: {
user_likes: {
terms: {
field: user_id
},
aggs: {
likes_sum: {
curl -XPUT http://localhost:9200/_snapshot/myRepository/myIndex_`date +%Y-%m-%d`?wait_for_completion=true
This cron job runs daily and backs up my index to AWS S3; each day the
snapshot has a different name.
I want to make sure that I am not duplicating, for example, a 10GB index
every day
hello
my context : spark, spark-shell 1.0.1 jdk1.7 scala 2.10.4, ES-Hadoop
2.1.0 ( nightly build)
my problem:
- unable to send RDDs from spark to ES
i got a NoClassDefFoundError see below (
org/codehaus/jackson/annotate/JsonClass)
jackson Jars to add to spark shell?
philippe
best
sorry for the mistake : -- unable to read from ES and create RDDS
On Wednesday, August 6, 2014 6:32:02 PM UTC+2, Phil gib wrote:
hello
my context : spark, spark-shell 1.0.1 jdk1.7 scala 2.10.4, ES-Hadoop
2.1.0 ( nightly build)
my problem:
- unable to read from ES and create RDDS
i
A few days after the upgrade to 1.3.1 we experienced our first corrupted
shard in a 2 node cluster:
[2014-08-06 15:54:28,815][WARN ][indices.cluster ]
[FiveAces.Coffee.Web_IN_0] [streamentry5][4] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
Since version 1.0, there should be fewer binary protocol issues between any
nodes, including the clients, making rolling upgrades doable. Older clients
should be able to interact with newer server nodes, but the inverse is not
always the case.
--
Ivan
On Wed, Aug 6, 2014 at 8:47 AM, Brian
Hi Phil,
Glad to see the work in es-hadoop master is being picked up even without any
public announcement of it :)
The issue has been fixed in master [1] and already pushed to Maven - can you
please update and try again?
FTR: The issue seems to be caused by multiple versions of Jackson which
I should mention that there is a primary shard 4 on the other node. I just
need to understand why it's not auto-recovering here and what I can do to
manually remove the corrupted shard to have the primary replicated to this
node.
On Wednesday, August 6, 2014 12:44:41 PM UTC-4, Nariman Haghighi
Well. It is incremental.
But let's say you have saved old Lucene segments and those old segments have been
merged in the meantime into a new bigger one; the next snapshot will copy the new
BIG segment and remove the old ones.
It means that old data will be copied twice in this scenario.
Makes
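The effect described above can be illustrated with a small Python simulation. It is only a sketch of the idea, not how Elasticsearch implements snapshots: the repository is modeled as a dict of segment file names, a snapshot copies only files the repository does not hold yet, and removal of old files is not modeled. Segment names and sizes are hypothetical.

```python
def snapshot(repo_files, live_segments):
    """Copy segment files missing from the repository; return GB copied."""
    copied = 0
    for name, size_gb in live_segments.items():
        if name not in repo_files:
            repo_files[name] = size_gb
            copied += size_gb
    return copied


repo = {}

# Day 1: three segments, 10 GB in total, all new to the repository.
day1 = {"_0": 4, "_1": 3, "_2": 3}

# Day 2: the three segments were merged into _3, plus 1 GB of new data in _4.
day2 = {"_3": 10, "_4": 1}

print(snapshot(repo, day1))  # everything is copied on the first run
print(snapshot(repo, day2))  # the merged segment is copied again in full
```

In this toy run only 1 GB of data is new on day 2, yet the merged segment is a new file, so its 10 GB are transferred again.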
Thanks, it makes sense in this case. I don't think I can prevent something
like that from happening?
On Wednesday, August 6, 2014 1:29:40 PM UTC-4, David Pilato wrote:
Well. It is incremental.
But let's say you have saved old Lucene segments and that old segments has
been merged in the
Hi,
Need some advice.
I have indexed documents. Each document has an internal id that is also indexed
as just another field; this id is not used as the indexed document id
(_id).
There could be a situation where the same document is indexed more than once (each
of the indexed instances will have
No. I don't think so.
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On August 6, 2014 at 20:04, IronMike sabdall...@gmail.com wrote:
Thanks, it makes sense in this case. I don't think I can prevent something like
that from happening?
On Wednesday, August 6, 2014 1:29:40
I searched this topic but some of the answers were still vague to me.
My goal is to index html docs but have the html stripped for the indexing,
at the same time, I would like _source to have the original html document
for display purposes.
//My doc format:
{
content: html Hello this is an
1. Correct.
2. Also correct. The analysis chain only affects how the terms are indexed
and placed in the inverted index. The original document remains as is.
3. Not sure since I have never done highlighting. Highlighting might not
depend on the source since the term positions/offsets are used, but
Perhaps the top hits aggregation can help:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
--
Ivan
On Wed, Aug 6, 2014 at 11:21 AM, slavag slav...@gmail.com wrote:
Hi,
Need some advise.
I have indexed documents,
Hi,
I have a need to retrieve documents (of type bucket) which have at least
2 nested objects (of type products) inside them (details of my mapping
and documents are on the gist below).
https://gist.github.com/anonymous/4f06c9322186ce9d4708
As far as I've searched, I did not find a way to
Still interested to know your view on the issue.
On Wednesday, August 6, 2014 5:12:41 PM UTC+3, Pavel P wrote:
Hi,
Could someone clarify me the next:
When I have the ES cluster, consisting from 2 machines, how should I send
the bulk index requests to them.
1. Do I understand right that
Heya,
We are pleased to announce the release of the Elasticsearch Google Compute
Engine cloud plugin, version 2.3.0.
The Google Compute Engine (GCE) Cloud plugin allows using the GCE API for the
unicast discovery mechanism.
https://github.com/elasticsearch/elasticsearch-cloud-gce/
Release
Heya,
We are pleased to announce the release of the Elasticsearch Twitter River
plugin, version 2.3.0.
The Twitter river indexes the public twitter stream, aka the hose, and makes it
searchable.
https://github.com/elasticsearch/elasticsearch-river-twitter/
Release Notes -
Hi, Thanks for the reply.
I'm trying to define a top hits aggregation but am getting an error: Parse
Failure [Could not find aggregator type [top_hits] in [single_result]]]; }]
This is my aggregation definition, first bucket is grouped by id and the
nested bucket is grouped by date and then I want to
1. Yes, it is spread automatically
2. No
Bulk requests queue up where the shards are. So check your shard
distribution. Shards should be spread equally across the nodes for an index.
Otherwise your system load is unbalanced.
Jörg
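For readers new to the bulk API discussed above, here is a minimal Python sketch that builds a newline-delimited bulk request body. Each document takes two lines (an action line and a source line), and the body must end with a newline. The index name, type name, and documents are hypothetical, and the helper build_bulk_body is not part of any client library; sending the body to a node is left out.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a bulk body: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical documents: (id, source) pairs.
docs = [
    ("1", {"subject": "hello", "thread_id": 1}),
    ("2", {"subject": "re: hello", "thread_id": 1}),
]

body = build_bulk_body("mails", "mail", docs)
print(body, end="")
```

The same body can be posted to the _bulk endpoint of any node in the cluster; as the answer above says, the receiving node distributes the operations to the nodes holding the shards.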
On Wed, Aug 6, 2014 at 10:36 PM, Pavel P pa...@kredito.de wrote:
Still
This has been added in 1.3.0:
https://github.com/elasticsearch/elasticsearch/pull/6124
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On August 6, 2014 at 23:49:25, slavag (slav...@gmail.com) wrote:
Hi, Thanks for the reply.
I'm trying to define top hits
Ooo, my bad, sorry.
In the top_hits explanation page:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
top_docs was mentioned, but I can't find any other reference to that
aggregator. How can I use it?
Thanks.
Sorry, I meant to specify the version, but I forgot. If you do upgrade,
here is another explanation of top hits:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/top-hits.html
--
Ivan
On Wed, Aug 6, 2014 at 2:59 PM, David Pilato da...@pilato.fr wrote:
This has been added in
I'll definitely upgrade.
Thanks
On Thursday, August 7, 2014 1:07:01 AM UTC+3, Ivan Brusic wrote:
Sorry, I meant to specify the version, but I forgot. If you do upgrade,
here is another explanation of top hits:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/top-hits.html
I have found quite a few similar emails about capacity planning. Although
it makes sense that there are a lot of variables/factors, it would be great
for new users to have some sort of baseline, which could be simple, just a
single type of index, not too heavy a load. Maybe there are already
Thanks for the information
2014-08-01 0:55 GMT+08:00 Ivan Brusic i...@brusic.com:
All types eventually belong to the same Lucene index and Lucene cannot
handle different types for the same field name. Avoid using the same name
across types if the field type is different.
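A quick way to spot the problem described above is to compare field types across mapping types before indexing. The Python sketch below does this on a simplified, hypothetical structure (mapping type name to a flat dict of field name and field type); it does not parse real Elasticsearch mapping responses, and the helper name conflicting_fields is made up for illustration.

```python
def conflicting_fields(mappings):
    """Return fields that are mapped with different field types across types."""
    seen = {}  # field name -> {field type: [mapping type names]}
    for type_name, fields in mappings.items():
        for field, field_type in fields.items():
            seen.setdefault(field, {}).setdefault(field_type, []).append(type_name)
    return {field: types for field, types in seen.items() if len(types) > 1}


# Hypothetical mappings for two types in the same index.
mappings = {
    "user": {"id": "long", "created": "date"},
    "order": {"id": "string", "created": "date"},
}

print(conflicting_fields(mappings))
```

Here "id" is reported because it is a long in one type and a string in the other, while "created" is a date in both and is not flagged.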
I tried to upgrade ES yesterday (from 1.1.1 to 1.3.1) and
I have an issue when I try to use a complex query with aggregations.
Parse Failure [Expected [START_OBJECT] under [filter], but got a
[START_ARRAY
I've this bug since the version 1.2.0 (I've tried all the version
Hello guys,
Could you help me understand why docs.count below is not increasing? It's stuck at
2307764, while docs.deleted keeps increasing.
I'm just running a PHP script to ingest the CSV source data into ES.
[login@machine elasticsearch]$ curl 'localhost:9200/_cat/indices?v'
health index pri
(I moved this topic here from a GitHub issue.)
I want to know the theory behind ES aggregations.
Maybe it is one of these:
(1) Like a database: compute when the aggregation query comes.
(2) Like Storm: when data comes in, it is aggregated once. You don't need to
aggregate when the query comes. The aggregation
I still want to know some basic theory about that efficient manner.
Thank you very much!
On Thursday, August 7, 2014 12:00:31 PM UTC+8, Tong Liu wrote:
(I move the topic from github issue to here)
I want to know the theory of ES aggregation.
Maybe, it is one of them:
(1) like a Database. compute when the
I still want to know some basic theory about that efficient manner.
On Thursday, August 7, 2014 12:00:31 PM UTC+8, Tong Liu wrote:
(I move the topic from github issue to here)
I want to know the theory of ES aggregation.
Maybe, it is one of them:
(1) like a Database. compute when the
63 matches