Re: How to transform SQL columns using JDBC River?

2014-08-06 Thread Christopher Ambler
Awesome! Getting this on my priority to-do list to try out! On Tuesday, August 5, 2014 2:50:47 PM UTC-7, Jörg Prante wrote: Just released - stored procedures are available in JDBC plugin 1.3.0.4 https://github.com/jprante/elasticsearch-river-jdbc/ Jörg -- You received this message

Re: Storing Elasticsearch configuraton and deploying new clusters

2014-08-06 Thread David Pilato
I would prefer having a script file that does everything you need rather than storing mappings in config/. I find scripts more flexible. You can create an index with specific settings, add a mapping, inject some data... My 2 cents. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 5

Re: Route query so that data for a shard is localized

2014-08-06 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi Jörg, Thanks, really appreciate the response and the link. I will do a small PoC with the approach given therein. Since we are pulling data from an index, I am assuming we will be limited the first time by disk speed. In the cache, if the data for the field that is cached has some updates

Move specific shard to a different index

2014-08-06 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi, Consider the scenario below: Assume that I am allocating identical indices (with different names) to different ElasticSearch nodes. Every index can have multiple shards. At some point, I add another node to the existing cluster. Now, I use the index template to create the same mapping

Re: How to disable DELETE API

2014-08-06 Thread Alexander Reelsen
Hey, CORS is not about doing anything secure from a data point of view, but about telling the browser how to behave. Does not have any impact on the elasticsearch side. See http://www.html5rocks.com/en/tutorials/cors/ --Alex On Mon, Aug 4, 2014 at 7:01 PM, joergpra...@gmail.com

Re: Group by field and then sum the groups

2014-08-06 Thread Jun Ohtani
Hi, I think the second aggs should use sum instead of terms, in likes_sum. 2014-08-06 14:32 GMT+09:00 Tihomir Lichev shot...@gmail.com: You can use aggregations: { aggs: { user_likes: { terms: { field: user_id }, aggs: { likes_sum: { terms: {

Re: How to forbid the analyzing for a certain data type (e.g. string)

2014-08-06 Thread panfei
Hi, thanks very much. I resolved it using: curl -XPUT http://localhost:9200/_template/not_analyzed_template -d ' { template: test*, mappings: { _default_: { dynamic_templates: [ { template_1: { mapping: {

Re: not able to install elasticsearch using repository

2014-08-06 Thread Alexander Reelsen
Hey, looks like you have a slow network connection or the .org was not reachable when you tried it. Can you try to download the RPM directly and see if it works? Use http://packages.elasticsearch.org/elasticsearch/1.3/centos/elasticsearch-1.3.1.noarch.rpm - which works fine for me at the moment

Re: not able to install elasticsearch using repository

2014-08-06 Thread Ramkumar Nagaraj
Hey Alex, Thanks for getting back to me. I am able to download those RPMs through the direct link. It fails only when I try using the yum repository. The yum repository download gets to 99% and then fails due to a slow connection. Regards, Ram. On Wednesday, August 6, 2014 12:59:45 PM UTC+5:30,

can i set default mapping for field store?

2014-08-06 Thread huangshanjay
Hi, I want to disable _source, and at the same time I want to use dynamic mapping. Can I set a default mapping for the store setting so that fields are stored? Looking forward to your reply, thanks.

Re: can i set default mapping for field store?

2014-08-06 Thread Tihomir Lichev
Have a look here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_dynamic_templates You can set default properties for your fields. On Wednesday, 6 August 2014, 10:51:23 UTC+3, huangs...@gmail.com wrote: hi i want to disable _source ,at the same
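The dynamic-template suggestion above could look roughly like the sketch below. It only builds the mapping as a Python dict and prints it as JSON. The type name `my_type` and the template name `store_generic` are placeholders chosen for illustration, and the exact mapping syntax should be checked against the linked reference page for the Elasticsearch version in use.

```python
import json

# Hypothetical type name and template name; only the overall shape matters here.
mapping = {
    "my_type": {
        # The question asks to disable _source while keeping dynamic mapping.
        "_source": {"enabled": False},
        "dynamic_templates": [
            {
                "store_generic": {
                    # Match every dynamically added field ...
                    "match": "*",
                    # ... and mark it as stored.
                    "mapping": {"store": True},
                }
            }
        ],
    }
}

print(json.dumps(mapping, indent=2))
```

With `_source` disabled, only fields that are stored can be returned in search results, so the `fields` list in a search request would have to name them explicitly.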

Aggregate results over multiple indices

2014-08-06 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi, If I have three different indices with the same schema mapping for a type, can I use the SearchRequestBuilder (or any other class) to simultaneously query all three indices and have ElasticSearch perform aggregations/sorts on the results from all three? Thanks, Sandeep
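As a rough illustration of the question above: over the REST API, several indices can be named in one search request by joining them with commas in the URL, and the sort and aggregations then run over the combined result. The sketch below only builds the URL and request body. The index names, the `timestamp` and `user_id` fields, and the helper `multi_index_search_url` are made up for the example.

```python
import json

def multi_index_search_url(host, indices, doc_type=None):
    """Build a _search URL that targets several indices at once."""
    path = ",".join(indices)
    if doc_type:
        path += "/" + doc_type
    return "%s/%s/_search" % (host.rstrip("/"), path)

# Hypothetical index and field names.
url = multi_index_search_url("http://localhost:9200", ["logs_a", "logs_b", "logs_c"])
body = {
    "query": {"match_all": {}},
    "sort": [{"timestamp": {"order": "desc"}}],
    "aggs": {"per_user": {"terms": {"field": "user_id"}}},
}

print(url)
print(json.dumps(body, indent=2))
```

The Java client exposes the same idea: `prepareSearch` accepts more than one index name.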

Re: System Requirements for ElasticSearch stack

2014-08-06 Thread joergpra...@gmail.com
There is no one size fits all, no strict measure for RAM, CPU cores, shard/node. This all depends on your testing results and your requirements. Do not trust other test results more than your own. You can index 2G with Elasticsearch in a few minutes, using commodity hardware. Do not expect

Re: Sorting Problem, ClassCastException.

2014-08-06 Thread Ian Harrigan
I'm getting the exact same problem... ES version 1.2.1... If I use something like this: { from : 0, size : 10, query : { match_all : { } } } all is fine... However, if I have a sort on it, e.g.: { from : 0, size : 10, query : { match_all : { } }, sort : [ {

leave content in mySQL and use ElasticSearch only for Index

2014-08-06 Thread aseknoppik
Hello, I want to use Elasticsearch only for indexing and searching e-mails. We want to store the meta-info within Elasticsearch, keeping the content/body of every mail in a MySQL database. So Elasticsearch shall have a reference to the mail body. Is that possible, and how? Regards Michael

Re: leave content in mySQL and use ElasticSearch only for Index

2014-08-06 Thread joergpra...@gmail.com
Have a look at the JDBC plugin. With that plugin, you can push metadata from MySQL to Elasticsearch. https://github.com/jprante/elasticsearch-river-jdbc Jörg On Wed, Aug 6, 2014 at 1:21 PM, aseknop...@gmail.com wrote: Hello, I want to use Elasticsearch or only indexing and searching

How to install ELK stack, Logforwarder(nxlog),Redis on Windows ?

2014-08-06 Thread Dinesh Bandaru
I followed the link below and was able to set up the ELK stack in my test environment, but the setup it describes needs more modifications. How do I add filters like extension, geoip and many more on Windows machines? Also, I need a better logstash.conf for parsing IIS logs, event logs, all types

Re: How to install ELK stack, Logforwarder(nxlog),Redis on Windows ?

2014-08-06 Thread Dinesh Bandaru
++Link: http://community.ulyaoth.net/threads/how-to-install-logstash-on-a-windows-server-with-kibana-in-iis.17/ On Wednesday, August 6, 2014 6:25:50 PM UTC+5:30, Dinesh Bandaru wrote: I followed below link and I was able to setup ELK stack on my test environment, but below link requires

How to install ELK stack, Logforwarder(nxlog),Redis on Windows ?

2014-08-06 Thread Dinesh Bandaru
Hi All, I followed the link below and was able to set up the ELK stack in my test environment, but the setup it describes needs more modifications. ++Link: http://community.ulyaoth.net/threads/how-to-install-logstash-on-a-windows-server-with-kibana-in-iis.17/ How do I add filters like extension, geoip and many

Re: leave content in mySQL and use ElasticSearch only for Index

2014-08-06 Thread aseknoppik
Using this plugin would lead to a migration from mysql data into Elasticsearch. So let me reformulate my question: My infrastructure is like this: clientElasticsearch | | mySQL So I have a client which generates an index and some metadata for a mail(header and body).

Re: leave content in mySQL and use ElasticSearch only for Index

2014-08-06 Thread Andrej Rosenheinrich
What I don't understand is why you generate an index and want to store it in elasticsearch. You could use the plugin as Jörg suggested, transfer your data to elasticsearch, set index:true for the fields you want and set store:false in the mapping. This way you get an index built by

Re: Searching email

2014-08-06 Thread Tihomir Lichev
So how can you distinguish the first email in a thread? Do you have some additional parameter? On Wednesday, 6 August 2014, 16:56:10 UTC+3, Mark Fletcher wrote: Thanks for your response. If I do as you suggested, a subject match will return all the messages in that thread (because they all

Re: Searching email

2014-08-06 Thread Mark Fletcher
Each thread has a unique integer id (so, every message in a given thread has a particular thread id). And each email has a unique integer id as well. On Wednesday, August 6, 2014 6:59:36 AM UTC-7, Tihomir Lichev wrote: So how you can distinguish the first email from any thread ? Do you have

Re: Searching email

2014-08-06 Thread Tihomir Lichev
So you should be able to use an aggregation to get the first email from each thread. Something like: { aggs: { threads: { terms: { field: thread_id }, aggs: { first_email: { min: { field: email_id } } } } } } 6 August 2014,
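The aggregation quoted above lost its quotes in the archive. A sketch with them restored follows, together with a small local function that mimics what the `min` sub-aggregation computes per `thread_id` bucket. The sample messages and the helper `first_email_per_thread` are invented for illustration, and this assumes that a lower `email_id` means an earlier message.

```python
import json

# Request body from the message above, with the quotes restored.
first_email_query = {
    "aggs": {
        "threads": {
            "terms": {"field": "thread_id"},
            "aggs": {"first_email": {"min": {"field": "email_id"}}},
        }
    }
}

def first_email_per_thread(messages):
    """Local stand-in for the aggregation: lowest email_id seen per thread_id."""
    result = {}
    for msg in messages:
        tid, eid = msg["thread_id"], msg["email_id"]
        if tid not in result or eid < result[tid]:
            result[tid] = eid
    return result

# Made-up sample data: two threads, five messages.
sample = [
    {"thread_id": 1, "email_id": 12},
    {"thread_id": 1, "email_id": 10},
    {"thread_id": 2, "email_id": 15},
    {"thread_id": 2, "email_id": 11},
    {"thread_id": 1, "email_id": 14},
]

print(json.dumps(first_email_query, indent=2))
print(first_email_per_thread(sample))
```

Note that the aggregation returns only the minimum id per bucket, not the message itself, so a second lookup by id would still be needed to fetch the email.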

transport client? really?

2014-08-06 Thread Luis García Acosta
Hi folks, The question is, what client are you using out there? Here at company X we have java applications using elasticsearch. We have many java applications, different java applications and they use the transport client. This decision was made for developers, given the ease of use that the

elasticsearch cluster spreading the bulk tasks

2014-08-06 Thread Pavel P
Hi, Could someone clarify the following: When I have an ES cluster consisting of 2 machines, how should I send the bulk index requests to them? 1. Do I understand correctly that I can send everything to any node I have, and it will then be spread across the cluster for indexing automatically? 2.

Re: Searching email

2014-08-06 Thread Mark Fletcher
Thanks again for your response. I don't have much experience with aggregations, but wouldn't that just give me a set of thread id's ordered by how many messages are in each thread? In my results, it's possible to have a match on a message body be ranked higher than a match on a subject. Using this

Re: Searching email

2014-08-06 Thread Tihomir Lichev
I haven't tested such an aggregation, but I think the way I wrote it should give you the oldest email that matches the request from each thread. Not sure how they will be sorted ... On Wednesday, 6 August 2014, 17:31:56 UTC+3, Mark Fletcher wrote: Thanks again for your response. I don't have much

Re: Some observations with Curator

2014-08-06 Thread Brian
Aaron, Well, now I feel a little foolish. Perhaps it was from my initial attempt to put --logfile at the end of the command instead of before the action: $ curator delete --older-than 8 --logfile /tmp/curator.log usage: curator [-h] [-v] [--host HOST] [--url_prefix URL_PREFIX] [--port PORT]

Re: transport client? really?

2014-08-06 Thread Brian
Here is my experience. Yours may vary. I also use the TransportClient. And then I wrap our business rules behind another server that offers an HTTP REST API but talks to Elasticsearch on the back end via the TransportClient. This server uses Netty and the LMAX Disruptor to provide low-resource

Re: leave content in mySQL and use ElasticSearch only for Index

2014-08-06 Thread joergpra...@gmail.com
JDBC plugin is not for migration. It can be configured to select the data from the RDBMS you want. You can fetch the metadata fields and index them into Elasticsearch with a simple SQL select statement. Jörg On Wed, Aug 6, 2014 at 3:48 PM, Andrej Rosenheinrich andrej.rosenheinr...@unister.de

Re: Group by field and then sum the groups

2014-08-06 Thread Cameron Barker
This worked perfectly! Thank you for your help. On Wednesday, August 6, 2014 3:49:57 AM UTC-4, Tihomir Lichev wrote: Thanks! You're absolutely right. Copy/paste error :) { aggs: { user_likes: { terms: { field: user_id }, aggs: { likes_sum: {
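The corrected aggregation discussed in this thread (a `terms` bucket per user with a `sum` sub-aggregation) might be written as in the sketch below. The preview cuts off before the summed field is shown, so the field name `likes` is an assumption, as are the sample documents and the helper `likes_per_user`, which mimics the result locally.

```python
import json

# "likes" is a hypothetical field name; the original message is truncated
# before the field used by the sum aggregation.
likes_query = {
    "aggs": {
        "user_likes": {
            "terms": {"field": "user_id"},
            "aggs": {"likes_sum": {"sum": {"field": "likes"}}},
        }
    }
}

def likes_per_user(docs):
    """Local stand-in for the aggregation: total likes per user_id."""
    totals = {}
    for doc in docs:
        totals[doc["user_id"]] = totals.get(doc["user_id"], 0) + doc["likes"]
    return totals

# Made-up sample documents.
docs = [
    {"user_id": "a", "likes": 3},
    {"user_id": "b", "likes": 1},
    {"user_id": "a", "likes": 4},
]

print(json.dumps(likes_query, indent=2))
print(likes_per_user(docs))
```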

Is the snapshot incremental?

2014-08-06 Thread IronMike
curl -XPUT http://localhost:9200/_snapshot/myRepository/myIndex_`date +%Y-%m-%d`?wait_for_completion=true This cron job runs daily which backs up my index to AWS S3, each day the snapshot has a different name. I want to make sure that I am not duplicating a 10GB index for example everyday
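For readers who want the same daily naming scheme outside of a shell one-liner, here is a small sketch that builds the snapshot URL used in the cron job above. The helper name `snapshot_url` is made up; the repository name and prefix are taken from the message. It only constructs the URL and does not send the request.

```python
import datetime

def snapshot_url(host, repository, prefix, day):
    """Build the PUT URL for a snapshot named <prefix>_<YYYY-MM-DD>."""
    name = "%s_%s" % (prefix, day.isoformat())
    return "%s/_snapshot/%s/%s?wait_for_completion=true" % (
        host.rstrip("/"), repository, name)

# Same repository and prefix as in the curl command above.
url = snapshot_url("http://localhost:9200", "myRepository", "myIndex",
                   datetime.date(2014, 8, 6))
print(url)
```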

ES + spark 1.0.1 - unable to send RDDs to ES

2014-08-06 Thread Phil gib
Hello, my context: spark, spark-shell 1.0.1, jdk1.7, scala 2.10.4, ES-Hadoop 2.1.0 (nightly build). My problem: unable to send RDDs from spark to ES. I got a NoClassDefFoundError, see below (org/codehaus/jackson/annotate/JsonClass). Are there Jackson jars to add to the spark shell? philippe best

Re: ES + spark 1.0.1 - unable to send RDDs to ES

2014-08-06 Thread Phil gib
Sorry for the mistake: the problem is that I am unable to read from ES and create RDDs. On Wednesday, August 6, 2014 6:32:02 PM UTC+2, Phil gib wrote: hello my context : spark, spark-shell 1.0.1 jdk1.7 scala 2.10.4, ES-Hadoop 2.1.0 ( nighly build) my problem: - unable to read from ES and create RDDS i

Recovering From Corrupted Shard Following Upgrade to 1.3.1

2014-08-06 Thread Nariman Haghighi
A few days after the upgrade to 1.3.1 we experienced our first corrupted shard in a 2 node cluster: [2014-08-06 15:54:28,815][WARN ][indices.cluster ] [FiveAces.Coffee.Web_IN_0] [streamentry5][4] failed to start shard org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:

Re: transport client? really?

2014-08-06 Thread Ivan Brusic
Since version 1.0, there should be fewer binary protocol issues between any nodes, including the clients, making rolling upgrades doable. Older clients should be able to interact with newer server nodes, but the inverse is not always the case. -- Ivan On Wed, Aug 6, 2014 at 8:47 AM, Brian

Re: ES + spark 1.0.1 - unable to send RDDs to ES

2014-08-06 Thread Costin Leau
Hi Phil, Glad to see the work in es-hadoop master is being picked up even without any public announcement of it :) The issue has been fixed in master [1] and already pushed to Maven - can you please update and try again? FTR: The issue seems to be caused by multiple versions of Jackson which

Re: Recovering From Corrupted Shard Following Upgrade to 1.3.1

2014-08-06 Thread Nariman Haghighi
I should mention that there is a primary shard 4 on the other node; I just need to understand why it's not auto-recovering here and what I can do to manually remove the corrupted shard to have the primary replicated to this node. On Wednesday, August 6, 2014 12:44:41 PM UTC-4, Nariman Haghighi

Re: Is the snapshot incremental?

2014-08-06 Thread David Pilato
Well. It is incremental. But let's say you have saved old Lucene segments and those old segments have been merged in the meantime into a new bigger one: the next snapshot will copy the new BIG segment and remove the old ones. It means that old data will be copied twice in this scenario. Makes
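The behaviour described above can be pictured with a toy model: each snapshot copies only the segment files that are not yet in the repository, so a merge that rewrites old segments into one new file causes that data to be copied again. This is a simplified sketch, not how Elasticsearch tracks files internally. The segment names and the helper `files_to_copy` are invented.

```python
def files_to_copy(repository_files, current_segments):
    """Segments that a new snapshot would have to upload."""
    return set(current_segments) - set(repository_files)

repo = set()

# Day 1: first snapshot, everything is new.
day1 = {"seg_1", "seg_2", "seg_3"}
copied_day1 = files_to_copy(repo, day1)
repo |= copied_day1

# Day 2: one new segment was written, only that one is copied.
day2 = day1 | {"seg_4"}
copied_day2 = files_to_copy(repo, day2)
repo |= copied_day2

# Day 3: Lucene merged seg_1..seg_4 into seg_5, so the whole merged
# segment is copied even though its data was already backed up.
day3 = {"seg_5"}
copied_day3 = files_to_copy(repo, day3)
repo |= copied_day3

print(sorted(copied_day1), sorted(copied_day2), sorted(copied_day3))
```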

Re: Is the snapshot incremental?

2014-08-06 Thread IronMike
Thanks, it makes sense in this case. I don't think I can prevent something like that from happening? On Wednesday, August 6, 2014 1:29:40 PM UTC-4, David Pilato wrote: Well. It is incremental. But let's say you have saved old Lucene segments and that old segments has been merged in the

Search result only with unique value of the specific field

2014-08-06 Thread slavag
Hi, I need some advice. I have indexed documents; each document has an internal id that is also indexed as just another field. This id is not used as the document id (_id). There could be a situation where the same document is indexed more than once (each of the indexed instances will have

Re: Is the snapshot incremental?

2014-08-06 Thread David Pilato
No. I don't think so. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 6 August 2014 at 20:04, IronMike sabdall...@gmail.com wrote: Thanks, it makes sense in this case. I don't think I can prevent something like that from happening? On Wednesday, August 6, 2014 1:29:40

Stripping html for indexing only?

2014-08-06 Thread IronMike
I searched this topic but some of the answers were still vague to me. My goal is to index html docs but have the html stripped for the indexing, at the same time, I would like _source to have the original html document for display purposes. //My doc format: { content: html Hello this is an

Re: Stripping html for indexing only?

2014-08-06 Thread Ivan Brusic
1. Correct. 2. Also correct. The analysis chain only affects how the terms are indexed and placed in the inverted index. The original document remains as is. 3. Not sure since I have never done highlighting. Highlighting might not depend on the source since the term positions/offsets are used, but
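One way to get the behaviour confirmed above (HTML stripped at analysis time while `_source` keeps the original markup) is a custom analyzer that uses the built-in `html_strip` character filter. The sketch below only builds the index settings and mapping as a dict. The analyzer name `html_text`, the type name `doc`, and the choice of tokenizer and token filter are assumptions made for illustration; `content` is the field from the question.

```python
import json

index_definition = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Hypothetical analyzer name.
                "html_text": {
                    "type": "custom",
                    # Remove markup before tokenizing.
                    "char_filter": ["html_strip"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        # Hypothetical type name.
        "doc": {
            "properties": {
                "content": {"type": "string", "analyzer": "html_text"}
            }
        }
    },
}

print(json.dumps(index_definition, indent=2))
```

Because the character filter runs only in the analysis chain, the document returned in `_source` is still the HTML that was submitted.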

Re: Search result only with unique value of the specific field

2014-08-06 Thread Ivan Brusic
Perhaps the top hits aggregation can help: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html -- Ivan On Wed, Aug 6, 2014 at 11:21 AM, slavag slav...@gmail.com wrote: Hi, Need some advise. I have indexed documents,

Query for nested objects count

2014-08-06 Thread Paulo Correa
Hi, I have a need to retrieve documents (of type bucket) which have at least 2 nested objects (of type products) inside them (details of my mapping and documents are on the gist below). https://gist.github.com/anonymous/4f06c9322186ce9d4708 As far as I've searched, I did not find a way to

Re: elasticsearch cluster spreading the bulk tasks

2014-08-06 Thread Pavel P
Still interested to know your view on the issue. On Wednesday, August 6, 2014 5:12:41 PM UTC+3, Pavel P wrote: Hi, Could someone clarify me the next: When I have the ES cluster, consisting from 2 machines, how should I send the bulk index requests to them. 1. Do I understand right that

[ANN] Elasticsearch Google Compute Engine cloud plugin 2.3.0 released

2014-08-06 Thread Elasticsearch Team
Heya, We are pleased to announce the release of the Elasticsearch Google Compute Engine cloud plugin, version 2.3.0. The Google Compute Engine (GCE) Cloud plugin allows using the GCE API for the unicast discovery mechanism. https://github.com/elasticsearch/elasticsearch-cloud-gce/ Release

[ANN] Elasticsearch Twitter River plugin 2.3.0 released

2014-08-06 Thread Elasticsearch Team
Heya, We are pleased to announce the release of the Elasticsearch Twitter River plugin, version 2.3.0. The Twitter river indexes the public twitter stream, aka the hose, and makes it searchable. https://github.com/elasticsearch/elasticsearch-river-twitter/ Release Notes -

Re: Search result only with unique value of the specific field

2014-08-06 Thread slavag
Hi, Thanks for the reply. I'm trying to define a top hits aggregation but I'm getting an error: Parse Failure [Could not find aggregator type [top_hits] in [single_result]]]; }] This is my aggregation definition, first bucket is grouped by id and the nested bucket is grouped by date and then I want to

Re: elasticsearch cluster spreading the bulk tasks

2014-08-06 Thread joergpra...@gmail.com
1. Yes, it is spread automatically 2. No The bulk queue up is where the shards are. So check your shard distribution. They should be equal on each node for an index. Otherwise your system load is unbalanced. Jörg On Wed, Aug 6, 2014 at 10:36 PM, Pavel P pa...@kredito.de wrote: Still
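Since any node can receive the request, the client's main job is to format the bulk body correctly. A minimal sketch of that newline-delimited format follows: one action line and one source line per document, with a trailing newline. The index name, type name, documents, and the helper `bulk_body` are made up for the example, and nothing is sent over the network.

```python
import json

def bulk_body(index, doc_type, docs):
    """Serialize (id, source) pairs into the body of a _bulk request."""
    lines = []
    for doc_id, source in docs:
        # Action line: which index, type and id the next line belongs to.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical index, type and documents.
body = bulk_body("logs", "event", [("1", {"msg": "first"}), ("2", {"msg": "second"})])
print(body)
```

The receiving node forwards each item to the node holding the relevant shard, which is why an even shard distribution matters for load, as the reply above points out.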

Re: Search result only with unique value of the specific field

2014-08-06 Thread David Pilato
This has been added in 1.3.0: https://github.com/elasticsearch/elasticsearch/pull/6124 -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 6 August 2014 at 23:49:25, slavag (slav...@gmail.com) wrote: Hi, Thanks for the reply. I'm trying to define top hits

Re: Search result only with unique value of the specific field

2014-08-06 Thread slavag
Ooo, my bad, sorry. In the top_hits explanation page : http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html There was top_docs mentioned, but can't find any other reference to that aggregator, how can I use it ? Thanks.

Re: Search result only with unique value of the specific field

2014-08-06 Thread Ivan Brusic
Sorry, I meant to specify the version, but I forgot. If you do upgrade, here is another explanation of top hits: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/top-hits.html -- Ivan On Wed, Aug 6, 2014 at 2:59 PM, David Pilato da...@pilato.fr wrote: This has been added in

Re: Search result only with unique value of the specific field

2014-08-06 Thread slavag
I'll definitely upgrade. Thanks On Thursday, August 7, 2014 1:07:01 AM UTC+3, Ivan Brusic wrote: Sorry, I meant to specify the version, but I forgot. If you do upgrade, here is another explanation of top hits: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/top-hits.html

Re: System Requirements for ElasticSearch stack

2014-08-06 Thread 熊贻青
I have found quite a few similar emails about capacity planning. Although it makes sense that there are a lot of variables/factors, it would be great for new users to have some sort of baseline, which could be simple, just a single type of index, not too heavy a load. Maybe there are already

Re: Elasticsearch still scan all types in a index even if I specify a type

2014-08-06 Thread panfei
Thanks for the information 2014-08-01 0:55 GMT+08:00 Ivan Brusic i...@brusic.com: All types eventually belong to the same Lucene index and Lucene cannot handle different types for the same field name. Avoid using the same name across types if the field type is different.

Parse Failure [Expected [START_OBJECT] under [filter], but got a [START_ARRAY

2014-08-06 Thread Vincent Gross
I tried to upgrade ES yesterday (from 1.1.1 to 1.3.1) and I have an issue when I try to use a complex query with an aggregation. Parse Failure [Expected [START_OBJECT] under [filter], but got a [START_ARRAY I've had this bug since version 1.2.0 (I've tried all the version

doc.deleted keeps increasing in indices

2014-08-06 Thread vjbangis
Hello guys, Could you help me understand why docs.count below is not increasing? It's stuck at 2307764, while docs.deleted keeps increasing. I'm just running a php script to ingest the csv source data to ES. [login@machine elasticsearch]$ curl 'localhost:9200/_cat/indices?v' health index pri

Can aggregation use a prepared result of real-time Map Reduce task?

2014-08-06 Thread Tong Liu
(I moved this topic here from a GitHub issue) I want to understand the theory behind ES aggregation. Maybe it is one of these: (1) Like a database: compute when the aggregation query comes. (2) Like Storm: when data arrives, it is aggregated once. You don't need to aggregate when a query comes. The aggregation

Re: Can aggregation use a prepared result of real-time Map Reduce task?

2014-08-06 Thread Tong Liu
I still want to know some basic theory about that efficient manner. Thank you very much! On Thursday, August 7, 2014 12:00:31 PM UTC+8, Tong Liu wrote: (I move the topic from github issue to here) I want to know the theory of ES aggregation. Maybe, it is one of them: (1) like a Database. compute when the

Re: Can aggregation use a prepared result of real-time Map Reduce task?

2014-08-06 Thread Tong Liu
I still want to know some basic theory about that efficient manner. On Thursday, August 7, 2014 12:00:31 PM UTC+8, Tong Liu wrote: (I move the topic from github issue to here) I want to know the theory of ES aggregation. Maybe, it is one of them: (1) like a Database. compute when the