Exact Whole Word Search

2014-08-25 Thread Maria John Franklin
Hi Friends, How do I search for an exact whole word in a particular column using Elasticsearch? Please explain with sample code. Thanks, Franklin

Re: Exact Whole Word Search

2014-08-25 Thread Maria John Franklin
On Monday, August 25, 2014 12:17:20 PM UTC+5:30, Maria John Franklin wrote: Hi Friends, How do I search for an exact whole word in a particular column using Elasticsearch? Please explain with sample code. Thanks, Franklin, Hi Friends, I have two values in the database. One is Mango another

How to index Office files? *.txt and *.pdf are working...

2014-08-25 Thread Dirk Bauer
Hi, using elasticsearch-1.3.2 with Plug-in - name: mapper-attachments version: 2.3.1 description: Adds the attachment type allowing to parse difference attachment formats jvm: true site: false on Windows 8 for evaluation purpose. JVM - version: 1.7.0_67 vm_name: Java HotSpot(TM)

Re: Using elasticsearch as a realtime fire hose

2014-08-25 Thread Jim Alateras
What kind of events do you think of? Single new document indexed? Batch of docs indexed? Node-wide? Or cluster wide? An event whenever a document is added to an index, cluster wide. You mention Redis, for something like publish/subscribe pattern, you'd have to use a persistent

Re: Boost the first word in a multi-word query

2014-08-25 Thread Jérémy
Thank you so much for the warning, I was about to make that mistake ;-) On Mon, Aug 25, 2014 at 5:23 AM, vineeth mohan vm.vineethmo...@gmail.com wrote: Hello Jeremy, Just a word of caution. It's not recommended to expose query_string to end users. Rather another version of it used instead -

Curator and disable shard allocation

2014-08-25 Thread Klaus Kleber
Hey Guys, We have set up an ES cluster on a very powerful machine, where 1 node has the SSD assigned and the other nodes the slow HDDs. We want to reallocate the indices to the HDDs after 7 days, which is very easy using the curator tool. Logstash is configured to send the indexed events to

How to get the field information when _all and _source are disabled

2014-08-25 Thread Wang Mingxing
Hi, I created an index named test_all, and it has a table: type1. I want to test the usage of _all and _source. Now I have changed their status to false. The mapping is as follows: $ curl -XGET 'localhost:9200/test_all/_mapping/type1?pretty' { test_all : { mappings : { type1 : { _all : {

Re: Error running ES DSL in hadoop mapreduce

2014-08-25 Thread Adrien Grand
Thanks Sona, This stack trace indicates a bug in the cardinality aggregation. I just opened an issue for it: https://github.com/elasticsearch/elasticsearch/issues/7429 In order to help me understand/reproduce this bug, could you please provide the mappings of your ExamRowKey and body_part

Re: Need some advice to build a log central.

2014-08-25 Thread Sang Dang
Hi Vineeth Mohan, My log central will contain 2 types of logs: one for debugging/monitoring, the other for stats. I have 2 ways to achieve it: #1, I use only ES, which is fine for debug/monitor logs (using kibana). To do stats, I will build some extra API (based on filter/facet/aggregation...)

Re: DOS attack Elasticsearch with Mappings

2014-08-25 Thread Adrien Grand
Hi Joshua, Was the issue tied to the byte size of the mappings or the fact that they contained lots of fields? I'm asking because there was a performance inefficiency in versions 1.3.0 that caused every field introduction to perform in quadratic time[1]. It probably doesn't solve your problem

Re: Boost the first word in a multi-word query

2014-08-25 Thread Jérémy
Hmm, I didn't notice the change of behavior of the + sign. I prefer how query string handles that. Is there a way to have a "must be present" operator for simple string query? Cheers, Jeremy On Mon, Aug 25, 2014 at 9:33 AM, Jérémy mer...@gmail.com wrote: Thank you so much for the warning, I

Re: How to index Office files? *.txt and *.pdf are working...

2014-08-25 Thread David Pilato
From my experience, this should work. Indexing Word docs should work as Tika supports office docs. Not sure what you are doing wrong. Try to send a match all query and ask for field file.file. Also, you could set mapper plugin to TRACE mode in logging.yml and see if it tells you something

Re: Query pre-processing before execution?

2014-08-25 Thread Pawel
Hi Joerg, You are right about the analyzer. I also have a slightly different case, or maybe I missed something (and the analyzer approach can also handle my case). I'd like to process a query and add an additional filter to each query. To build this filter, an external service should be queried to fetch additional

Re: Sustainable way to regularly purge deleted docs

2014-08-25 Thread Adrien Grand
I left some comments inline: On Sat, Aug 23, 2014 at 5:08 PM, Jonathan Foy the...@gmail.com wrote: I was a bit surprised to see the number of deleted docs grow so large, but I won't rule out my having something setup wrong. Non-default merge settings are below, by all means let me know if

Re: Query pre-processing before execution?

2014-08-25 Thread joergpra...@gmail.com
I do not fully understand what an external filter service is but I remember such a question before. It does not matter where the filter terms come from, you can set up your application, and add filter terms at ES language level from there. This is the most flexible and scalable approach. It is

Re: Error running ES DSL in hadoop mapreduce

2014-08-25 Thread Sona Samad
Thanks Adrien. The ExamRowKey and body_part are Strings uploaded from a csv file using LogStash to ElasticSearch. - how reproducible is it? Ie. if you run this query 10 times, how many of these queries will write such lines to the logs? This query returns the error each time it's run in the

Re: Elastic search dynamic number of replicas from Java API

2014-08-25 Thread joergpra...@gmail.com
An example of a server-side cluster state listener is in JDBC river plugin https://github.com/jprante/elasticsearch-river-jdbc/blob/master/src/main/java/org/xbib/elasticsearch/action/river/jdbc/state/RiverStateService.java I use it to augment the cluster state with river state info. Jörg On

Re: Error running ES DSL in hadoop mapreduce

2014-08-25 Thread Sona Samad
One more update on the issue: I tried changing the query using 'sum' { size:0, aggs: { group_by_BodyPart: { terms: { field: body_part, size: 5, order : { examcount : desc } }, aggs : { examcount : { sum : { field : ExamRowKey } }
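The query in the snippet above lost its quotes and is cut off by the archive. As a rough sketch only, assuming that nothing but the closing braces is missing, the same request body can be rebuilt as a Python dictionary and serialized to JSON. The variable name `body` is illustrative; the field and aggregation names are taken from the snippet as shown.

```python
import json

# Request body rebuilt from the truncated snippet; assumes only closing braces were cut off.
body = {
    "size": 0,
    "aggs": {
        "group_by_BodyPart": {
            "terms": {
                "field": "body_part",
                "size": 5,
                # Order the buckets by the value of the sub-aggregation below.
                "order": {"examcount": "desc"},
            },
            "aggs": {
                "examcount": {"sum": {"field": "ExamRowKey"}},
            },
        }
    },
}

# Serialize for use as the body of a _search request.
print(json.dumps(body, indent=2))
```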

Re: Topics/Entities with relevancy scores and searching

2014-08-25 Thread Clinton Gormley
On 24 August 2014 19:46, Scott Decker sc...@publishthis.com wrote: Have you done this? any concerns to performance with this sort of scoring, or, it is just as fast if you were doing base lucene scoring if we override the score function and just use our own? -- we will of course try it and

stuck thread problem?

2014-08-25 Thread Patrick Proniewski
Hello, I've been running an ELK install for a few months now, and a few weeks ago I noticed a strange behavior: ES had some kind of stuck thread consuming 20-70% of a CPU core. It remained unnoticed for days. Then I restarted ES, and it all came back to normal, until it started again 2 weeks later

Re: Query pre-processing before execution?

2014-08-25 Thread Otis Gospodnetic
Hi, On Monday, August 25, 2014 11:40:53 AM UTC+2, Jörg Prante wrote: I do not fully understand what an external filter service is but I remember such a question before. It does not matter where the filter terms come from, you can set up your application, and add filter terms at ES

Re: Completion mapping type throws a misleading error on null value

2014-08-25 Thread Gérald Quintana
Hello, I am experiencing the same problem: I am indexing a field (libelleVoie) with a mapping of type completion, but this field is sometimes null, and then I get this error: [2014-08-25 13:16:49,500][DEBUG][action.bulk ] [gqa-es-node-1] [fiche_immeuble][4] failed to execute bulk

Re: stuck thread problem?

2014-08-25 Thread Mark Walkom
(You should really set Xms and Xmx to be the same.) But it's not faulty, it's probably just GC which should be visible in the logs. How much data do you have in your cluster? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web:

Re: stuck thread problem?

2014-08-25 Thread Patrick Proniewski
On 25 August 2014, at 13:51, Mark Walkom ma...@campaignmonitor.com wrote: (You should really set Xms and Xmx to be the same.) Ok, I'll do this next time I restart. But it's not faulty, it's probably just GC which should be visible in the logs. How much data do you have in your cluster? NAME

Re: Completion mapping type throws a misleading error on null value

2014-08-25 Thread Gérald Quintana
Replacing null with (space) does the job. Gérald On Monday, 25 August 2014 13:51:12 UTC+2, Gérald Quintana wrote: Hello, I am experiencing the same problem: I am indexing a field (libelleVoie) with a mapping of type completion, but this field is sometimes null, and then I get this error:

Re: How to index Office files? *.txt and *.pdf are working...

2014-08-25 Thread Dirk Bauer
Hi David, thx for your help, but it's still not working. What I did: The query { query: { match: { _all: test } } } delivers all my indexed documents (including the *.doc / *.docx files) and I can see the base64 stuff in the file.file field. So this looks good to me. Then I

Re: One large index vs. many smaller indexes

2014-08-25 Thread Chris Neal
Thanks Adrien! Very much appreciate your time and help. Chris On Mon, Aug 25, 2014 at 3:44 AM, Adrien Grand adrien.gr...@elasticsearch.com wrote: I meant tens of shards per node. So if you have N nodes with I indices which have S shards and R replicas, that would be (I * S * (1 + R)) / N.
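The formula quoted above, (I * S * (1 + R)) / N, can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the example cluster sizes are made up, not taken from the thread.

```python
def shards_per_node(num_indices: int, shards_per_index: int,
                    replicas: int, num_nodes: int) -> float:
    """Average number of shard copies per node: (I * S * (1 + R)) / N."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # Each index contributes S primaries plus S * R replica copies.
    return (num_indices * shards_per_index * (1 + replicas)) / num_nodes


# Hypothetical cluster: 30 indices, 5 primary shards each, 1 replica, 10 nodes.
print(shards_per_node(30, 5, 1, 10))  # 30.0
```

With these made-up numbers each node holds 30 shard copies on average, which is in the "tens of shards per node" range mentioned in the quoted reply.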

Node is trying to talk to itself

2014-08-25 Thread Eugene Strokin
Hello, I have an old cluster of ES 0.20.1, and it worked fine until recently one node got disconnected for an unknown reason (probably network failure), and after restart it tries to join the master but sends the request to itself and fails with this message: [2014-08-25 13:59:11,768][INFO ][transport

Re: Parent/Child query performance in version 1.1.2

2014-08-25 Thread Mark Greene
Hi Adrien, Thanks for reaching out. We actually were excited to see the performance improvements stated in the 1.2.0 release notes so we upgraded to 1.3.2. We saw some performance improvement but it wasn't orders of magnitude and queries are still running very slow. We also tried your

When will LogStash exceed the queue capacity and drop messages?

2014-08-25 Thread Shih-Peng Lin
I am using LogStash to collect the logs from my service. The volume of the data is so large (20GB/day) that I am afraid that some of the data will be dropped at peak time. So I asked question

Re: Parent/Child query performance in version 1.1.2

2014-08-25 Thread Clinton Gormley
Something else to note: parent-child now uses global ordinals to make queries 3x faster than they were previously, but global ordinals need to be rebuilt after the index has refreshed (assuming some data has changed). Currently there is no way to refresh p/c global ordinals eagerly (ie during the

Re: Simple howto stunnel for elasticsearch cluster.

2014-08-25 Thread John Smith
And yay native API clients are nodes also, which allows them to become proxies. So then you need to stunnel protect them also. Rinse and repeat lol So... 1- For port 9300 bind to localhost 2- Put stunnel in front of port 9300 and configure all nodes the same way to have cluster node comms in SSL. 3-

Re: Parent/Child query performance in version 1.1.2

2014-08-25 Thread Mark Greene
Hey Clinton, Thanks for the heads up on what's on the horizon. That definitely sounds like a drastic improvement. That being said, my fear here is that even with that improvement, this data model (parent/child) doesn't seem to be that performant with a moderate amount of documents. In order for

Re: inconsistent paging

2014-08-25 Thread Ron Sher
Thanks for the answer and sorry for the duplicate (posted from a different source by mistake) On Monday, August 18, 2014 11:02:47 AM UTC+3, Adrien Grand wrote: Hi Ron, The cause of this issue is that Elasticsearch uses Lucene's internal doc IDs as tie-breakers. Internal doc IDs might be
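To illustrate the tie-breaking problem described in the quoted reply, here is a small self-contained Python model. It does not call Elasticsearch. It assumes two copies of a shard hold the same documents in a different internal order, and that list position stands in for Lucene's internal doc ID. All names (`replica_a`, `replica_b`, `page`) are hypothetical.

```python
# Two copies of the same shard: same documents, different internal order (assumption).
replica_a = [{"id": "a", "score": 1.0}, {"id": "b", "score": 1.0}, {"id": "c", "score": 0.5}]
replica_b = [{"id": "b", "score": 1.0}, {"id": "a", "score": 1.0}, {"id": "c", "score": 0.5}]


def page(docs, size, tie_breaker=False):
    """Return the ids of the top `size` documents, best score first."""
    if tie_breaker:
        # Equal scores are ordered by a unique field, so every copy agrees.
        key = lambda d: (-d["score"], d["id"])
    else:
        # Equal scores keep their list position (stand-in for the internal doc ID).
        key = lambda d: -d["score"]
    return [d["id"] for d in sorted(docs, key=key)[:size]]


print(page(replica_a, 1), page(replica_b, 1))  # ['a'] ['b']
print(page(replica_a, 1, tie_breaker=True), page(replica_b, 1, tie_breaker=True))  # ['a'] ['a']
```

In this model, sorting on the score alone gives a different first page depending on which copy answers, while adding a unique secondary sort key makes the pages agree. Whether that matches the remedy suggested in the truncated reply is not visible in the snippet.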

Re: Json Data not getting parsed when sent to Elasticsearch

2014-08-25 Thread Didjit
bump. Anyone? Thank you, Chris On Sunday, August 24, 2014 10:32:23 AM UTC-4, Didjit wrote: Pretty simple (below). I just added the json codec and tried again and received the same results. Thank you! elasticsearch { host = localhost cluster = cjceswin node_name = cjcnode codec = json

Indexing large number of files each with a huge size

2014-08-25 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi, I am trying to index documents, each file approx ~10-20 MB. I start seeing memory issues if I try to index them all in a multi-threaded environment from a single TransportClient on one machine to a single node cluster with 32GB ES server. It seems like the memory is an issue on the client

Reduce Number of Segments

2014-08-25 Thread Chris Decker
All, I’m looking for advice on how to reduce the number of segments for my indices because in my use case (log analysis), quick searches are more important than real-time access to data. I've turned many of the knobs available within ES, and read many blog postings, ES documentation, etc.,

Re: Indexing large number of files each with a huge size

2014-08-25 Thread joergpra...@gmail.com
Can you show the program you use to index? Before tuning heap sizes or batch sizes, it is good to check if the program works correctly. Jörg On Mon, Aug 25, 2014 at 7:00 PM, 'Sandeep Ramesh Khanzode' via elasticsearch elasticsearch@googlegroups.com wrote: Hi, I am trying to index documents,

set connect_timeout of elasticsearch php client

2014-08-25 Thread Niv Penso
Hey, I want to configure a small timeout between my elasticsearch php client and my elasticsearch server. I tried to pass some parameters to the guzzle client but it seems this doesn't work. Here is the code: $params = array(); $params['hosts'] = $hosts;

TimeZone for logging

2014-08-25 Thread IronMan2014
How do I change logging timestamps to EST? appender: console: type: console layout: type: consolePattern conversionPattern: [%d{ISO8601}][%-5p][%-25c] %m%n file: type: dailyRollingFile file: ${path.logs}/${cluster.name}.log datePattern:

Re: Reduce Number of Segments

2014-08-25 Thread Michael McCandless
Which version of ES are you using? Versions before 1.2 have a bug that caused merge throttling to throttle far more than requested such that you couldn't get any faster than ~8 MB / sec. See https://github.com/elasticsearch/elasticsearch/issues/6018 Tiered merge policy is best. Mike McCandless

Can't open file to read checksums

2014-08-25 Thread Casper Thrane
Hi! We get the following errors, on two of our nodes. And after that our cluster doesn't work. I have no idea what it means. [2014-08-25 17:46:39,323][WARN ][indices.store] [p-elasticlog03] Can't open file to read checksums java.io.FileNotFoundException: No such file

Query Visualizer

2014-08-25 Thread Ryan Henszey
Greetings all, A while back I wrote a query visualizer to help with debugging large programmatically generated queries. Figured I would share it here in case anyone else could benefit from it. It's not so much an app as it is just a page right now. github:

how to disable default-mapping.json for a new index?

2014-08-25 Thread asanderson
I've got a couple dozen or so indexes for which I've defined config/default-mapping.json that includes dynamic_templates and properties which works fine; however, I now have a new index for which I do not want the default-mapping.json to apply. i.e. I just want to use the default

Thousand of shards

2014-08-25 Thread Casper Thrane
Hi! I am new to ES, and the system we are using was set up by an external consultant. The cluster is very unstable. I have tried to run this: -bash-4.1$ curl -XGET 'http://localhost:9200/_cluster/health?pretty=true' { cluster_name : elasticsearch, status : green, timed_out : false,

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-25 Thread tony . aponte
It's as big as my ES_HEAP_SIZE parameter, 30g. Tony On Friday, August 22, 2014 10:37:39 PM UTC-4, Robert Muir wrote: How big is it? Maybe i can have it anyway? I pulled two ancient ultrasparcs out of my closet to try to debug your issue, but unfortunately they are a pita to work with (dead

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-25 Thread tony . aponte
I have no plugins installed (yet) and only changed es.logger.level to DEBUG in logging.yml. elasticsearch.yml: cluster.name: es-AMS1Cluster node.name: KYLIE1 node.rack: amssc2client02 path.data: /export/home/apontet/elasticsearch/data path.work: /export/home/apontet/elasticsearch/work

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-25 Thread tony . aponte
I was able to trim the heap size and, consequently, the core file down to about 530m. Tony On Monday, August 25, 2014 3:41:14 PM UTC-4, tony@iqor.com wrote: It's as big as my ES_HEAP_SIZE parameter, 30g. Tony On Friday, August 22, 2014 10:37:39 PM UTC-4, Robert Muir wrote: How big

Is it possible to register a RestFilter without creating a plugin?

2014-08-25 Thread Jinyuan Zhou
Thanks,

Native client strictly as client.

2014-08-25 Thread John Smith
Using 1.3.2. Just to be sure... If using the Native client APIs... If creating a node client, essentially that client becomes a node in the cluster and you can also proxy through it (as I see in the logs it actually binds 9300 and 9200)? If using the transport client then it's strictly a

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-25 Thread tony . aponte
I captured a WireShark trace of the interaction between ES and Logstash 1.4.1. The error occurs even before my data is sent. Can you try to reproduce it on your testbed with this message I captured? curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y Contents of file 'y': {

Re: Thousand of shards

2014-08-25 Thread Mark Walkom
That sort of shard count is ok on your cluster as you have 17 nodes :) Can you give us more details on what sort of hardware you run on, your java, ES and OS versions and releases? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web:

Shards

2014-08-25 Thread Markus Wiesenbacher
Hi folks, I am using a single-node cluster (v1.3.2) on my PC, and I was wondering why there are always 5 shards in the file system (separate Lucene indices), no matter how many I configure in elasticsearch.yml or programmatically with the Java API (loadFromSource with a JSON string). Do I

Kibana server-side integration with R, Perl, and other tools

2014-08-25 Thread Brian
Is there some existing method to integrate processing between the Kibana/ Elasticsearch response JSON and the graphing? For example, I have a Perl script that can convert an Elasticsearch JSON response into a CSV, even reversing the response to put the oldest event first (for gnuplot

Building an ERP with Elasticsearch. Am I crazy?

2014-08-25 Thread Raphael Waldmann
Hi, First I would like to thank all of you for Elastic. I am thinking of using it in an ERP that I am building. What do you think about this? Am I crazy? Has anyone faced this? I really don't think that I am comfy enough to do this, change the problems that I already know, for new problems that

Re: When will LogStash exceed the queue capacity and drop messages?

2014-08-25 Thread Mark Walkom
You should really ask this on the Logstash list - https://groups.google.com/forum/#!forum/logstash-users Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 26 August 2014 00:49, Shih-Peng Lin shihpeng@gmail.com

Re: Reduce Number of Segments

2014-08-25 Thread Chris Decker
Mike, Thanks for the response. I'm running ES 1.2.1. It appears the issue that you reported / corrected was included with ES 1.2.0. Any other ideas / suggestions? Were the settings that I posted sane? Thanks! Chris On Monday, August 25, 2014 1:52:46 PM UTC-4, Michael McCandless

Re: Need some advice to build a log central.

2014-08-25 Thread Sang Dang
Hello All, I have selected #2 as my solution. I write data to ES, and use kibana+ for realtime monitoring. For stats, I use Hive. For each project, I will create an index, and each type of log will go in an ES type, e.g.: ProjectXlog_debug log_error Stats_API

Re: Building an ERP with Elasticsearch. Am I crazy?

2014-08-25 Thread xiehaiwei
On Tuesday, August 26, 2014 6:46:12 AM UTC+8, Raphael Waldmann wrote: Hi, First I would like to thank all of you for Elastic. I am thinking of using it in an ERP that I am building. What do you think about this? Am I crazy? Has anyone faced this? I really don't think that I am comfy enough

How to get the count of matched search term in all the fields in each resulting document

2014-08-25 Thread Manoj Acharya
Hello, We have a web application, where we are providing a global search feature through Elasticsearch. One can use this feature to search any text across all the documents (and on all fields). We are using '_all' in fields while querying elastic search. This yields us the desired results