Re: A simple example of how to use the Java API?
Did you look at the answers branch? Answers are on the answers branch: https://github.com/elasticsearchfr/hands-on/tree/answers

David

On 24 March 2015 at 12:58, 'newbo' via elasticsearch <elasticsearch@googlegroups.com> wrote:

Hey David, I'm looking at the code now and I see lots of TODOs and nulls, but no actual indexing. Perhaps that code needs to be cleaned up?

On Thursday, November 15, 2012 at 9:11:02 AM UTC-5, David Pilato wrote:

Hi Ryan, have a look at this: https://github.com/elasticsearchfr/hands-on
Answers are on the answers branch: https://github.com/elasticsearchfr/hands-on/tree/answers
Some slides (in French, but you can follow the code) are here: http://www.slideshare.net/mobile/dadoonet/hands-on-lab-elasticsearch
I'm waiting for the ES team to merge my Java tutorial: https://github.com/elasticsearch/elasticsearch.github.com/pull/321
It will look like this: https://github.com/dadoonet/elasticsearch.github.com/blob/9f6dae922247a81f1c72c77769d1dce3ad09dca5/tutorials/_posts/2012-11-07-using-java-api-basics.markdown

HTH

-- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/92c7a0f5-ef47-45c2-b121-8c0302fa8c71%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
ElasticSearch aggregation
Hi All,

I have an index which contains details of job openings, with fields like [create_date, Open_positions, skils_required, project_start_date, project_end_date].

I need to find the top 10 skills in a year, broken down by quarter. A skill is considered among the top if it has a high number of openings in a particular quarter and the number of openings does not decrease in consecutive quarters.

This is really urgent; any help will be highly appreciated.

Regards, Raghav Salotra
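In ES 1.x, the per-quarter part of this can be sketched with a date_histogram bucketed by quarter and a terms sub-aggregation ordered by a sum of openings. This is only a sketch against the field names in the post above (create_date, skils_required, Open_positions), and it assumes skils_required is a not_analyzed string so skills come back as whole terms. The "not decreasing in consecutive quarters" condition has no direct aggregation equivalent and would have to be checked client-side across the returned buckets:

```json
{
  "size": 0,
  "aggs": {
    "per_quarter": {
      "date_histogram": {
        "field": "create_date",
        "interval": "quarter"
      },
      "aggs": {
        "top_skills": {
          "terms": {
            "field": "skils_required",
            "size": 10,
            "order": { "total_openings": "desc" }
          },
          "aggs": {
            "total_openings": { "sum": { "field": "Open_positions" } }
          }
        }
      }
    }
  }
}
```

The client then walks the quarterly buckets and keeps only skills whose total_openings is non-decreasing from one quarter to the next.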
Re: encountered monitor.jvm warning
What is the new-gen setting? How much system memory in total? How many CPU cores on the node? What query do you use?

Jason

On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain <huss...@novacom.mygbiz.com> wrote:

Hi all, today we got (for the first time) warning messages which seem to indicate a memory problem:

[2015-03-24 09:08:12,960][WARN ][monitor.jvm] [Danger] [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total [5m]/[16.7m], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:12,960][WARN ][monitor.jvm] [Danger] [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total [18.4s]/[58s], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:15,372][WARN ][monitor.jvm] [Danger] [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], total [1.4s]/[16.7m], memory [3.7gb]->[5gb]/[9.8gb], all_pools {[young] [6.1mb]->[2.7mb]/[1.1gb]}{[survivor] [0b]->[149.7mb]/[149.7mb]}{[old] [3.7gb]->[4.9gb]/[8.5gb]}
[2015-03-24 09:08:18,192][WARN ][monitor.jvm] [Danger] [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], total [1.4s]/[16.7m], memory [5.8gb]->[6.2gb]/[9.8gb], all_pools {[young] [845.4mb]->[1.2mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [4.9gb]->[6gb]/[8.5gb]}
[2015-03-24 09:08:21,506][WARN ][monitor.jvm] [Danger] [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], total [1.2s]/[16.7m], memory [7gb]->[7.3gb]/[9.8gb], all_pools {[young] [848.6mb]->[2.1mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [6gb]->[7.2gb]/[8.5gb]}

We're using ES 1.4.2 as a single-node cluster, ES_HEAP is set to 10g, and other settings are defaults. Previous posts related to this issue suggest the field data cache may be the problem.

Requesting /_nodes/stats/indices/fielddata says:

{
  "cluster_name": "my_cluster",
  "nodes": {
    "ILUggMfTSvix8Kc0nfNVAw": {
      "timestamp": 1427188716203,
      "name": "Danger",
      "transport_address": "inet[/192.168.110.91:9300]",
      "host": "xxx",
      "ip": [ "inet[/192.168.110.91:9300]", "NONE" ],
      "indices": {
        "fielddata": {
          "memory_size_in_bytes": 6484,
          "evictions": 0
        }
      }
    }
  }
}

Running top results in:

PID    USER  PR  NI  VIRT   RES  SHR  S  %CPU  %MEM  TIME+    COMMAND
12735  root  20  0   15.8g  10g  0    S  74    13.2  2485:26  java

Any ideas what to do? If possible I would rather avoid increasing ES_HEAP, as there isn't much free memory left on the host.

Regards, Abid
Re: corrupted shard after optimize
Thanks for the CheckIndex info, that worked! It looks like only one of the segments in that shard has issues:

1 of 20: name=_1om docCount=216683
  codec=Lucene3x
  compound=false
  numFiles=10
  size (MB)=5,111.421
  diagnostics = {os=Linux, os.version=3.5.7, mergeFactor=7, source=merge, lucene.version=3.6.0 1310449 - rmuir - 2012-04-06 11:31:16, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.6.0_26, java.vendor=Sun Microsystems Inc.}
  no deletions
  test: open reader.........OK
  test: check integrity.....OK
  test: check live docs.....OK
  test: fields..............OK [31 fields]
  test: field norms.........OK [20 fields]
  test: terms, freq, prox...ERROR: java.lang.AssertionError: index=216690, numBits=216683
java.lang.AssertionError: index=216690, numBits=216683
  at org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:252)
  at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:932)
  at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1325)
  at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:631)
  at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)
  test: stored fields.......OK [3033562 total field count; avg 14 fields per doc]
  test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
  test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
FAILED
  WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
  at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:646)
  at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)

This is on ES 1.3.4, but the index I was running optimize on was likely created back in 0.9 or 1.0.

On Tuesday, March 24, 2015 at 5:27:04 AM UTC-4, Michael McCandless wrote:

Hmm, not good. Which version of ES? Do you have a full stack trace for the exception?

To run CheckIndex you need to add all ES jars to the classpath. It's easiest to just use a wildcard for this, e.g.:

java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex ...

Make sure you have the double quotes so the shell does not expand that wildcard!

Mike McCandless

On Mon, Mar 23, 2015 at 9:50 PM, mjd...@gmail.com wrote:

I did an optimize on this index and it looks like it caused a shard to become corrupted. Or maybe the optimize just brought existing corruption to light? On the node that reported the corrupted shard I tried shutting it down, moving the shard out, and then restarting. Unfortunately the next node that got that shard then started with the same corruption issues. The errors:

Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][indices.cluster] [Meteorite II] [1-2013][0] failed to start shard
Mar 24 01:40:17 localhost org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [1-2013][0] failed to fetch index version after copying it over
Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][cluster.action.shard] [Meteorite II] [1-2013][0] sending failed shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[1-2013][0] Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: CorruptIndexException[docs out of order (1493829 <= 1493874) (docOut: org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]

I tried using CheckIndex, but had this issue:

java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Pulsing41, SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41, FST41, FSTOrd41, Lucene40, Lucene41]

when running with:

java -cp /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex

I'm not a Java programmer, so after I tried other classpath combinations I was out of ideas. Any tips? Looking at _cat/shards, the replica is currently marked unassigned while the primary is initializing. Thanks!
Re: PUT gzipped data into elasticsearch
Are you using the Java API?

Jörg

On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus <matusmar...@gmail.com> wrote:

Hi, some of our documents are big (1-10 MB), and when there are millions of them it causes trouble on our internal network. We would like to compress the data at generation time and send it over the network to Elasticsearch in compressed form. I tried to find a solution, but usually I only found material on retrieving gzipped data. Is there a way to put compressed data into Elasticsearch?

Thanks, Marcel
Install Kibana4 as a plugin
Hi,

Is it possible to install Kibana 4 as a plugin to Elasticsearch 1.5.0? If so, do I need to configure anything afterwards?
Nested list aggregation
I have documents like this in my index:

{
  "time": "2015-03-24T08:24:55.9056988",
  "msg": {
    "corrupted": false,
    "stat": {
      "fileSize": 10186,
      "stages": [
        { "stage": "queued", "duration": 420 },
        { "stage": "validate", "duration": 27 },
        { "stage": "cacheFile", "duration": 87 },
        { "stage": "sendResult", "duration": 1332 }
      ]
    }
  }
}

I'd like to calculate sum(msg.stat.stages.duration) grouped by msg.stat.stages.stage. I tried the following:

{
  "size": 0,
  "aggs": {
    "1": {
      "terms": { "field": "msg.stat.stages.stage" },
      "aggs": {
        "2": {
          "nested": { "path": "stat.stages" },
          "aggs": {
            "3": { "sum": { "field": "stat.stages.duration" } }
          }
        }
      }
    }
  },
  "query": { "match_all": {} }
}

and got:

{
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 1, "max_score": 0, "hits": [] },
  "aggregations": {
    "1": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "2": { "doc_count": 0 }, "key": "cachefile", "doc_count": 1 },
        { "2": { "doc_count": 0 }, "key": "queued", "doc_count": 1 },
        { "2": { "doc_count": 0 }, "key": "sendresult", "doc_count": 1 },
        { "2": { "doc_count": 0 }, "key": "validate", "doc_count": 1 }
      ]
    }
  }
}

which is not what I expected. Any ideas? Thanks!
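A common cause of empty nested buckets in a request like the one above is that the nested aggregation must wrap the terms and sum aggregations (not sit inside the terms), and the path and field names must be the full path including the msg prefix. Assuming msg.stat.stages is mapped as a nested type, a sketch of a corrected request (aggregation names are arbitrary) might look like:

```json
{
  "size": 0,
  "aggs": {
    "stages": {
      "nested": { "path": "msg.stat.stages" },
      "aggs": {
        "per_stage": {
          "terms": { "field": "msg.stat.stages.stage" },
          "aggs": {
            "total_duration": { "sum": { "field": "msg.stat.stages.duration" } }
          }
        }
      }
    }
  }
}
```

If stages is mapped as a plain object rather than nested, the nested aggregation will also report doc_count 0; in that case the mapping has to be changed and the data reindexed.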
Filter cache and field data caches
Hi All,

We observe relatively high heap usage: 70-75% per node. Our heap size is 32G. Each node has 2 shards (one primary, one replica), and each shard is approximately 43G.

The field data cache is ~22G, with zero evictions. The filter cache is ~28G, with many evictions.

Is it normal to have such large cache sizes? We think it affects our query performance: on an empty cache, query latency is 2-3 minutes for the first run; the second run takes 10 seconds (and after some time it is back to 2 minutes). Are those 2-3 minutes the time Elasticsearch takes to rebuild/refresh the caches? Why is it so long?

Please help us understand how Elasticsearch creates the filter cache and the field data cache: when are these caches populated, and how do they affect query performance?

Thank you in advance, Vladi
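For reference, both caches can be bounded with node-level settings in elasticsearch.yml (static settings in 1.x, so a node restart is needed). The percentages below are illustrative examples, not recommendations for this cluster:

```yaml
# Cap the field data cache (unbounded by default in 1.x; evictions begin once full)
indices.fielddata.cache.size: 30%
# Cap the filter cache (defaults to 10% of the heap)
indices.cache.filter.size: 10%
```

A bounded field data cache trades eviction/reload cost for heap headroom, which is usually the relevant trade-off when first-run queries are slow because caches are being rebuilt.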
Re: PUT gzipped data into elasticsearch
Hi Jörg,

No, we post data into Elasticsearch using Logstash. Actually, the complete path is: APP -> Logstash -> RMQ -> Logstash -> ES. And when I include the F5 VIPs, there are 6 services on the way processing these data...

Marcel

On Tue, Mar 24, 2015 at 1:50 PM, joergpra...@gmail.com wrote:

Are you using the Java API?

Jörg

On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus <matusmar...@gmail.com> wrote:

Hi, some of our documents are big (1-10 MB), and when there are millions of them it causes trouble on our internal network. We would like to compress the data at generation time and send it over the network to Elasticsearch in compressed form. I tried to find a solution, but usually I only found material on retrieving gzipped data. Is there a way to put compressed data into Elasticsearch?

Thanks, Marcel

-- Marcel Matus
Query containing multiple fields
Hi,

This is part of a document's source in my Elasticsearch index:

"_source": {
  "InvoiceId": "a",
  "InvoiceNum": "b",
  "LastUpdatedBy": "FINUSER1",
  "SetOfBooksId": 1,
  "LastUpdateDate": "2015-03-20 18:31:16.592",
  "VendorId": 5,
  "InvoiceAmount": 1200,
  "InvoiceCurrencyCode": "USD",
  "AmountPaid": "NotFound",
  "PaymentCurrencyCode": "USD"
}

I need to query on multiple inputs using the Java API; for example, say I have provided InvoiceId and InvoiceNum as "a" and "b". How can I query using these multiple inputs? I want a single response. One thing I came across was:

MultiSearchResponse sr = client.prepareMultiSearch().add(srb1).add(srb2).execute().actionGet();

where

SearchRequestBuilder srb1 = client.prepareSearch("dbdata2").setTypes("invoices2").setQuery(QueryBuilders.matchQuery("InvoiceId", "a"));
SearchRequestBuilder srb2 = client.prepareSearch("dbdata2").setTypes("invoices2").setQuery(QueryBuilders.matchQuery("InvoiceNum", "b"));

but these return individual results. Is there any way I can get a single output?
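A single request matching both fields can be expressed as one bool query instead of a multi-search. In the Java API this would be roughly QueryBuilders.boolQuery().must(QueryBuilders.matchQuery("InvoiceId", "a")).must(QueryBuilders.matchQuery("InvoiceNum", "b")), passed to a single prepareSearch(...).setQuery(...) call. As a sketch, the equivalent query DSL:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "InvoiceId": "a" } },
        { "match": { "InvoiceNum": "b" } }
      ]
    }
  }
}
```

This returns one SearchResponse whose hits satisfy both conditions, rather than two independent result sets.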
Re: A simple example of how to use the Java API?
Hey David, I'm looking at the code now and I see lots of TODOs and nulls, but no actual indexing. Perhaps that code needs to be cleaned up?

On Thursday, November 15, 2012 at 9:11:02 AM UTC-5, David Pilato wrote:

Hi Ryan, have a look at this: https://github.com/elasticsearchfr/hands-on
Answers are on the answers branch: https://github.com/elasticsearchfr/hands-on/tree/answers
Some slides (in French, but you can follow the code) are here: http://www.slideshare.net/mobile/dadoonet/hands-on-lab-elasticsearch
I'm waiting for the ES team to merge my Java tutorial: https://github.com/elasticsearch/elasticsearch.github.com/pull/321
It will look like this: https://github.com/dadoonet/elasticsearch.github.com/blob/9f6dae922247a81f1c72c77769d1dce3ad09dca5/tutorials/_posts/2012-11-07-using-java-api-basics.markdown

HTH

-- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Re: Change/update of the datatype of latitude and longitude from double to geo_points
1. You cannot "update" an existing field and transform it from double to geo_point.
2. You want coordinates to be your geo_point, not latitude and longitude each being a geo_point of its own.
3. Your put-mapping request is incorrect. Have a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html — the type should be part of your JSON content.

HTH

-- David Pilato - Developer | Evangelist, elastic.co
@dadoonet | @elasticsearchfr | @scrutmydocs

On 24 March 2015 at 10:39, nidhi chawhan <nidhi.chaw...@gmail.com> wrote:

Hi Team,

I want to update the datatype of the latitude and longitude fields from double to geo_point.

Index name: visit.hippo_en
Type: visit:PlaceDocument

PUT /visit.hippo_en/visit:PlaceDocument/_mapping
{
  "location": {
    "properties": {
      "coordinates": {
        "properties": {
          "coordinates": {
            "properties": {
              "latitude": { "type": "geo_point" },
              "longitude": { "type": "geo_point" }
            }
          }
        }
      }
    }
  }
}

I then get the following error:

{
  "error": "MapperParsingException[Root type mapping not empty after parsing! Remaining fields: [location : {properties={coordinates={properties={coordinates={properties={latitude={type=geo_point}, longitude={type=geo_point}}}}}}}]]",
  "status": 400
}

When trying to update through the Java API, below is the code:

CreateIndex createIndex = new CreateIndex.Builder(visitHippoIndexName + UNDERSCORE + lang).build();
jestClient.execute(createIndex);

String geoPointMapping = "\n" +
    "\"location\": {\n" +
    "  \"properties\": {\n" +
    "    \"coordinates\": {\n" +
    "      \"properties\": {\n" +
    "        \"coordinates\": {\n" +
    "          \"properties\": {\n" +
    "            \"latitude\": { \"type\": \"geo_point\" },\n" +
    "            \"longitude\": { \"type\": \"geo_point\" }\n" +
    "          }\n" +
    "        }\n" +
    "      }\n" +
    "    }\n" +
    "  }\n" +
    "}";

if (nodeType.equals("visit:PlaceDocument")) {
    PutMapping keyMapping = new PutMapping.Builder(visitHippoIndexName + UNDERSCORE + lang, nodeType, geoPointMapping).build();
    jestClient.execute(keyMapping);
}

Index index = new Index.Builder(source).index(visitHippoIndexName + UNDERSCORE + lang).type(nodeType).id(handleNode.getIdentifier()).build();
jestClient.execute(index);

It gives me the below exception:

[2015-03-24 10:27:13,787][DEBUG][action.admin.indices.mapping.put] [Carnivore] failed to put mappings on indices [[visit.hippo_en]], type [visit:PlaceDocument]
org.elasticsearch.index.mapper.MapperParsingException: malformed mapping no root object found
  at org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:338)
  at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:183)
  at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:444)
  at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:505)
  at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
  at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
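To illustrate the points above: since an existing double field cannot be changed in place, the usual route is to create a new index (the exact nesting below is a guess at the intended document shape, and the index name is whatever new name you choose) whose mapping declares coordinates as a single geo_point, then reindex the documents with values shaped like {"lat": ..., "lon": ...}. A sketch of such a mapping body, sent with the create-index request:

```json
{
  "mappings": {
    "visit:PlaceDocument": {
      "properties": {
        "location": {
          "properties": {
            "coordinates": { "type": "geo_point" }
          }
        }
      }
    }
  }
}
```

Note the type name (visit:PlaceDocument) wraps the properties here, which is what the "root object" error message is complaining about in the put-mapping attempt.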
PUT gzipped data into elasticsearch
Hi, some of our documents are big (1-10 MB), and when there are millions of them it causes trouble on our internal network. We would like to compress the data at generation time and send it over the network to Elasticsearch in compressed form. I tried to find a solution, but usually I only found material on retrieving gzipped data. Is there a way to put compressed data into Elasticsearch?

Thanks, Marcel
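Two compression settings may be relevant here, both offered as a sketch to verify against your ES version rather than a confirmed answer: transport-layer traffic (node-to-node and Java transport client) can be compressed with transport.tcp.compress, and the HTTP module has an http.compression setting (whether it also decompresses gzipped request bodies depends on the version, so test it):

```yaml
# elasticsearch.yml — illustrative values, verify against your ES version
transport.tcp.compress: true   # compress transport-layer traffic
http.compression: true         # enable compression support on the HTTP layer
```

If the HTTP layer does accept it, a client would send its bulk body gzipped with a Content-Encoding: gzip header.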
Re: Bulk UDP deprecated?
See https://github.com/elastic/elasticsearch/pull/7595

This feature is rarely used. Removing it will help reduce the moving parts of Elasticsearch and focus on the core. If there is demand, I can jump in and move the bulk UDP code to a community-supported plugin for ES 2.0.

For syslog, I have implemented a similar plugin: https://github.com/jprante/elasticsearch-syslog

Jörg

On Tue, Mar 24, 2015 at 12:14 PM, Steve Freegard <s...@fsl.com> wrote:

Hi all, can anyone tell me why the Bulk UDP service is being deprecated? I'm using it currently as it's a handy way to send non-critical data to ES without the associated overheads of doing it via the standard API and without affecting the performance of the application sending the data (it's essentially fire-and-forget).

Kind regards, Steve.
Re: corrupted shard after optimize
Hmm, not good. Which version of ES? Do you have a full stack trace for the exception?

To run CheckIndex you need to add all ES jars to the classpath. It's easiest to just use a wildcard for this, e.g.:

java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex ...

Make sure you have the double quotes so the shell does not expand that wildcard!

Mike McCandless

On Mon, Mar 23, 2015 at 9:50 PM, mjdu...@gmail.com wrote:

I did an optimize on this index and it looks like it caused a shard to become corrupted. Or maybe the optimize just brought existing corruption to light? On the node that reported the corrupted shard I tried shutting it down, moving the shard out, and then restarting. Unfortunately the next node that got that shard then started with the same corruption issues. The errors:

Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][indices.cluster] [Meteorite II] [1-2013][0] failed to start shard
Mar 24 01:40:17 localhost org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [1-2013][0] failed to fetch index version after copying it over
Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][cluster.action.shard] [Meteorite II] [1-2013][0] sending failed shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[1-2013][0] Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: CorruptIndexException[docs out of order (1493829 <= 1493874) (docOut: org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]

I tried using CheckIndex, but had this issue:

java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Pulsing41, SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41, FST41, FSTOrd41, Lucene40, Lucene41]

when running with:

java -cp /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex

I'm not a Java programmer, so after I tried other classpath combinations I was out of ideas. Any tips? Looking at _cat/shards, the replica is currently marked unassigned while the primary is initializing. Thanks!
encountered monitor.jvm warning
Hi all, today we got (for the first time) warning messages which seem to indicate a memory problem:

[2015-03-24 09:08:12,960][WARN ][monitor.jvm] [Danger] [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total [5m]/[16.7m], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:12,960][WARN ][monitor.jvm] [Danger] [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total [18.4s]/[58s], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:15,372][WARN ][monitor.jvm] [Danger] [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], total [1.4s]/[16.7m], memory [3.7gb]->[5gb]/[9.8gb], all_pools {[young] [6.1mb]->[2.7mb]/[1.1gb]}{[survivor] [0b]->[149.7mb]/[149.7mb]}{[old] [3.7gb]->[4.9gb]/[8.5gb]}
[2015-03-24 09:08:18,192][WARN ][monitor.jvm] [Danger] [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], total [1.4s]/[16.7m], memory [5.8gb]->[6.2gb]/[9.8gb], all_pools {[young] [845.4mb]->[1.2mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [4.9gb]->[6gb]/[8.5gb]}
[2015-03-24 09:08:21,506][WARN ][monitor.jvm] [Danger] [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], total [1.2s]/[16.7m], memory [7gb]->[7.3gb]/[9.8gb], all_pools {[young] [848.6mb]->[2.1mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [6gb]->[7.2gb]/[8.5gb]}

We're using ES 1.4.2 as a single-node cluster, ES_HEAP is set to 10g, and other settings are defaults. Previous posts related to this issue suggest the field data cache may be the problem.

Requesting /_nodes/stats/indices/fielddata says:

{
  "cluster_name": "my_cluster",
  "nodes": {
    "ILUggMfTSvix8Kc0nfNVAw": {
      "timestamp": 1427188716203,
      "name": "Danger",
      "transport_address": "inet[/192.168.110.91:9300]",
      "host": "xxx",
      "ip": [ "inet[/192.168.110.91:9300]", "NONE" ],
      "indices": {
        "fielddata": {
          "memory_size_in_bytes": 6484,
          "evictions": 0
        }
      }
    }
  }
}

Running top results in:

PID    USER  PR  NI  VIRT   RES  SHR  S  %CPU  %MEM  TIME+    COMMAND
12735  root  20  0   15.8g  10g  0    S  74    13.2  2485:26  java

Any ideas what to do? If possible I would rather avoid increasing ES_HEAP, as there isn't much free memory left on the host.

Regards, Abid
Change/update of the datatype of latitude and longitude from double to geo_points
Hi Team, I want to update the datatype of the latitude and longitude fields from double to geo_point. Index name: visit.hippo_en. Type: visit:PlaceDocument.

PUT /visit.hippo_en/visit:PlaceDocument/_mapping
{
  "location": {
    "properties": {
      "coordinates": {
        "properties": {
          "coordinates": {
            "properties": {
              "latitude": { "type": "geo_point" },
              "longitude": { "type": "geo_point" }
            }
          }
        }
      }
    }
  }
}

I then get this error:

{
  "error": "MapperParsingException[Root type mapping not empty after parsing! Remaining fields: [location : {properties={coordinates={properties={coordinates={properties={latitude={type=geo_point}, longitude={type=geo_point}}}]]",
  "status": 400
}

When trying to update through the Java API, below is the code:

CreateIndex createIndex = new CreateIndex.Builder(visitHippoIndexName + UNDERSCORE + lang).build();
jestClient.execute(createIndex);
String geoPointMapping = "{\n" +
    "  \"location\": {\n" +
    "    \"properties\": {\n" +
    "      \"coordinates\": {\n" +
    "        \"properties\": {\n" +
    "          \"coordinates\": {\n" +
    "            \"properties\": {\n" +
    "              \"latitude\": {\n" +
    "                \"type\": \"geo_point\"\n" +
    "              },\n" +
    "              \"longitude\": {\n" +
    "                \"type\": \"geo_point\"\n" +
    "              }\n" +
    "            }\n" +
    "          }\n" +
    "        }\n" +
    "      }\n" +
    "    }\n" +
    "  }\n" +
    "}";
if (nodeType.equals("visit:PlaceDocument")) {
    PutMapping keyMapping = new PutMapping.Builder(visitHippoIndexName + UNDERSCORE + lang, nodeType, geoPointMapping).build();
    jestClient.execute(keyMapping);
}
Index index = new Index.Builder(source).index(visitHippoIndexName + UNDERSCORE + lang).type(nodeType).id(handleNode.getIdentifier()).build();
jestClient.execute(index);

It gives me this exception:

[2015-03-24 10:27:13,787][DEBUG][action.admin.indices.mapping.put] [Carnivore] failed to put mappings on indices [[visit.hippo_en]], type [visit:PlaceDocument]
org.elasticsearch.index.mapper.MapperParsingException: malformed mapping no root object found
at org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:338)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:183)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:444)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:505)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

I am giving the root object, which is location in this case, but I still get the error. Please advise how to update the double fields to geo_point. Thanks, Nidhi
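For what it's worth, the put-mapping API expects the type name as the root object of the body, which is what the "no root object found" message points at. A sketch of how the body might look wrapped in the type name (a sketch only: in Elasticsearch, geo_point is a single field that takes lat/lon together rather than two separate geo_point fields, and an existing double field cannot be changed in place, so a new field plus a reindex is normally required):

```json
{
  "visit:PlaceDocument": {
    "properties": {
      "location": {
        "properties": {
          "coordinates": {
            "properties": {
              "coordinates": { "type": "geo_point" }
            }
          }
        }
      }
    }
  }
}
```

sent as the body of PUT /visit.hippo_en/visit:PlaceDocument/_mapping.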
Re: Unable to preserve special characters in search results of ElasticSearch.
If I escape that character '/', I get data irrelevant to the search term. E.g., if a/b, a/c, a/d, a(b), a(c), a-b, ab is my data, my requirement is: if I enter the term a(, then I have to get only a( as a result, not ab, a-b, etc. I would like to get the results without escaping; the result needs to preserve the special character '/'. Any query relevant to that would be helpful. The search term needs to be searched across different fields, but it does not need to match in all fields; a match in any one field is sufficient. On Monday, March 23, 2015 at 7:50:08 PM UTC+5:30, Periyandavar wrote: Hi, I think you need to escape those special chars in the search string, like: { "query": { "bool": { "must": [ { "query_string": { "fields": [ "msg" ], "query": "a\\/" } } ] } } } -- View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Unable-to-preserve-special-characters-in-search-results-of-ElasticSearch-tp4072409p4072418.html Sent from the ElasticSearch Users mailing list archive at Nabble.com.
Bulk UDP deprecated?
Hi all, Can anyone tell me why the Bulk UDP service is being deprecated? I'm using it currently as it's a handy way to send non-critical data to ES without the associated overheads of doing it via the standard API and without affecting the performance of the application sending the data (as it's essentially fire-and-forget). Kind regards, Steve.
Query String query is not accepting the analyzer defined in settings
Hi Team, I have the following settings for an index facebook:

PUT /facebook
{
  "settings": {
    "analysis": {
      "analyzer": {
        "wordAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [ "word_delimiter_for_phone" ]
        }
      },
      "filter": {
        "word_delimiter_for_phone": {
          "type": "word_delimiter",
          "catenate_all": true,
          "generate_number_parts": false,
          "split_on_case_change": false,
          "generate_word_parts": false,
          "split_on_numerics": false,
          "preserve_original": true
        }
      }
    }
  },
  "mappings": {
    "face": {
      "properties": {
        "id": { "type": "string" },
        "name": { "type": "string" }
      }
    }
  }
}

I would like to retrieve the data while preserving special characters in the search term as-is (characters like -, /, (, )); for this I have used the above analyzer and filter settings. When I use the query:

GET facebook/face/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [ "id", "name" ],
            "query": "a/",
            "analyzer": "wordAnalyzer"
          }
        }
      ]
    }
  }
}

I get the error: SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[0IYtu1aLQ2qxfwLnbCFTHg][facebook][0]: SearchParseException[[facebook][0]: from[-1],size[-1]: Parse Failure [Failed to parse source ... How can I preserve the special characters while searching? Do I need to change my query, or is there a query to search the term across different fields while preserving the special characters?
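One quick way to check whether the custom analyzer was actually registered on the index is the _analyze API (a sketch; if the analyzer does not exist, for example because the analysis settings never took effect, this call fails instead of returning tokens):

```
GET /facebook/_analyze?analyzer=wordAnalyzer&text=a/
```

If this returns a token that still contains the /, the analyzer side is fine and the problem is in the query JSON itself.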
question for tokenizer and synonym
I have a requirement like the following: a search for chevy should map to a search for chevrolet. I am using synonyms to accomplish this. A search for BMW 6 series should map to 61i, 62i, 63i and so on; I am planning to use synonyms to accomplish this too. The field that contains chevrolet, 61i, 62i, or 63i has this index analyzer:

"synonym_analyzer": {
  "type": "custom",
  "tokenizer": "standard",
  "filter": [ "lowercase", "synonym_filter" ]
}

My issue with the standard tokenizer above is that I cannot map BMW 6 series to 61i, 62i, 63i. I cannot use the keyword tokenizer either, as the search could be chevy red or BMW 6 series v8. I am open to changing tokenizers and other elements if required. Please advise. Thanks, Prateek
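Multi-token rules are allowed in a synonym filter: the filter matches against the token sequence produced by the tokenizer, so "bmw 6 series" can still be matched even with the standard tokenizer. A sketch of explicit mapping rules covering both cases (the rules below are illustrative and assume the lowercase filter runs before the synonym filter, as in the analyzer above):

```json
"filter": {
  "synonym_filter": {
    "type": "synonym",
    "synonyms": [
      "chevy => chevrolet",
      "bmw 6 series => 61i, 62i, 63i"
    ]
  }
}
```

Whether this behaves as expected at query time is worth verifying with the _analyze API, since some query parsers split the input on whitespace before analysis.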
Help with mapping update
Greetings, I am trying to update a mapping to make certain fields not_analyzed. The index gets created from the default template; now I'm trying to add the following mapping to it (it's long, so abbreviated below) and get an error. The index is empty. I'm probably missing something very obvious. Thank you, David

$ curl -XPUT localhost:9200/backend/mytype/_mapping -d mytype_mapping.json
{"error":"RemoteTransportException[[ / Luis Buñuel][inet[/10.XX.XX.XX:9333]][indices:admin/mapping/put]]; nested: NullPointerException; ","status":500}

Updated mapping:

{
  "mappings": {
    "mytype": {
      "_ttl": { "enabled": true, "default": 17280 },
      "properties": {
        "@timestamp": { "type": "date", "format": "dateOptionalTime" },
        "@version": { "type": "string" },
        "file": { "type": "string" },
        "host": { "type": "string" },
        "json": {
          "properties": {
            ".buffer.size": { "type": "long" },
            "action": { "type": "string", "index": "not_analyzed" },
            [ ... ]
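Two things in the command above may be worth double-checking (a sketch, not a confirmed diagnosis): curl only reads a file as the request body when the filename is prefixed with @, otherwise it sends the literal string mytype_mapping.json as the body; and the _mapping endpoint expects the type name as the root key, without the surrounding "mappings" wrapper used at index-creation time:

```sh
curl -XPUT localhost:9200/backend/mytype/_mapping -d @mytype_mapping.json
```

with mytype_mapping.json starting at { "mytype" : { "_ttl" : ... rather than { "mappings" : ...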
Re: Help with mapping update
Try deleting the index and recreating it using the updated mapping? Also, as a suggestion, avoid using TTL; it's very resource intensive, and if you are using time based data then you should use time based indices. On 25 March 2015 at 08:28, David Kleiner david.klei...@gmail.com wrote: Greetings, I am trying to update mapping to make certain fields not_analyzed. [...]
Re: ElasticSearch data directory on Windows 7
I understand the links are valid, but that doesn't explain the directory size. How can 2.2 GB of data shrink to 77 KB? (This led me to believe that ES is using some other location for its data.) On Tuesday, 24 March 2015 18:35:30 UTC+11, Mark Walkom wrote: Actually, as Windows users don't have an installer they need to use the zip, so the paths in the link are valid. On 24 March 2015 at 17:38, Ravi Gupta gupta...@gmail.com wrote: @Mark, the link you posted is for Linux paths. I assumed the Windows paths would be along similar lines and expected index data to be at: C:\ES home\data\elasticsearch\nodes\0\indices\idd The path looks like it contains all the index data, but I am puzzled why the size of the data directory is only 77 KB. There is a file of size 2 KB and it seems to contain some data that looks encoded (it appears to be my application data): C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state My question is how 2.2 GB of data can be compressed into 2 KB, unless I am looking at the wrong directory. On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote: Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com wrote: Hi, I have downloaded and installed ES on Windows 7 and imported 2 million+ records into it (from a CSV of size 2.2 GB). I can search records using Fiddler and it works as expected. I was hoping to see the size of the directory below increase by a few GB, but the size is only 77 KB: C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data This means ES stores data in some other directory on disk. Can you please tell me the directory location where ES stores the data? Regards, Ravi
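Rather than guessing at paths, the effective data location can be read from the node itself; its settings (including path.data, or the default derived from the install location) are exposed through the nodes info API (a sketch):

```
curl localhost:9200/_nodes/settings?pretty
```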
Re: Help with mapping update
Thank you Mark, I'll keep trying; I will also try elasticdump to load the custom mapping, or play with templates. Cheers, David On Tuesday, March 24, 2015 at 3:44:55 PM UTC-7, Mark Walkom wrote: Try deleting the index and recreating using the updated mapping? Also, as a suggestion, avoid using TTL; it's very resource intensive, and if you are using time based data then you should use time based indices. [...]
Re: ES: Ways to work with frequently updated fields of document
Best bet is to try to separate hot and cold/warm data so you are only updating things which *need* to be updated. It may make sense to split this out into different systems, but this is really up to you. On 25 March 2015 at 04:28, Александр Свиридов ooo_satu...@mail.ru wrote: I have a forum, and every topic has a field viewCount: how many times the topic was viewed by forum users. I wanted all fields of a topic to be taken from ES (id, date, title, content and viewCount). However, in that case, after every topic view ES must reindex the entire document again; I asked a question about partial updates on Stack Overflow: http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed It means that if a topic is viewed 1000 times, ES will index it 1000 times, and if I have a lot of users, many documents will be indexed again and again. That is the first strategy. The second strategy, as I see it, is to take some fields of a topic from the index and some from the database; in this case I take viewCount from the DB. But then I could store all fields in the DB and use the index only as an index, to get the IDs of matching topics. Are there better strategies? What is the best way to solve this problem? -- Александр Свиридов
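For the view-counter case specifically, the update API can increment one field server-side instead of resending the whole document from the client (a sketch; the index, type and id are illustrative, and dynamic scripting must be enabled — note that internally Elasticsearch still reindexes the document, which is exactly the cost the Stack Overflow question above discusses):

```json
POST /forum/topic/42/_update
{
  "script": "ctx._source.viewCount += 1"
}
```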
Re: Help with mapping update
I got the index template to work; it turns out it's really straightforward. I now do have .raw fields and all the world is green. I'll add the template to my other index so it gets the geoip mapping when I roll over to April. Cheers, David On Tuesday, March 24, 2015 at 2:28:38 PM UTC-7, David Kleiner wrote: Greetings, I am trying to update mapping to make certain fields not_analyzed. [...]
Re: encountered monitor.jvm warning
A few filters should be fine, but aggregations... hmm. You could take a stack trace and/or heap dump if it happens again and start analyzing from there. Try as well: http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html HTH, jason On Tue, Mar 24, 2015 at 11:59 PM, Abid Hussain huss...@novacom.mygbiz.com wrote: All settings except ES_HEAP (set to 10 GB) are defaults, so I actually am not sure about the new gen setting. The host has 80 GB of memory in total and 24 CPU cores. All ES indices together sum up to ~32 GB, where the biggest indices are ~8 GB each. We mostly use queries together with filters, and also aggregations. We solved the problem with a restart of the cluster. Are there any recommended diagnostics to perform when this problem occurs next time? Regards, Abid On Tuesday, March 24, 2015 at 15:24:43 UTC+1, Jason Wee wrote: What is the new gen setting? How much system memory is there in total? How many CPU cores in the node? What queries do you use? jason On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain hus...@novacom.mygbiz.com wrote: Hi all, today we got (for the first time) warning messages that seem to indicate a memory problem. [...]
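The hot-threads endpoint linked above can be captured while the node is busy, and the circuit-breaker stats give another view of fielddata pressure (a sketch):

```sh
curl localhost:9200/_nodes/hot_threads
curl localhost:9200/_nodes/stats/breaker?pretty
```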
Elastic{ON} Sessions Are Online
Hi All, The content from our recent conference is now available! Please check out https://www.elastic.co/blog/elasticon-sessions-are-online for all the information. Cheers, Mark
Re: corrupted shard after optimize
Quick followup question: is it safe to run -fix while ES is also running on the node? Understanding that some documents will be lost. On Tuesday, March 24, 2015 at 10:24:26 AM UTC-4, mjd...@gmail.com wrote: Thanks for the CheckIndex info, that worked! It looks like only one of the segments in that shard has issues:

1 of 20: name=_1om docCount=216683
codec=Lucene3x
compound=false
numFiles=10
size (MB)=5,111.421
diagnostics = {os=Linux, os.version=3.5.7, mergeFactor=7, source=merge, lucene.version=3.6.0 1310449 - rmuir - 2012-04-06 11:31:16, os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.6.0_26, java.vendor=Sun Microsystems Inc.}
no deletions
test: open reader.........OK
test: check integrity.....OK
test: check live docs.....OK
test: fields..............OK [31 fields]
test: field norms.........OK [20 fields]
test: terms, freq, prox...ERROR: java.lang.AssertionError: index=216690, numBits=216683
java.lang.AssertionError: index=216690, numBits=216683
at org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:252)
at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:932)
at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1325)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:631)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)
test: stored fields.......OK [3033562 total field count; avg 14 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
FAILED
WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:646)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)

This is on ES 1.3.4, but the index I was running optimize on was likely created back in 0.9 or 1.0. On Tuesday, March 24, 2015 at 5:27:04 AM UTC-4, Michael McCandless wrote: Hmm, not good.
Which version of ES? Do you have a full stack trace for the exception? To run CheckIndex you need to add all ES jars to the classpath. It's easiest to just use a wildcard for this, e.g.: java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex ... Make sure you have the double quotes so the shell does not expand that wildcard! Mike McCandless On Mon, Mar 23, 2015 at 9:50 PM, mjd...@gmail.com wrote: I did an optimize on this index and it looks like it caused a shard to become corrupted. Or maybe the optimize just brought the shard corruption to light? On the node that reported the corrupted shard I tried shutting it down, moving the shard out and then restarting. Unfortunately, the next node that got that shard then started with the same corruption issues. The errors: Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][indices.cluster ] [Meteorite II] [1-2013][0] failed to start shard Mar 24 01:40:17 localhost org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [1-2013][0] failed to fetch index version after copying it over Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN ][cluster.action.shard ] [Meteorite II] [1-2013][0] sending failed shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], indexUUID [_na_], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index version after copying it over]; nested: CorruptIndexException[[1-2013][0] Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: CorruptIndexException[docs out of order (1493829 = 1493874 ) (docOut: org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]] I tried using CheckIndex, but had this issue: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist.
You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Pulsing41, SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41, FST41, FSTOrd41, Lucene40, Lucene41] When running with: java -cp /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex I'm not a Java programmer, so after I tried other classpath combinations I was out of ideas. Any tips? Looking at _cat/shards, the replica is currently marked unassigned while the primary is initializing. Thanks!
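Putting Mike's earlier advice together with the error above, the invocation would look something like this (a sketch; the shard path is illustrative, and the quotes around the wildcard matter so that the JVM rather than the shell expands it — also, CheckIndex, and especially -fix, should not be run while Elasticsearch still has the index open):

```sh
java -cp "/usr/share/elasticsearch/lib/*" -ea:org.apache.lucene... \
  org.apache.lucene.index.CheckIndex /path/to/data/.../indices/1-2013/0/index -fix
```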
ActionRequestValidationException when attempting to set mapping
I am attempting to define a mapping. I am first creating a new index by adding a document that doesn't contain anything specified in my mapping/schema. I am using a bulk operation to set the mapping, but I get an ActionRequestValidationException error each time. Here is my JS code:

var mappings = {
  body: [
    { mappings: { component: { properties: {
        classifications: { type: "string", index: "not_analyzed" }
    } } } }
  ]
};
client.bulk(mappings, function (err, resp) {
  // error appears here
});

The exact message I get is: ActionRequestValidationException[Validation Failed: 1: no requests added;] Any help would sure be appreciated. Blake McBride
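For context, "no requests added" means the bulk body contained no action lines at all — a mapping definition is not a bulk request. Mappings go through the put-mapping API, while a bulk body is newline-delimited JSON pairing an action line with a source line. A sketch of the two shapes (index and field names illustrative), shown as Python structures for brevity; with the JS client the first body would go through client.indices.putMapping rather than client.bulk:

```python
import json

# A mapping change is a put-mapping request body, not a bulk request:
put_mapping_body = {
    "component": {
        "properties": {
            "classifications": {"type": "string", "index": "not_analyzed"}
        }
    }
}

# A valid bulk body, by contrast, alternates action/metadata lines with
# source lines, one JSON object per line (NDJSON):
bulk_body = "\n".join([
    json.dumps({"index": {"_index": "myindex", "_type": "component", "_id": "1"}}),
    json.dumps({"classifications": "blue"}),
]) + "\n"  # bulk bodies must end with a newline
```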
Sample custom cluster action
Hello All, I would like to create a custom cluster action (for instance, one for searching documents), but I could not find any sample showing how to implement a cluster action. I looked at the cookbook explanation and sample; it was useful, but I still could not work out how to build one. I would be grateful for a sample or a pointer to an instruction document. Regards, Ali
Should I use elasticsearch as a core for faceted navigation-heavy website?
Hello, I'm evaluating elasticsearch for use in a new project. I can see that it has all the search features we need. The problem is that after reading the documentation and forum I still can't tell whether elastic is a suitable technology for us performance-wise. I'd be very grateful to get your opinion on that. We're building a directory of businesses, similar to Yelp. We have 5m businesses, and the main feature of our site is faceted search on different facets: geography (tens of thousands of geo objects), business type (several thousand options), additional services offered by that business (hundreds) and so on. So for each request (combination of search parameters) we need to get search results, but also the options available in each facet (for example, which business types are located in the selected geography) so the user can narrow down his search (example: http://take.ms/oAZan). Full text search (by business name, for example) is used in a very small percentage of requests; the bulk of requests is exact match on one or several facets. Based on our similar project we expect 1-5m requests per day. All requests are highly diversified: no single page (combination of search params) constitutes more than 0.1% of total requests. We expect to be able to answer a request in 200-300ms, so I guess the request to elasticsearch should take no more than 100ms. On our similar project we use a big lookup table in the database with all possible combinations of params mapped to search result counts. For each request we generate all possible combinations of parameters to refine the current search and then check the lookup table to see if they have any results. My questions are: Is elasticsearch suitable for our purposes? Specifically, are aggregations meant to be used in a large number of low-latency requests, or are they more of an analytical feature, where response time is not that important? 
I ask that because in discussions of aggregation and faceting performance here and elsewhere, response times are mentioned in the 1-10s range, which is OK for analytics and infrequent searches, but obviously not OK for us. How hard is it to get the performance we need: 50 rps, 100ms response time for search+facets, on some reasonable hardware, taking into account the big number of possible facet combinations and high diversification of requests? What kind of hardware should we expect to handle our loads? I understand that these are vague questions, but I just need some approximation. Is it more like 1 server with commodity hardware and simple configuration, or more like a cloud of 10 servers and extensive tuning? For example, our lookup table solution works on 1 commodity server with 16gb of ram with an almost default setup. Thank you for your responses, Dmitry
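For readers wondering what such a request looks like: hits plus per-facet option counts can come back from a single search by attaching terms aggregations to a filtered query. A sketch in the ES 1.x query DSL, with made-up field names, written as a Python dict for illustration:

```python
import json

# One request: filter to the current facet selection, return 20 hits,
# and get the available options (with counts) for each facet via
# terms aggregations. Field names (geo_id, business_type, services)
# are illustrative, not from the original post.
search_body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"geo_id": 12345}},            # selected geography
                        {"term": {"business_type": "dentist"}}  # selected type
                    ]
                }
            }
        }
    },
    "aggs": {
        "business_types": {"terms": {"field": "business_type", "size": 100}},
        "services":       {"terms": {"field": "services", "size": 100}}
    },
    "size": 20
}
payload = json.dumps(search_body)
```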
Re: Grandparents aggregations NullPointer
I've posted the issue here: https://github.com/elastic/elasticsearch/issues/10158 On Wednesday, March 11, 2015 at 4:42:27 PM UTC-4, Paolo Ciccarese wrote: I am using Elasticsearch 1.4.4. I am defining a parent-child index with three levels (grandparents) following the instructions in the document: https://www.elastic.co/guide/en/elasticsearch/guide/current/grandparents.html My structure is Continent - Country - Region.

MAPPING -- I create an index with the following mapping:

curl -XPOST 'localhost:9200/geo' -d '{ "mappings": { "continent": {}, "country": { "_parent": { "type": "continent" } }, "region": { "_parent": { "type": "country" } } } }'

INDEXING -- I index three entities:

curl -XPOST 'localhost:9200/geo/continent/europe' -d '{ "name": "Europe" }'
curl -XPOST 'localhost:9200/geo/country/italy?parent=europe' -d '{ "name": "Italy" }'
curl -XPOST 'localhost:9200/geo/region/lombardy?parent=italy&routing=europe' -d '{ "name": "Lombardia" }'

QUERY THAT WORKS -- If I query and aggregate according to the document, everything works fine:

curl -XGET 'localhost:9200/geo/continent/_search?pretty=true' -d '{ "query": { "has_child": { "type": "country", "query": { "has_child": { "type": "region", "query": { "match": { "name": "Lombardia" } } } } } }, "aggs": { "country": { "terms": { "field": "name" }, "aggs": { "countries": { "children": { "type": "country" }, "aggs": { "country_names": { "terms": { "field": "country.name" } } } } } } } }'

QUERY THAT DOES NOT WORK --- However, if I try multi-level aggregations like in:

curl -XGET 'localhost:9200/geo/continent/_search?pretty=true' -d '{ "query": { "has_child": { "type": "country", "query": { "has_child": { "type": "region", "query": { "match": { "name": "Lombardia" } } } } } }, "aggs": { "continent_names": { "terms": { "field": "name" }, "aggs": { "countries": { "children": { "type": "country" }, "aggs": { "regions": { "children": { "type": "region" }, "aggs": { "region_names": { "terms": { "field": "region.name" } } } } } } } } } }'

I get back the following: { error : SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures 
{[b5CbW5byQdSSW-rIwta0rA][geo][0]: QueryPhaseExecutionException[[geo][0]: query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: NullPointerException; }{[b5CbW5byQdSSW-rIwta0rA][geo][1]: QueryPhaseExecutionException[[geo][1]: query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: NullPointerException; }{[b5CbW5byQdSSW-rIwta0rA][geo][2]: QueryPhaseExecutionException[[geo][2]: query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: NullPointerException; }{[b5CbW5byQdSSW-rIwta0rA][geo][3]: QueryPhaseExecutionException[[geo][3]: query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: NullPointerException; }{[b5CbW5byQdSSW-rIwta0rA][geo][4]: QueryPhaseExecutionException[[geo][4]: query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: NullPointerException; }], status : 500 } The 'grandparents' document says: Querying and aggregating across generations works, as long as you step through each generation. How do I get the last query and aggregation to work across
Re: encountered monitor.jvm warning
All settings except ES_HEAP (set to 10 GB) are defaults, so I actually am not sure about the new gen setting. The host has 80 GB memory in total and 24 CPU cores. All ES indices together sum up to ~32 GB, where the biggest indices are of size ~8 GB. We are using queries mostly together with filters and also aggregations. We solved the problem with a restart of the cluster. Are there any recommended diagnostics to be performed when this problem occurs next time? Regards, Abid On Tuesday, 24 March 2015 at 15:24:43 UTC+1, Jason Wee wrote: what is the new gen setting? how much is the system memory in total? how many cpu cores in the node? what query do you use? jason On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain hus...@novacom.mygbiz.com wrote: Hi all, we today got (for the first time) warning messages which seem to indicate a memory problem:

[2015-03-24 09:08:12,960][WARN ][monitor.jvm ] [Danger] [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total [5m]/[16.7m], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:12,960][WARN ][monitor.jvm ] [Danger] [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total [18.4s]/[58s], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] [853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:15,372][WARN ][monitor.jvm ] [Danger] [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], total [1.4s]/[16.7m], memory [3.7gb]->[5gb]/[9.8gb], all_pools {[young] [6.1mb]->[2.7mb]/[1.1gb]}{[survivor] [0b]->[149.7mb]/[149.7mb]}{[old] [3.7gb]->[4.9gb]/[8.5gb]}
[2015-03-24 09:08:18,192][WARN ][monitor.jvm ] [Danger] [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], total [1.4s]/[16.7m], memory [5.8gb]->[6.2gb]/[9.8gb], all_pools {[young] [845.4mb]->[1.2mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [4.9gb]->[6gb]/[8.5gb]}
[2015-03-24 09:08:21,506][WARN ][monitor.jvm ] [Danger] [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], total [1.2s]/[16.7m], memory [7gb]->[7.3gb]/[9.8gb], all_pools {[young] [848.6mb]->[2.1mb]/[1.1gb]}{[survivor] [149.7mb]->[149.7mb]/[149.7mb]}{[old] [6gb]->[7.2gb]/[8.5gb]}

We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g, other settings are defaults. From previous posts related to this issue, it is said that the field data cache may be a problem. Requesting /_nodes/stats/indices/fielddata says:

{ "cluster_name": "my_cluster", "nodes": { "ILUggMfTSvix8Kc0nfNVAw": { "timestamp": 1427188716203, "name": "Danger", "transport_address": "inet[/192.168.110.91:9300]", "host": "xxx", "ip": [ "inet[/192.168.110.91:9300]", "NONE" ], "indices": { "fielddata": { "memory_size_in_bytes": 6484, "evictions": 0 } } } } }

Running top results in:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12735 root 20 0 15.8g 10g 0 S 74 13.2 2485:26 java

Any ideas what to do? If possible I would rather avoid increasing ES_HEAP as there isn't that much free memory left on the host. Regards, Abid
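One cheap diagnostic for next time is to track the heap figures the monitor.jvm lines already carry — a segment like memory [7.9gb]->[3.7gb]/[9.8gb] is heap before and after the collection over the configured max, and an old pool that stays near its max after collections points at sustained heap pressure. A small sketch of pulling those numbers out of such a log line (the regex assumes the gb/mb units seen in the logs above):

```python
import re

# matches e.g. "memory [7.9gb]->[3.7gb]/[9.8gb]"
_MEM = re.compile(
    r"memory \[([\d.]+)(gb|mb)\]->\[([\d.]+)(gb|mb)\]/\[([\d.]+)(gb|mb)\]")
_UNIT_MB = {"mb": 1.0, "gb": 1024.0}

def heap_transition_mb(log_line):
    """Return (before, after, max) heap in MB from a monitor.jvm GC line,
    or None if the line carries no memory transition."""
    m = _MEM.search(log_line)
    if not m:
        return None
    return tuple(float(m.group(i)) * _UNIT_MB[m.group(i + 1)]
                 for i in (1, 3, 5))
```

Here the fielddata stats were tiny (6484 bytes), so fielddata was probably not the culprit; watching how much the old generation actually shrinks per collection helps narrow down where the pressure comes from.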
Kibana fails in talking to Tribe Node
Hi, I have two Elasticsearch clusters and one tribe node for searching across the two clusters, and a Kibana 4 instance that connects to the tribe node. What I have done: 1. Created .kibana and kibana-int manually in the cluster. 2. Launched Kibana. 3. Went to the "Configure an index pattern" screen and hit the Create button. Kibana loads, then throws a MasterNotDiscoveredException[waited for [30s]] error. It seems to be a known issue. I also tried adding tribe.blocks.write: false to the tribe node config file, without any luck. I am sure saving a dashboard will throw the same error. Can anyone confirm whether Kibana can talk to a tribe node? And has anyone figured out how to do it? Thank you! Abigail
Re: Should I use elasticsearch as a core for faceted navigation-heavy website?
Short answer: yes. With a properly sharded and scaled-out environment, and using ES 1.4 or newer, you should be able to get those numbers. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer & Consultant Lucene.NET committer and PMC member On Tue, Mar 24, 2015 at 5:38 PM, Dmitry dmitry.bit...@gmail.com wrote: [quoted question trimmed]
Re: Help with mapping update
Are you using KB4 or 3? If 4 then you need to refresh the fields under the index settings. On 25 March 2015 at 11:06, David Kleiner david.klei...@gmail.com wrote: [trimmed]
Re: Install Kibana4 as a plugin
No you cannot; it runs on its own port. On 24 March 2015 at 23:56, DK desmond.kirr...@gmail.com wrote: Hi, Is it possible to install Kibana 4 as a plugin to Elasticsearch 1.5.0? If so, do I need to configure anything afterwards?
Re: Is java elasticsearch a joke?
The problem is that in either case, I see no new index or anything. On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote: [trimmed]
Is java elasticsearch a joke?
I am using the elasticsearch Java client, and indexing a document looks simple. I know 100% my cluster is running, and my name is create.

Node node = nodeBuilder().clusterName("mycluster").node();
Client client = node.client();
String json = "{" +
    "\"user\":\"kimchy\"," +
    "\"postDate\":\"2013-01-30\"," +
    "\"message\":\"trying out Elasticsearch\"" +
    "}";
IndexResponse response = client.prepareIndex("twitter", "tweet")
    .setSource(json)
    .execute()
    .actionGet();

I read somewhere that you should shove in .client(true), but I get: org.elasticsearch.discovery.MasterNotDiscoveredException What gives?
Re: Help with mapping update
My confusion just leveled-up :) So the mapping is updated, all string fields now have .raw copies but I don't see them in Kibana. Perhaps my understanding of this mechanism is incomplete and I need to manually copy fields to their .raw counterparts in logstash so that the ES query returns documents with .raw fields in them. And I did remove the TTL, so my cluster is happy again. On Tuesday, March 24, 2015 at 4:36:05 PM UTC-7, David Kleiner wrote: I got the index template to work, turns out it's really straightforward. I now do have .raw fields and all the world is green. I'll add the template to my other index so it gets geoip mapping when I roll to April. Cheers, David [trimmed]
Re: Help with mapping update
Indeed, it just works in 4! Thank you again, David On Tuesday, March 24, 2015 at 5:49:16 PM UTC-7, Mark Walkom wrote: Are you using KB4 or 3? If 4 then you need to refresh the fields under the index settings. [trimmed]
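For anyone following this thread, the .raw sub-fields David mentions come from a dynamic template inside the index template. A sketch modeled on the default Logstash template (verify the details against your Logstash version; the template pattern and 256-character cutoff are the usual defaults, shown here as a Python dict):

```python
# Index template that gives every dynamically mapped string field an
# analyzed main field plus a not_analyzed ".raw" sub-field, so Kibana
# can aggregate on the exact, untokenized value.
raw_template = {
    "template": "logstash-*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [{
                "string_fields": {
                    "match": "*",
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "string",
                        "index": "analyzed",
                        "fields": {
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }]
        }
    }
}
```

Because templates only apply when an index is created, the .raw fields appear on the next index rollover, and Kibana 4 then needs a field refresh under the index settings to pick them up.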
Re: filter bitsets
Thanks for the clarification - this was very helpful. On Friday, March 20, 2015 at 5:45:59 AM UTC-7, Jörg Prante wrote: Caching filters are implemented in ES, not in Lucene. E.g. org.elasticsearch.common.lucene.search.CachedFilter is a class that implements cached filters on the base of the Lucene filter class. The format is not only bitsets. The Lucene filter instance is cached, no matter if it is doc sets or bit sets or whatever. ES code extends Lucene filters with several methods for fast evaluation and traversal. ES evaluates the filter in the given filter chain order, from outer to inner (also called top down). When a series of boolean filters (i.e. should/must/must_not) is used, they can be evaluated efficiently by composition. See org.elasticsearch.common.lucene.search.XBooleanFilter for the composition algorithm. Field data will be loaded when a field is used for operations like filter or sort. The higher the cardinality, the more effort is needed. This is because the index is inverted. Jörg On Fri, Mar 20, 2015 at 3:30 AM, Ashish Mishra laughin...@gmail.com wrote: Not sure I understand the difference between composable vs. cacheable. Can filters be cached without using bitsets? What format are the results stored in, if not as bitsets? In the example below, would the string range field y filter be evaluated on every document in the index, or just on the documents matching the previous field x filter? Also, will y field data be loaded for all documents in the index, or just for the documents matching the previous filter. 
On Thursday, March 19, 2015 at 3:21:12 AM UTC-7, Jörg Prante wrote: There are several concepts: - filter operation (bool, range/geo/script) - filter composition (composable or not; composable means bitsets are used) - filter caching (ES stores filter results or not; if not cached, ES must walk doc-by-doc to apply the filter) #1 says you should take care what kind of inner filter the and/or/not filter uses, and then you should arrange filters in the right order to avoid unnecessary complexity. #2 most of the filters are cacheable, but not by default. These docs try to explain how the and filter consists of inner filter clauses and what happens because default caching is off. I cannot see that this implies bitsets. #3 correct interpretation. The use of bitsets is a pointer to composable filters; the should/must/must_not filters use an internal Lucene bitset implementation for efficient computation. Jörg On Thu, Mar 19, 2015 at 5:58 AM, Ashish Mishra laughin...@gmail.com wrote: I'm trying to optimize filter queries for performance and am slightly confused by the online docs. Looking at: 1) https://www.elastic.co/blog/all-about-elasticsearch-filter-bitsets 2) http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-and-filter.html 3) http://www.elastic.co/guide/en/elasticsearch/guide/current/_filter_order.html #1 says that Bool filter uses bitsets, while And/Or/Not does doc-by-doc matching. #2 says that the And result is optionally cacheable (implying that it uses bitsets). #3 says that Bool does doc-by-doc matching if the inner filters are not cacheable. This is confusing; is there a clear guideline on when bitsets are used? Let's say I have two high-cardinality fields, x and y. Field data for y is loaded into memory, while x is not. What is the optimal way to structure this query? 
"filter": {
  "and": [
    {
      "term": {
        "x": "F828477AF7",
        "_cache": false // don't want to cache since the query will not be repeated
      }
    },
    {
      "range": {
        "y": {
          "gt": "CB70V63BD8AE" // string range query, should only be executed on the result of the previous filters
        }
      }
    }
  ]
}
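Given Jörg's point that the bool filter's should/must/must_not clauses compose via bitsets while and/or/not walk doc-by-doc, the same query can be phrased as a bool filter. A sketch reusing the thread's illustrative field names and values, as a Python dict:

```python
# Same two clauses as the and-filter above, but expressed as a bool
# filter, whose must clauses ES can compose via its internal Lucene
# bitset machinery (XBooleanFilter) rather than doc-by-doc evaluation.
bool_filter = {
    "filter": {
        "bool": {
            "must": [
                # cheap exact-term clause first
                {"term": {"x": "F828477AF7", "_cache": False}},
                # string range clause second
                {"range": {"y": {"gt": "CB70V63BD8AE"}}}
            ]
        }
    }
}
```

Whether this wins in practice depends on whether the inner filters are cacheable, per the thread above, so it is worth benchmarking both forms.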
Re: Is java elasticsearch a joke?
Yes, which is why I was baffled. In fact, the code I used was straight out of that specific page. The client code also came straight from the docs. (http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/client.html): So the code I used from the documentation that appears so basic does not work even though my cluster is running. The docs make it very simple, so I am surprised it does not work out of the box on my local machine with a single node (running on my local machine on OSX installed using brew for latest ES 1.5, the Java Elasticsearch jar is also 1.5: Node node = nodeBuilder().clusterName(yourclustername).node(); // I replaced yourclustername with my cluster name //Node node = nodeBuilder().clusterName(yourclustername).client(true).node(); // this does not work either Client client = node.client(); String json = { + \user\:\kimchy\, + \postDate\:\2013-01-30\, + \message\:\trying out Elasticsearch\ + }; IndexResponse response = client.prepareIndex(twitter, tweet) .setSource(json) .execute() .actionGet(); What is the problem? On Wednesday, March 25, 2015 at 12:34:49 AM UTC-4, Mark Walkom wrote: I don't mean to be rude, but just because it doesn't work doesn't make it a joke. Have you read the docs? http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html On 25 March 2015 at 14:44, Sai Asuka asuka...@gmail.com javascript: wrote: The problem is in either case, I see no new index or anything. On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote: I am using elasticsearch java client, indexing a document looks simple. I know 100% my cluster is running, and my name is create. 
Node node = nodeBuilder().clusterName("mycluster").node();
Client client = node.client();
String json = "{" +
    "\"user\":\"kimchy\"," +
    "\"postDate\":\"2013-01-30\"," +
    "\"message\":\"trying out Elasticsearch\"" +
"}";
IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(json)
        .execute()
        .actionGet();

I read somewhere that you should add .client(true), but then I get: org.elasticsearch.discovery.MasterNotDiscoveredException. What gives?
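To rule out the Java side entirely, the same document can be indexed over HTTP with Sense or curl (a sketch; the index and type names follow the poster's code, and the document id is left for ES to generate):

```
POST /twitter/tweet
{
  "user": "kimchy",
  "postDate": "2013-01-30",
  "message": "trying out Elasticsearch"
}
```

If this succeeds but the Java client still fails with MasterNotDiscoveredException, the problem is most likely discovery between the client node and the cluster rather than the indexing code itself.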
Re[2]: ES: Ways to work with frequently updated fields of document
By different systems do you mean database, ES, etc.? Wednesday, 25 March 2015, 8:32 +11:00, from Mark Walkom markwal...@gmail.com: Your best bet is to separate hot and cold/warm data so you are only updating things which *need* to be updated. It may make sense to split this out into different systems, but this is really up to you. On 25 March 2015 at 04:28, Александр Свиридов ooo_satu...@mail.ru wrote: I have a forum, and every topic has a field viewCount - how many times the topic was viewed by forum users. I wanted all fields of a topic (id, date, title, content and viewCount) to be taken from ES. However, in this case, after every topic view ES must reindex the entire document again - I asked a question about partial updates on Stack Overflow: http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed . It means that if a topic is viewed 1000 times, ES will index it 1000 times, and if I have a lot of users, many documents will be indexed again and again. That is the first strategy. The second strategy, as I see it, is to take some fields of the topic from the index and some from the database; in this case I take viewCount from the DB. But then I could store all fields in the DB and use the index only as an index - to get the ids of the matching topics. Are there better strategies? What is the best way to solve such a problem? -- Александр Свиридов
Re: Is java elasticsearch a joke?
I don't mean to be rude, but just because it doesn't work doesn't make it a joke. Have you read the docs? http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html

On 25 March 2015 at 14:44, Sai Asuka asuka.s...@gmail.com wrote: The problem is in either case, I see no new index or anything.

On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote: I am using the elasticsearch Java client; indexing a document looks simple. I know 100% my cluster is running, and my cluster name is correct.

Node node = nodeBuilder().clusterName("mycluster").node();
Client client = node.client();
String json = "{" +
    "\"user\":\"kimchy\"," +
    "\"postDate\":\"2013-01-30\"," +
    "\"message\":\"trying out Elasticsearch\"" +
"}";
IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(json)
        .execute()
        .actionGet();

I read somewhere that you should add .client(true), but then I get: org.elasticsearch.discovery.MasterNotDiscoveredException. What gives?
Why is the _timestamp field not returned in the format that we set in the mapping?
Hi, I want to make use of the _timestamp field to keep track of the last indexed datetime of a doc in Elasticsearch. I have set the mapping as below:

"mappings": {
  "_default_": {
    "_timestamp": {
      "enabled": true,
      "store": true,
      "format": "yyyy-MM-dd'T'HH:mm:ss.SSS"
    }
  }
}

I have given the _timestamp field a date format (the format line above). However, when I query the doc, the _timestamp field is still returned in Unix-epoch form:

"fields": {
  "_timestamp": 1427177681930
}

*Questions* 1) I am using Elasticsearch version 1.32. Is this a bug? 2) How can we get the _timestamp field back in the format that we want? How do we convert the _timestamp to a readable format? I use Sense to query. Thanks.
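For reference, a stored _timestamp is usually read back through the fields parameter, and in 1.x the stored value is a long, so it comes back as epoch milliseconds - which matches what the poster sees; the mapping's format governs how incoming timestamps are parsed, not how they are rendered in responses. A sketch (index, type, and id names are hypothetical):

```
GET /myindex/mytype/1?fields=_timestamp
```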
How to merge terms and match_all
Hi guys, I'm trying to merge both statements into one search, as detailed below.

*1. query.terms: I get the filter set (A, B, C) from a drop-down list on the webpage*

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            { "terms": { "field1.raw": [ "A", "B", "C" ] } }
          ]
        }
      }
    }
  }
}

*2. query.match: I get the filter data (D E F) from a text box on the webpage (universal search)*

{
  "query": {
    "filtered": {
      "query": {
        "match": { "_all": "D E F" }
      }
    }
  }
}

So I need to query data in ES using the filter information from both the drop-down and the text box at the same time. Is it possible? If you need more information, please let me know. Thanks, Hiko
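One way to combine the two is a single bool query whose must clauses hold both parts, so a document has to satisfy the terms filter and the _all match together (a sketch only, not tested against the poster's mapping):

```json
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            { "terms": { "field1.raw": [ "A", "B", "C" ] } },
            { "match": { "_all": "D E F" } }
          ]
        }
      }
    }
  }
}
```

If either input may be empty, the corresponding clause can simply be omitted when building the request.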
[ANN] Elasticsearch Stempel (Polish) Analysis plugin 2.5.0 released
Heya, We are pleased to announce the release of the Elasticsearch Stempel (Polish) Analysis plugin, version 2.5.0. The Stempel (Polish) Analysis plugin integrates the Lucene stempel (Polish) analysis module into Elasticsearch. https://github.com/elastic/elasticsearch-analysis-stempel/

Release Notes - elasticsearch-analysis-stempel - Version 2.5.0

Update:
* [39] - Update to elasticsearch 1.5.0 (https://github.com/elastic/elasticsearch-analysis-stempel/issues/39)
* [36] - Analysis: Fix using PreBuiltXXXFactory (https://github.com/elastic/elasticsearch-analysis-stempel/pull/36)
* [34] - Tests: upgrade randomizedtesting-runner to 2.1.10 (https://github.com/elastic/elasticsearch-analysis-stempel/issues/34)
* [33] - Tests: index.version.created must be set (https://github.com/elastic/elasticsearch-analysis-stempel/issues/33)

Issues, pull requests and feature requests are warmly welcome on the elasticsearch-analysis-stempel project repository: https://github.com/elastic/elasticsearch-analysis-stempel/ For questions or comments around this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch Enjoy, -The Elasticsearch team
[ANN] Elasticsearch Japanese (kuromoji) Analysis plugin 2.5.0 released
Heya, We are pleased to announce the release of the Elasticsearch Japanese (kuromoji) Analysis plugin, version 2.5.0. The Japanese (kuromoji) Analysis plugin integrates the Lucene kuromoji analysis module into Elasticsearch. https://github.com/elastic/elasticsearch-analysis-kuromoji/

Release Notes - elasticsearch-analysis-kuromoji - Version 2.5.0

Update:
* [56] - Update to elasticsearch 1.5.0 (https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/56)
* [49] - Tests: upgrade randomizedtesting-runner to 2.1.10 (https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/49)
* [47] - Tests: index.version.created must be set (https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/47)
* [34] - Use PreBuiltXXXFactory (https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/34)

Issues, pull requests and feature requests are warmly welcome on the elasticsearch-analysis-kuromoji project repository: https://github.com/elastic/elasticsearch-analysis-kuromoji/ For questions or comments around this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch Enjoy, -The Elasticsearch team
Re: Do I have to explicitly exclude the _all field in queries?
OK, I think I figured this out. It's the space between Brian and Levine. The query

query: "Node.author:Brian Levine"

is actually interpreted as

Node.author:Brian OR Levine

in which case Levine is searched for in the _all field. Seems so obvious now! ;-) -b

On Tuesday, March 24, 2015 at 1:50:42 PM UTC-4, Brian Levine wrote: Hi all, I clearly haven't completely grokked something in how QueryString queries are interpreted. Consider the following query:

{
  "query": {
    "query_string": {
      "analyze_wildcard": true,
      "query": "Node.author:Brian Levine"
    }
  },
  "fields": ["Node.author"],
  "explain": true
}

Note: the Node.author field is not_analyzed. The results from this query include documents for which the Node.author field contains neither Brian nor Levine. In examining the explanation, I found that the documents were included because another field in the document contained Levine. A snippet from the explanation shows that the _all field was considered:

{
  "value": 0.08775233,
  "description": "weight(_all:levine in 464) [PerFieldSimilarity], result of:",
  ...

Do I need to explicitly exclude the _all field in the query? Separate question: because the Node.author field is not_analyzed, I had thought that the value "Brian Levine" would also not be analyzed, and therefore only documents whose Node.author field contained exactly "Brian Levine" would match; yet the explanation shows that the brian and levine tokens were considered. I also noticed that if I change the query to

query: "Node.author:(Brian Levine)"

then the result set changes: only documents whose Node.author field contains either brian OR levine are included (which is what I would have expected), and according to the explanation the _all field is not considered in this query. So I'm confused - clearly I don't understand how my original query is interpreted. Hopefully someone can enlighten me. Thanks. -brian
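Relatedly, the field-scoping issue in the original query can also be avoided by quoting the phrase, which keeps both tokens bound to Node.author instead of letting the second one fall through to the default _all field (a sketch against the poster's field name):

```json
{
  "query": {
    "query_string": {
      "query": "Node.author:\"Brian Levine\""
    }
  }
}
```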
Re: Search templates: partials
Nope. I reached my time limit for trying to get that approach to work and finished the job using one really big template. It's one more thing on my "things to explore again" list. I figured, though, that by postponing it I would at least have the benefit of using a later version of ES when I got back to it. If you're still having issues in 1.4.4, maybe it's time to open an issue.

On Tuesday, March 24, 2015 at 12:46:44 PM UTC-4, Zdeněk Šebl wrote: I have the same troubles now on ES 1.4.4. Did you find a solution for using partials in Mustache file templates? Thanks, Zdenek

On Thursday, 7 August 2014 at 20:53:09 UTC+2, Glen Smith wrote: Developing on v1.2.2, I'm deploying mustache templates by file system under config/scripts. When I do something like this:

parent.mustache:

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {"match_all": {}}
            {{#condition}}
            ,{"match": {"condition": "{{condition}}"}}
            {{/condition}}
          ]
        }
      },
      {{> child}}
    }
  }
}

child.mustache:

"filter": {
  "bool": {
    "must": [
      {"query": {"match_all": {}}}
      {{#status}}
      ,{"query": {"match": {"status": "{{status}}"}}}
      {{/status}}
    ]
  }
}

I get:

[2014-08-07 14:48:58,454][WARN ][script ] [x_node] failed to load/compile script [parent]
org.elasticsearch.common.mustache.MustacheException: Template 'child' not found

I've tried pathing in parent (scripts/child, config/scripts/child) with the same result. Are partials supported?
Re: Search templates: partials
I have the same troubles now on ES 1.4.4. Did you find a solution for using partials in Mustache file templates? Thanks, Zdenek

On Thursday, 7 August 2014 at 20:53:09 UTC+2, Glen Smith wrote: Developing on v1.2.2, I'm deploying mustache templates by file system under config/scripts. When I do something like this:

parent.mustache:

{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {"match_all": {}}
            {{#condition}}
            ,{"match": {"condition": "{{condition}}"}}
            {{/condition}}
          ]
        }
      },
      {{> child}}
    }
  }
}

child.mustache:

"filter": {
  "bool": {
    "must": [
      {"query": {"match_all": {}}}
      {{#status}}
      ,{"query": {"match": {"status": "{{status}}"}}}
      {{/status}}
    ]
  }
}

I get:

[2014-08-07 14:48:58,454][WARN ][script ] [x_node] failed to load/compile script [parent]
org.elasticsearch.common.mustache.MustacheException: Template 'child' not found

I've tried pathing in parent (scripts/child, config/scripts/child) with the same result. Are partials supported?
[ANN] Elasticsearch ICU Analysis plugin 2.5.0 released
Heya, We are pleased to announce the release of the Elasticsearch ICU Analysis plugin, version 2.5.0. The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, adding ICU-related analysis components. https://github.com/elasticsearch/elasticsearch-analysis-icu/

Release Notes - elasticsearch-analysis-icu - Version 2.5.0

Update:
* [49] - Update to elasticsearch 1.5.0 (https://github.com/elastic/elasticsearch-analysis-icu/issues/49)
* [43] - Tests: upgrade randomizedtesting-runner to 2.1.10 (https://github.com/elastic/elasticsearch-analysis-icu/issues/43)
* [41] - Tests: index.version.created must be set (https://github.com/elastic/elasticsearch-analysis-icu/issues/41)

Issues, pull requests and feature requests are warmly welcome on the elasticsearch-analysis-icu project repository: https://github.com/elasticsearch/elasticsearch-analysis-icu/ For questions or comments around this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch Enjoy, -The Elasticsearch team
ES: Ways to work with frequently updated fields of document
I have a forum, and every topic has a field viewCount - how many times the topic was viewed by forum users. I wanted all fields of a topic (id, date, title, content and viewCount) to be taken from ES. However, in this case, after every topic view ES must reindex the entire document again - I asked a question about partial updates on Stack Overflow: http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed . It means that if a topic is viewed 1000 times, ES will index it 1000 times, and if I have a lot of users, many documents will be indexed again and again. That is the first strategy. The second strategy, as I see it, is to take some fields of the topic from the index and some from the database; in this case I take viewCount from the DB. But then I could store all fields in the DB and use the index only as an index - to get the ids of the matching topics. Are there better strategies? What is the best way to solve such a problem? -- Александр Свиридов
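The partial update discussed above is normally issued through the _update endpoint; it lets the client send only the changed field, though internally ES still reindexes the whole document, which is exactly the concern raised here. A sketch (index, type, and id names are hypothetical):

```
POST /forum/topic/1/_update
{
  "doc": { "viewCount": 1001 }
}
```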
Re: ES JVM memory usage consistently above 90%
Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap memory (I think, since Marvel showed memory as 1 GB, which corresponds to the heap; my RAM is 50 GB). I've increased ES_HEAP_SIZE to 10g. But another problem has appeared and I'm freaked out because of it! After restarting my ES (curl shutdown) to increase the heap, Marvel has stopped showing me my data (it still shows the lower free disk space, so the data is still on disk), and upon searching, Sense shows IndexMissingException[[my_new_twitter_river] missing]. Why is this happening?!

On Mon, Mar 23, 2015 at 9:25 PM, mjdu...@gmail.com wrote: Are you saying the JVM is using 99% of the system memory or 99% of the heap? If it's 99% of the available heap, that's bad and you will have cluster instability. I suggest increasing your JVM heap size if you can. I can't find it right now, but I remember a blog post that used Twitter as a benchmark, and they also could get to ~50M documents with the default 1G heap.

On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote: Hi, I have set up elasticsearch on one node and am using the Twitter river to index tweets. It has been going fine, with almost 50M tweets indexed so far in 13 days. When I started indexing, the JVM usage (observed via Marvel) hovered between 10-20%, then started staying around 30-40%, but for the past 3-4 days it has continuously been above 90%, reaching 99% at times! I restarted elasticsearch thinking it might get resolved, but as soon as I switched it back on, the JVM usage went back to 90%. Why is this happening and how can I remedy it? (The JVM memory is the default 990.75 MB) Thanks, Yogesh
Re: ES JVM memory usage consistently above 90%
When it restarted, did it attach to the wrong data directory? Take a look at _nodes/_local/stats?pretty and check the 'data' directory location. Has the cluster recovered after the restart? Check _cluster/health?pretty as well.

On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote: Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap memory (I think, since Marvel showed memory as 1 GB, which corresponds to the heap; my RAM is 50 GB). I've increased ES_HEAP_SIZE to 10g. But another problem has appeared and I'm freaked out because of it! After restarting my ES (curl shutdown) to increase the heap, Marvel has stopped showing me my data (it still shows the lower free disk space, so the data is still on disk), and upon searching, Sense shows IndexMissingException[[my_new_twitter_river] missing]. Why is this happening?!

On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote: Are you saying the JVM is using 99% of the system memory or 99% of the heap? If it's 99% of the available heap, that's bad and you will have cluster instability. I suggest increasing your JVM heap size if you can. I can't find it right now, but I remember a blog post that used Twitter as a benchmark, and they also could get to ~50M documents with the default 1G heap.

On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote: Hi, I have set up elasticsearch on one node and am using the Twitter river to index tweets. It has been going fine, with almost 50M tweets indexed so far in 13 days. When I started indexing, the JVM usage (observed via Marvel) hovered between 10-20%, then started staying around 30-40%, but for the past 3-4 days it has continuously been above 90%, reaching 99% at times! I restarted elasticsearch thinking it might get resolved, but as soon as I switched it back on, the JVM usage went back to 90%. Why is this happening and how can I remedy it?
(The JVM memory is the default 990.75 MB) Thanks, Yogesh
Re: Search templates: partials
Thanks for the info. I am going to open a new issue.
Kibana: how to show values directly without aggregation?
ELK experts, I desperately need your help. I've spent hours and days on this simple thing but still can't figure it out. My setup is Elasticsearch 1.4.4 and Kibana 4.0.1. My program feeds data into Elasticsearch. The data contain a timestamp and a value. I want to create a bar chart with the timestamp as the X-axis and the value as the Y-axis, directly - no aggregation. How can I make it work? I'd appreciate any hints. Thanks.
Re: ElasticSearch data directory on Windows 7
@Mark, the link you posted is for Linux paths. I assumed the Windows path would be along similar lines and expected index data to be under:

C:\ES home\data\elasticsearch\nodes\0\indices\idd

That path looks like it contains all the index data, but I am puzzled why the size of the data directory is only 77 KB. There is a file of size 2 KB which seems to contain some encoded data (it appears to be my application data):

C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state

My question is: how can 2.2 GB of data be compressed into 2 KB, unless I am looking at the wrong directory?

On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote: Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz

On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com wrote: Hi, I have downloaded and installed ES on Windows 7 and imported 2 million+ records into it (from a CSV of size 2.2 GB). I can search records using Fiddler and it works as expected. I was hoping to see the size of the directory below increase by a few GB, but the size is only 77 KB:

C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

This means ES stores data in some other directory on disk. Can you please tell me the directory location where ES stores the data? Regards, Ravi
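For reference, the data location is controlled by path.data in config/elasticsearch.yml; when it is unset, data lands under the installation's data directory as described in the linked docs. A sketch (the Windows path below is only an illustrative assumption, not the poster's actual setup):

```
# config/elasticsearch.yml
# where ES stores index data; defaults to <install dir>/data when unset
path.data: D:\elasticsearch\data
```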
Re: ElasticSearch data directory on Windows 7
Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz

On 24 March 2015 at 16:34, Ravi Gupta guptarav...@gmail.com wrote: Hi, I have downloaded and installed ES on Windows 7 and imported 2 million+ records into it (from a CSV of size 2.2 GB). I can search records using Fiddler and it works as expected. I was hoping to see the size of the directory below increase by a few GB, but the size is only 77 KB:

C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

This means ES stores data in some other directory on disk. Can you please tell me the directory location where ES stores the data? Regards, Ravi
[Spark] SchemaRdd saveToEs produces Bad JSON errors
Hi, I'm a very new user to ES so I'm hoping someone can point me to what I'm doing wrong. I did the following: downloaded a fresh copy of Spark 1.2.1, switched to the bin folder, and ran:

wget https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/2.1.0.BUILD-SNAPSHOT/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar
./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar

import org.apache.spark.sql.SQLContext
case class KeyValue(key: Int, value: String)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext._
sc.parallelize(1 to 50).map(i => KeyValue(i, i.toString)).saveAsParquetFile("large.parquet")
parquetFile("large.parquet").registerTempTable("large")
val schemaRDD = sql("SELECT * FROM large")
import org.elasticsearch.spark._
schemaRDD.saveToEs("test/spark")

At this point I get errors like this:

15/03/24 11:02:35 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
15/03/24 11:02:35 INFO DAGScheduler: Job 2 failed: runJob at EsSpark.scala:51, took 0.302236 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskCompletionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[[38,38]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive x content from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 51, 56, 44, 34, 51, 56, 34, 93, 10, ..., 91, 53, 48, 44, 34, 53, 48, 34, 93, 10]]; ]]; Bailing out..
    at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)

Any idea what I'm doing wrong? (I started with the Beta3 jar before trying the nightly, but also with no luck.) I am running against elasticsearch 1.4.4:

/spark_1.2.1/bin$ curl localhost:9200
{
  "status" : 200,
  "name" : "Wizard",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}

-- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c9c63ee8-0683-46b3-b023-de348d34b560%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
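For what it's worth, the numeric dump in the error is just the raw UTF-8 bytes of the bulk payload Elasticsearch received. Decoding a prefix of it (a quick sketch) shows each document arrived as a bare JSON array rather than an object, which matches the schema-loss diagnosis given later in the thread:

```python
# Decode a prefix of the byte dump from the error message: these are the
# UTF-8 bytes of the bulk request body as Elasticsearch saw it.
payload = bytes([123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125,
                 10, 91, 51, 56, 44, 34, 51, 56, 34, 93, 10]).decode("utf-8")
print(payload)
# -> {"index":{}}
#    [38,"38"]
# The bulk action line is fine, but the document itself was serialized as
# the JSON array [38,"38"] instead of an object like {"key":38,"value":"38"}.
```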
Using scroll and different result sizes
Hey all, We use ES as our indexing subsystem. Our canonical record store is Cassandra. Due to the denormalization we perform for faster query speed, the state of documents in ES can occasionally lag behind the state of our Cassandra instance. To accommodate this eventually consistent system, our query path is: query ES (returns a scrollId and document ids) -> load entities from Cassandra -> drop stale documents from the results and asynchronously remove them. If the user requested 10 entities and only 8 were current, we are missing 2 results and need to make another trip to ES to satisfy the result size. Is it possible to do the following?

// get first page with 10 results
POST /testindex/mytype/_search?scroll=1m
{"from": 0, "size": 10}

// then fetch only the 2 missing results
POST /testindex/mytype/_search?scroll_id=<result from previous>
{"size": 2}

I always seem to get back 10 on my second request. If no other option is available, I'd rather truncate our result set to the user's requested size than take the performance hit of using from and size to drop results. Our document counts can be 50 million+, so I'm concerned about the performance implications of not using scroll. Thoughts? Thanks, Todd
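Since a scroll request keeps returning the page size fixed at the initial search, one workable pattern for the query path described above is to keep pulling pages, drop stale hits, and truncate client-side once the requested count is reached. A minimal sketch with stubbed ES/Cassandra calls (all names here are hypothetical, not a real client API):

```python
def collect_results(pages, is_current, wanted):
    """pages yields lists of doc ids (one scroll page per iteration);
    is_current(doc_id) checks the canonical store (e.g. Cassandra)."""
    results = []
    for page in pages:
        fresh = [d for d in page if is_current(d)]  # drop stale documents
        results.extend(fresh)
        if len(results) >= wanted:
            break
    return results[:wanted]  # truncate any over-fetch to the requested size

# Simulated scroll pages: ids 3 and 7 are stale in the record store.
pages = iter([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14]])
stale = {3, 7}
print(collect_results(pages, lambda d: d not in stale, 10))
# -> [1, 2, 4, 5, 6, 8, 9, 10, 11, 12]
```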
Re: Elasticsearch with JSON-array, causing serialize -error
Same issue here. Any clues?

On Sunday, April 20, 2014 at 2:47:40 PM UTC-3, PyrK wrote: I'm using elasticsearch with a mongodb collection via elmongo (https://github.com/usesold/elmongo). I have a collection that (from the index's point of view) contains a JSON-array field, for example:

random_point: [ 0.10007477086037397, 0 ]

That's most likely the reason I get this error when trying to index my collection:

[2014-04-20 16:48:51,228][DEBUG][action.bulk ] [Emma Frost] [mediacontent-2014-04-20t16:48:44.116z][4] failed to execute bulk item (index) index {[mediacontent-2014-04$
org.elasticsearch.index.mapper.MapperParsingException: object mapping [random_point] trying to serialize a value with no field associated with it, current value [0.1000747708603739$
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:595)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:467)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:599)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:587)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:459)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:506)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:450)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareIndex(InternalIndexShard.java:327)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:381)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:155)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction$
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)
[2014-04-20 16:48:54,129][INFO ][cluster.metadata ] [Emma Frost] [mediacontent-2014-04-20t16:39:09.348z] deleting index

Is there any way to bypass this? That array is a needed value in my collection. Is there any way to give Elasticsearch an option not to index that JSON field, even though it's not going to be a searchable field at all? Best regards, PK
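For the "not searchable at all" part of the question, one commonly suggested Elasticsearch mapping option (a hedged suggestion, not confirmed in this thread; the index name below is illustrative) is `enabled: false` on the object field, which keeps the value in `_source` without parsing or indexing it:

```python
import json

# Sketch of a mapping that tells Elasticsearch not to parse/index the
# "random_point" field at all: it stays in _source but is not searchable.
# The enabled:false option is general ES mapping behavior; the index/type
# naming here is only an assumption based on the log output above.
mapping = {
    "mediacontent": {
        "properties": {
            "random_point": {
                "type": "object",
                "enabled": False,   # store but do not map or index
            }
        }
    }
}
body = json.dumps(mapping, indent=2)
print(body)
```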
Re: ES JVM memory usage consistently above 90%
Thanks. Will try starting again. By the way, what will be the impact if I just delete node/1 (then set node.max_local_storage_nodes to 1 and start elasticsearch)?

On Wed, Mar 25, 2015 at 1:36 AM, mjdu...@gmail.com wrote: As long as it's the first ES instance starting on that node, it'll grab 0 instead of 1. I don't know if you can explicitly set the data node directory in the config.

On Tuesday, March 24, 2015 at 2:16:57 PM UTC-4, Yogesh wrote: Thanks. Mine is a one-node cluster, so I am simply using curl (the shutdown API) to shut it down and then doing bin/elasticsearch -d to start it up. To check if it has shut down, I try to hit it over HTTP. So, now how do I start it with the node/0 data directory? There is nothing in the node/1 data directory, but I don't suppose deleting it would be the solution? (Sorry for the basic questions, I am new to this!)

On Tue, Mar 24, 2015 at 11:32 PM, mjd...@gmail.com wrote: Sometimes that happens when the new node starts up via monit or another automated thing before the old node is fully shut down. I'd suggest shutting down the node and verifying it's done via ps before allowing the new node to start up. In the case of monit, if the check is hitting the HTTP port then it'll think it's down before it actually fully quits.

On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote: Thanks a lot mjdude! It does seem like it attached to the wrong data directory. In elasticsearch/data/tool/nodes there are two: 0 and 1. My data is in 0, but node stats shows the data directory as elasticsearch/data/tool/nodes/1. Now, how do I change this?

On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com wrote: When it restarted, did it attach to the wrong data directory? Take a look at _nodes/_local/stats?pretty and check the 'data' directory location. Has the cluster recovered after the restart? Check _cluster/health?pretty as well.

On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote: Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap memory (I think, since Marvel showed memory as 1 GB, which corresponds to the heap; my RAM is 50 GB). I've increased ES_HEAP_SIZE to 10g. But another problem has appeared and I'm freaked out because of it! So, after restarting my ES (curl shutdown) to increase the heap, Marvel has stopped showing me my data (it still shows that disk space is lower, so that means the data is still on disk), and upon searching, Sense shows IndexMissingException[[my_new_twitter_river] missing]. Why is this happening?!

On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote: Are you saying the JVM is using 99% of the system memory or 99% of the heap? If it's 99% of the available heap, that's bad and you will have cluster instability. I suggest increasing your JVM heap size if you can. I can't find it right now, but I remember a blog post that used Twitter as a benchmark, and they also could get to ~50M documents with the default 1G heap.

On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote: Hi, I have set up elasticsearch on one node and am using the Twitter river to index tweets. It has been going fine, with almost 50M tweets indexed so far in 13 days. When I started indexing, the JVM usage (observed via Marvel) hovered between 10-20%, then started remaining around 30-40%, but for the past 3-4 days it has continuously been above 90%, reaching 99% at times! I restarted elasticsearch thinking it might get resolved, but as soon as I switched it back on, the JVM usage went back to 90%. Why is this happening and how can I remedy it? (The JVM memory is the default 990.75MB.) Thanks, Yogesh
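The sizing advice in this thread ("increase the heap if you can") is often stated as a rule of thumb: give the JVM about half the machine's RAM, but stay under roughly 31 GB so it keeps using compressed object pointers. A small sketch of that guideline (the cap value is general ES/JVM advice, not a figure from this thread):

```python
# Rule-of-thumb ES heap sizing: ~half of RAM, capped below ~32 GB so the
# JVM can keep compressed oops. This is general guidance, not a value
# prescribed by anyone in this thread.
def suggest_heap_gb(ram_gb):
    return min(ram_gb // 2, 31)

# For the 50 GB machine described in the thread:
print(suggest_heap_gb(50))  # -> 25
```

The result would then be applied as e.g. `export ES_HEAP_SIZE=25g` before starting the node.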
Re: ES JVM memory usage consistently above 90%
As long as it's the first ES instance starting on that node, it'll grab 0 instead of 1. I don't know if you can explicitly set the data node directory in the config.

On Tuesday, March 24, 2015 at 2:16:57 PM UTC-4, Yogesh wrote: Thanks. Mine is a one-node cluster, so I am simply using curl (the shutdown API) to shut it down and then doing bin/elasticsearch -d to start it up. To check if it has shut down, I try to hit it over HTTP. So, now how do I start it with the node/0 data directory? There is nothing in the node/1 data directory, but I don't suppose deleting it would be the solution? (Sorry for the basic questions, I am new to this!)
Re: ES JVM memory usage consistently above 90%
Thanks a lot mjdude! It does seem like it attached to the wrong data directory. In elasticsearch/data/tool/nodes there are two: 0 and 1. My data is in 0, but node stats shows the data directory as elasticsearch/data/tool/nodes/1. Now, how do I change this?

On Tue, Mar 24, 2015 at 11:02 PM, mjdu...@gmail.com wrote: When it restarted, did it attach to the wrong data directory? Take a look at _nodes/_local/stats?pretty and check the 'data' directory location. Has the cluster recovered after the restart? Check _cluster/health?pretty as well.
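A general note on the nodes/0 and nodes/1 directories discussed above (an assumption based on standard Elasticsearch configuration, not a fix confirmed in this thread): they exist because ES allows several instances to share one data path. Pinning the setting below in elasticsearch.yml makes an accidental second instance fail to start instead of silently creating nodes/1:

```yaml
# elasticsearch.yml -- sketch for a single-node setup:
# permit only one local storage node, so a stray second instance
# refuses to start rather than creating data/<cluster>/nodes/1
node.max_local_storage_nodes: 1
```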
Re: elasticsearch shield realm problem
Phani, We just released Shield 1.1 and 1.2 (https://www.elastic.co/blog/shield-1-1-and-1-2-released). LDAP user search is included and may be worth trying out. If you were to use it, I think your configuration would look something like:

shield:
  authc:
    realms:
      ldap1:
        type: ldap
        order: 0
        url: "ldap://ldapserver:389"
        bind_dn: "cn=Manager,dc=test,dc=org"
        bind_password: changeme
        user_search:
          base_dn: "ou=People,dc=test,dc=org"
        group_search:
          base_dn: "dc=test,dc=org"

This assumes cn=Manager,dc=test,dc=org is a user with search credentials on the LDAP server. The earlier questions I had about groups would still apply.

On Monday, March 23, 2015 at 6:08:48 PM UTC-4, Jay Modi wrote: Since you are using uid, your setup would look something like this:

shield:
  authc:
    realms:
      ldap1:
        type: ldap
        order: 0
        url: "ldap://ldapserver:389"
        user_dn_templates:
          - "uid={0}, ou=People,dc=test,dc=org"

This assumes all users are directly in the People OU. If that is not the case, you'll have to update the template or add additional templates. Can you tell me a little more about how the groups are set up in your LDAP? What is their objectClass, and do they have the member, uniqueMember, or memberUid attribute? You will probably need to configure the group search, and that additional information will be necessary to ensure it works. Also, to help with debugging, it is helpful to set shield.authc: DEBUG in the logging.yml file.

On Monday, March 23, 2015 at 2:43:29 AM UTC-4, phani.n...@goktree.com wrote: Hi Jay, sorry for the late reply. I am using an OpenLDAP server. I followed the configuration given in the ES documentation and did it like the example, but I am not able to log in with LDAP credentials. Does the LDAP realm in Elasticsearch bind against the LDAP server directly, or does it import users into a file? I tried the following link: http://www.elastic.co/guide/en/shield/current/ldap.html, but I didn't get the proper result. I have the following configuration for my LDAP server. Please find it below:
Principal: cn=Manager,dc=test,dc=org
Base DN: ou=People,dc=test,dc=org
Filter: uid=%s

The above are my LDAP configuration details. Please suggest how I can achieve this with the above credentials using that link (http://www.elastic.co/guide/en/shield/current/ldap.html). Thanks, phani

On Wednesday, March 18, 2015 at 8:05:37 PM UTC+5:30, Jay Modi wrote: What type of LDAP server are you integrating with? We have some documentation for LDAP setup: http://www.elastic.co/guide/en/shield/current/ldap.html. If you are using Active Directory, there is a specific realm for it that abstracts some of the LDAP setup to make it simpler: http://www.elastic.co/guide/en/shield/current/active_directory.html

On Wednesday, March 18, 2015 at 9:12:27 AM UTC-4, phani.n...@goktree.com wrote: Thank you Jay for the quick reply. Yes, it worked: I changed the path to ES_HOME config and now authentication is performing fine. Next I am looking into LDAP integration with Elasticsearch. Can you suggest the steps for integrating LDAP with Elasticsearch? Thanks, phani.

On Wednesday, March 18, 2015 at 6:20:29 PM UTC+5:30, Jay Modi wrote: Hi Phani, I think the correct thing to do is:

export ES_JAVA_OPTS=-Des.path.conf=/etc/elasticsearch
bin/shield/esusers useradd es_admin -r admin

Verify that /etc/elasticsearch/shield/users exists and contains an entry for the admin user. Once you have confirmed that, then try to authenticate. The issue with the steps you have taken is that your elasticsearch instance is looking for configuration in /etc/elasticsearch, while the configuration for Shield is in ES_HOME by default. The packaged versions of elasticsearch expect all configuration (including that for plugins) to be in /etc/elasticsearch. We're looking at how we can make this easier.

On Wednesday, March 18, 2015 at 5:33:36 AM UTC-4, phani.n...@goktree.com wrote: Hi Jay, Thank you for the reply. I tried the following steps.
I did the .rpm installation on Linux servers; my configuration file is located at /etc/elasticsearch (the main ES configuration file). But when I installed Shield, I see there is a configuration directory created inside ES_HOME (/usr/share/elasticsearch/config). I issued the following command to add the path:

export ES_JAVA_OPTS=-Des.path.conf=/usr/share/elasticsearch/config

I am able to create a user, but when I try to authenticate it is not validating, even though we added the path. Please suggest if I am doing something wrong here.

On Monday, March 16, 2015 at 10:12:00 PM UTC+5:30, Jay Modi wrote: Hi Phani, How did you install elasticsearch and where is your elasticsearch configuration located? If you have used a RPM or DEB package, you will need to add an environment variable before running the
Cluster state storage question
Hi, I am starting to look at the size of my cluster state, and I noticed that the shard information is duplicated: one grouping seems to be from the view of the index, and the other from which shards live on which host. I'm sure there's a logical reason for this; I'm just interested to know why. This probably also compresses pretty well. Cheers, Rob
[ANN] Elasticsearch Phonetic Analysis plugin 2.5.0 released
Heya, We are pleased to announce the release of the Elasticsearch Phonetic Analysis plugin, version 2.5.0. The Phonetic Analysis plugin integrates phonetic token filter analysis with elasticsearch. https://github.com/elastic/elasticsearch-analysis-phonetic/

Release Notes - elasticsearch-analysis-phonetic - Version 2.5.0

Update:
* [39] - Update to elasticsearch 1.5.0 (https://github.com/elastic/elasticsearch-analysis-phonetic/issues/39)
* [34] - Tests: upgrade randomizedtesting-runner to 2.1.10 (https://github.com/elastic/elasticsearch-analysis-phonetic/issues/34)
* [33] - Tests: index.version.created must be set (https://github.com/elastic/elasticsearch-analysis-phonetic/issues/33)

Issues, pull requests, and feature requests are warmly welcome on the elasticsearch-analysis-phonetic project repository: https://github.com/elastic/elasticsearch-analysis-phonetic/ For questions or comments around this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch Enjoy, -The Elasticsearch team
Do I have to explicitly exclude the _all field in queries?
Hi all, I clearly haven't completely grokked something in how QueryString queries are interpreted. Consider the following query:

{
  "query": {
    "query_string": {
      "analyze_wildcard": true,
      "query": "Node.author:Brian Levine"
    }
  },
  "fields": ["Node.author"],
  "explain": true
}

Note: the Node.author field is not_analyzed. The results from this query include documents for which the Node.author field contains neither Brian nor Levine. In examining the explanation, I found that the documents were included because another field in the document contained Levine. A snippet from the explanation shows that the _all field was considered:

{
  "value": 0.08775233,
  "description": "weight(_all:levine in 464) [PerFieldSimilarity], result of:",
  ...

Do I need to explicitly exclude the _all field in the query? Separate question: because the Node.author field is not_analyzed, I had thought that the value Brian Levine would also not be analyzed, and therefore only documents whose Node.author field contained exactly Brian Levine would be matched; yet the explanation shows that the brian and levine tokens were considered. I also noticed that if I change the query to:

"query": "Node.author:(Brian Levine)"

then the result set changes: only the documents whose Node.author field contains either brian OR levine are included (which is what I would have expected). According to the explanation, the _all field is not considered in this query. So I'm confused; clearly, I don't understand how my original query is interpreted. Hopefully someone can enlighten me. Thanks. -brian
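A likely explanation, stated as general Lucene query-string syntax rather than anything confirmed in this thread: a field prefix binds only to the single term immediately after it, so in `Node.author:Brian Levine` the second term falls through to the default field, which is `_all`. Grouping or quoting keeps both terms on the field. A small sketch contrasting the three spellings (the parse descriptions are this assumption written out; the query body is illustrative):

```python
import json

# Assumed query_string parses (field prefix binds one term only):
parses = {
    'Node.author:Brian Levine':   'Node.author:Brian OR _all:levine (Levine hits the default _all field)',
    'Node.author:(Brian Levine)': 'Node.author:Brian OR Node.author:Levine (grouping keeps the field)',
    'Node.author:"Brian Levine"': 'phrase on Node.author (one exact term when the field is not_analyzed)',
}

# One way to keep both terms on the field: quote the phrase.
query = {
    "query": {
        "query_string": {
            "analyze_wildcard": True,
            "query": 'Node.author:"Brian Levine"',
        }
    },
    "fields": ["Node.author"],
}
print(json.dumps(query, indent=2))
```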
Re: [Spark] SchemaRdd saveToEs produces Bad JSON errors
Hi, It appears the problem lies with what type of document you are trying to save to ES - which is basically invalid. For some reason the table/rdd schema gets lost and the content is serialized without it: |[38,38]| I'm not sure why this happens hence why I raised an issue [1] Thanks, [1] https://github.com/elastic/elasticsearch-hadoop/issues/403 On 3/24/15 5:11 PM, Yana Kadiyska wrote: Hi, I'm a very new user to ES so Im hoping someone can point me to what I'm doing wrong. I did the following: download a fresh copy of Spark1.2.1. Switch to bin folder and ran: |wget https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/2.1.0.BUILD-SNAPSHOT/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar ./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar import org.apache.spark.sql.SQLContext case class KeyValue(key: Int, value: String) val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext._ sc.parallelize(1 to 50).map(i=KeyValue(i, i.toString)) .saveAsParquetFile(large.parquet) parquetFile(large.parquet).registerTempTable(large) val schemaRDD = sql(SELECT * FROM large) import org.elasticsearch.spark._ schemaRDD.saveToEs(test/spark) | At this point I get errors like this: |15/03/24 11:02:35 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 15/03/24 11:02:35 INFO DAGScheduler: Job 2 failed: runJob at EsSpark.scala:51, took 0.302236 s org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 10, localhost): org.apache.spark.util.TaskComplet ionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment received[[38,38]][MapperParsingException[failed to parse]; nested: ElasticsearchParseException[Failed to derive x content from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 51, 56, 44, 
34, 51, 56, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 51 , 57, 44, 34, 51, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 48, 44, 34, 52, 48, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 4 9, 44, 34, 52, 49, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 50, 44, 34, 52, 50, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 51, 44, 34, 52, 51, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 52, 44, 34, 52, 52, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 53, 44, 34, 52, 53, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 54, 44, 34, 52, 54, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 55, 44, 34 , 52, 55, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 56, 44, 34, 52, 56, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 57, 44, 34, 5 2, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 53, 48, 44, 34, 53, 48, 34, 93, 10]]; ]]; Bailing out.. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76) | Any idea what I'm doing wrong? (I started with the Beta3 jar before trying the nightly but also with no luck). I am running against elasticsearch1.4.4 |/spark_1.2.1/bin$ curl localhost:9200 { status : 200, name : Wizard, cluster_name : elasticsearch, version : { number : 1.4.4, build_hash : c88f77ffc81301dfa9dfd81ca2232f09588bd512, build_timestamp : 2015-02-19T13:05:36Z, build_snapshot : false, lucene_version : 4.10.3 }, tagline : You Know, for Search } | -- You received this message because you are subscribed to the Google Groups elasticsearch group. 
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c9c63ee8-0683-46b3-b023-de348d34b560%40googlegroups.com. For more options, visit https://groups.google.com/d/optout. -- Costin
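A side note on the 400 above: the bulk endpoint rejects the fragment because every source line it receives must be a JSON object, while the schema-less serialization produced a bare array. A small Python illustration (not es-hadoop code, just the shape of the check):

```python
import json

# What a correctly serialized row looks like as a bulk source line:
good_source = json.dumps({"key": 38, "value": "38"})

# What the broken serialization produced once the schema was lost:
bad_source = '[38,"38"]'

def is_indexable_source(line):
    """A bulk source line must parse as a JSON object, not an array."""
    try:
        return isinstance(json.loads(line), dict)
    except ValueError:
        return False

print(is_indexable_source(good_source))  # True
print(is_indexable_source(bad_source))   # False: the 'Invalid JSON fragment'
```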
Re: ES JVM memory usage consistently above 90%
Sometimes that happens when the new node starts up via monit or some other automated process before the old node has fully shut down. I'd suggest shutting down the node and verifying it's gone via ps before allowing the new node to start up. In the case of monit, if the check hits the HTTP port it will think the node is down before it has actually fully quit. On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote: Thanks a lot mjdude! It does seem like it attached to the wrong data directory. In elasticsearch/data/tool/nodes there are two, 0 and 1. My data is in 0, but node stats shows the data directory as elasticsearch/data/tool/nodes/1. Now, how do I change this? On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com wrote: When it restarted, did it attach to the wrong data directory? Take a look at _nodes/_local/stats?pretty and check the 'data' directory location. Has the cluster recovered after the restart? Check _cluster/health?pretty as well. On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote: Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap memory (I think, since Marvel showed memory as 1 GB, which corresponds to the heap; my RAM is 50 GB). I've increased ES_HEAP_SIZE to 10g. But another problem has appeared and I'm freaked out because of it! So, after restarting my ES (curl shutdown) to increase the heap, Marvel has stopped showing me my data (it still shows that free disk space is lower, which means the data is still on disk), and upon searching, Sense shows IndexMissingException[[my_new_twitter_river] missing]. Why is this happening?! On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote: Are you saying the JVM is using 99% of the system memory or 99% of the heap? If it's 99% of the available heap, that's bad and you will have cluster instability.
I suggest increasing your JVM heap size if you can. I can't find it right now, but I remember a blog post that used Twitter as a benchmark, and they also could get to ~50M documents with the default 1G heap. On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote: Hi, I have set up Elasticsearch on one node and am using the Twitter river to index tweets. It has been going fine, with almost 50M tweets indexed so far in 13 days. When I started indexing, the JVM usage (observed via Marvel) hovered between 10-20%, then started remaining around 30-40%, but for the past 3-4 days it has continuously been above 90%, reaching 99% at times! I restarted Elasticsearch thinking it might get resolved, but as soon as I switched it back on, the JVM usage went back to 90%. Why is this happening and how can I remedy it? (The JVM memory is the default 990.75MB) Thanks, Yogesh
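For reference, the heap increase discussed in this thread is done through the ES_HEAP_SIZE environment variable on 1.x tar.gz installs. A sketch (the 10g value follows the thread; adjust paths to your layout):

```shell
# Set the JVM heap before starting Elasticsearch (1.x tar.gz layout).
# 10g matches the value chosen in this thread; a common guideline is to
# keep the heap well under half of physical RAM so the OS page cache
# retains room for Lucene files.
export ES_HEAP_SIZE=10g
bin/elasticsearch -d

# Verify the setting took effect:
curl 'localhost:9200/_nodes/stats/jvm?pretty'
```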
[ANN] Elasticsearch Smart Chinese Analysis plugin 2.5.0 released
Heya, We are pleased to announce the release of the Elasticsearch Smart Chinese Analysis plugin, version 2.5.0. The Smart Chinese Analysis plugin integrates the Lucene Smart Chinese analysis module into Elasticsearch. https://github.com/elastic/elasticsearch-analysis-smartcn/

Release Notes - elasticsearch-analysis-smartcn - Version 2.5.0

Update:
* [36] - Update to elasticsearch 1.5.0 (https://github.com/elastic/elasticsearch-analysis-smartcn/issues/36)
* [32] - Tests: upgrade randomizedtesting-runner to 2.1.10 (https://github.com/elastic/elasticsearch-analysis-smartcn/issues/32)
* [31] - Tests: index.version.created must be set (https://github.com/elastic/elasticsearch-analysis-smartcn/issues/31)

Issues, pull requests, and feature requests are warmly welcome on the elasticsearch-analysis-smartcn project repository: https://github.com/elastic/elasticsearch-analysis-smartcn/ For questions or comments around this plugin, feel free to use the elasticsearch mailing list: https://groups.google.com/forum/#!forum/elasticsearch Enjoy, -The Elasticsearch team
Re: ES JVM memory usage consistently above 90%
Thanks. Mine is a one-node cluster, so I am simply using curl to shut it down and then doing bin/elasticsearch -d to start it up. To check if it has shut down, I try to hit it over HTTP. So, now how do I start it with the nodes/0 data directory? There is nothing in the nodes/1 data directory, but I don't suppose deleting it would be the solution? (Sorry for the basic questions, I am new to this!) On Tue, Mar 24, 2015 at 11:32 PM, mjdu...@gmail.com wrote: Sometimes that happens when the new node starts up via monit or some other automated process before the old node has fully shut down. I'd suggest shutting down the node and verifying it's gone via ps before allowing the new node to start up. In the case of monit, if the check hits the HTTP port it will think the node is down before it has actually fully quit.
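A note on preventing the stray nodes/1 directory in the first place (this setting is not mentioned in the thread, so verify it against the 1.x docs): Elasticsearch can be told to allow only one node per data path, so an accidental second start fails loudly instead of silently creating nodes/1:

```yaml
# config/elasticsearch.yml
# Allow at most one node to use this data path; a second process
# started while the first is still running will refuse to start
# instead of creating data/<cluster>/nodes/1.
node.max_local_storage_nodes: 1
```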
Elasticsearch on Mesos / Marathon using Docker container
Hi all, Is anybody here running Elasticsearch on Mesos / Marathon using Docker containers? I'm wondering how much memory I should allocate to an Elasticsearch Marathon task that is the only task running on the host, e.g. all available memory as reported by Mesos, or some smaller amount to leave something for the system? Please note that I'm not asking about the heap size, but about the memory allocated to the Marathon task. Any suggestions? Best, Domen
New to the Elasticsearch and Kibana environment: how to link Elasticsearch to Kibana (error)
Dear All, I do not know how to link my Elasticsearch to the index pattern in Kibana. I have Elasticsearch running with two indices; this is the error message I am getting:

null is not an object (evaluating 'index.timeField.name')
http://109.123.216.106/index.js?_b=5888:127043:55
wrappedCallback@http://109.123.216.106/index.js?_b=5888:20888:81
http://109.123.216.106/index.js?_b=5888:20974:34
$eval@http://109.123.216.106/index.js?_b=5888:22017:28
$digest@http://109.123.216.106/index.js?_b=5888:21829:36
$apply@http://109.123.216.106/index.js?_b=5888:22121:31
done@http://109.123.216.106/index.js?_b=5888:17656:51
completeRequest@http://109.123.216.106/index.js?_b=5888:17870:15
onreadystatechange@http://109.123.216.106/index.js?_b=5888:17809:26

Any solution? This is my first day working with Kibana.
Re: ElasticSearch data directory on Windows 7
Actually, as Windows users don't have an installer they need to use the zip, so the paths in the link are valid. On 24 March 2015 at 17:38, Ravi Gupta guptarav...@gmail.com wrote: @Mark the link you posted is for Linux paths. I assumed the Windows paths would be along similar lines and expected the index data to be under: C:\ES home\data\elasticsearch\nodes\0\indices\idd It looks like that path should contain all the index data, but I am puzzled why the size of the data directory is only 77 KB. There is a file of size 2 KB that seems to contain some encoded data (it appears to be my application data): C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state My question is how 2.2 GB of data could be compressed into 2 KB, unless I am looking at the wrong directory. On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote: Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com wrote: Hi, I have downloaded and installed ES on Windows 7 and imported 2 million+ records into it (from a CSV of size 2.2 GB). I can search records using Fiddler and it works as expected. I was expecting the size of the directory below to grow by a few GB, but the size is only 77 KB: C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data This means ES stores data in some other directory on disk. Can you please tell me the directory location where ES stores the data? Regards, Ravi
Error: Configure an index pattern
Dear all, I do not know how to link my Elasticsearch to the index pattern in Kibana. I have Elasticsearch running with two indices; this is the error message I am getting:

null is not an object (evaluating 'index.timeField.name')
http://109.123.216.106/index.js?_b=5888:127043:55
wrappedCallback@http://109.123.216.106/index.js?_b=5888:20888:81
http://109.123.216.106/index.js?_b=5888:20974:34
$eval@http://109.123.216.106/index.js?_b=5888:22017:28
$digest@http://109.123.216.106/index.js?_b=5888:21829:36
$apply@http://109.123.216.106/index.js?_b=5888:22121:31
done@http://109.123.216.106/index.js?_b=5888:17656:51
completeRequest@http://109.123.216.106/index.js?_b=5888:17870:15
onreadystatechange@http://109.123.216.106/index.js?_b=5888:17809:26

Any solution? This is my first day working with Kibana.
Re: [Kibana] Possibility to hide _source field from Discover detail view?
Thanks for the fast reply. I created https://github.com/elastic/kibana/issues/3429. On Monday, March 23, 2015 at 8:47:21 PM UTC+1, Mark Walkom wrote: There is not; if you want that, it'd definitely be worth raising as a request on GitHub though! On 23 March 2015 at 21:40, Görge Albrecht goerge@gmail.com wrote: Hi All, Is there any way in Kibana 4 to hide the _source field from the detail view in the Discover table? As all fields are already shown in the detail view, the additional _source field seems to increase the visual noise without adding any value. Thanks in advance, Görge
Load ES Nested Mapping from Hive
Hi, I'm trying to load an ES nested mapping from Hive. My tables are listed below:

CREATE TABLE ORDER (
  order_id bigint,
  total_amount bigint,
  customer bigint)
STORED AS ORC

CREATE TABLE ORDER_DETAILS (
  order_id bigint,
  Order_Item_id bigint,
  Item_Size bigint,
  Item_amount bigint,
  Item_type string)
STORED AS ORC

ORDER and ORDER_DETAILS have a 1-M relationship. I want to store the data above in Elasticsearch such that ORDER_DETAILS is a nested mapping inside ORDER. My ES mapping would look somewhat like this:

{
  "order": {
    "properties": {
      "orderid": { "type": "float" },
      "total_amount": { "type": "float" },
      "customer": { "type": "float" },
      "order_details": {
        "type": "nested",
        "properties": {
          "Order_Item_id": { "type": "float" },
          "Item_Size": { "type": "float" },
          "Item_amount": { "type": "float" },
          "Item_type": { "type": "string" }
        }
      }
    }
  }
}

My question is: how do I convert the 1-M relation in Hive and load it into the ES mapping? Possible solution: I will create another Hive table which will contain the join of the two source Hive tables. But what is the structure of this Hive table, so that it maps to the ES mapping? Any help/suggestions? Thanks, Ankur
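Whatever Hive-side tooling is used, the target shape is the same: each detail row of the join must collapse into an array entry under its parent order. The regrouping can be sketched in plain Python (field names follow the tables above; the sample rows are made up for illustration):

```python
import json
from itertools import groupby
from operator import itemgetter

# Flat rows as the ORDER x ORDER_DETAILS join would produce them
# (order_id, total_amount, customer, Order_Item_id, Item_Size, Item_amount, Item_type).
join_rows = [
    (1, 500, 42, 11, 2, 200, "shirt"),
    (1, 500, 42, 12, 1, 300, "shoes"),
    (2, 150, 43, 21, 1, 150, "hat"),
]

def to_nested_docs(rows):
    """Collapse the 1-M join into one document per order, with the
    detail columns regrouped into a nested 'order_details' array."""
    docs = []
    for order_id, group in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
        group = list(group)
        docs.append({
            "order_id": order_id,
            "total_amount": group[0][1],   # parent columns repeat on every row
            "customer": group[0][2],
            "order_details": [
                {"Order_Item_id": r[3], "Item_Size": r[4],
                 "Item_amount": r[5], "Item_type": r[6]}
                for r in group
            ],
        })
    return docs

print(json.dumps(to_nested_docs(join_rows)[0], indent=2))
```

On the Hive side, the same regrouping can (in recent-enough Hive versions) be expressed with GROUP BY order_id plus collect_list(named_struct(...)) over the detail columns; whether es-hadoop maps the resulting array&lt;struct&gt; onto the nested type is something to verify against its documentation.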
Unexpected behavior (to me :-) from function_score
Hi All, Starting to play with function_score and doc_values... I have a set of 30 documents with integers ranging from 1 to 30. Now I want a query that returns all documents with value 5 on top, followed by the rest of the documents, scored by their distance to 5. Query:

{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost_mode": "replace",
      "functions": [
        { "gauss": { "int_field": { "origin": 5, "scale": 5, "offset": 2, "decay": 0.1 } } }
      ]
    }
  }
}

This returns the document with 5 with a score of 1 (expected), followed by the rest of the documents with a score of 0 (NOT expected). Does someone see what I'm doing wrong? Looking at the explain, I see weird values for the doc_values. Following is the explain for the top document (with 5 as its value). Note the stated values for doc value and origin!

"_explanation": {
  "value": 1,
  "description": "function score, product of:",
  "details": [
    {
      "value": 1,
      "description": "Math.min of",
      "details": [
        {
          "value": 1,
          "description": "function score, score mode [multiply]",
          "details": [
            {
              "value": 1,
              "description": "function score, product of:",
              "details": [
                { "value": 1, "description": "match filter: *:*" },
                {
                  "value": 1,
                  "description": "Function for field int_field:",
                  "details": [
                    {
                      "value": 1,
                      "description": "exp(-0.5*pow(MIN of: [Math.max(Math.abs(-6.20093664E13(=doc value) - -6.20093664E13(=origin))) - 2.0(=offset), 0)],2.0)/5.428681023790649)"
                    }
                  ]
                }
              ]
            }
          ]
        },
        { "value": 3.4028235e+38, "description": "maxBoost" }
      ]
    }
  ]
}

Best regards, Jan
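For what it's worth, the score the gauss function should produce can be computed by hand from the documented formula: score = exp(-(max(0, |v - origin| - offset))^2 / (2σ²)) with σ² = -scale² / (2·ln(decay)). With origin=5, scale=5, offset=2, decay=0.1 that gives σ² ≈ 5.4287, which matches the 5.428681023790649 in the explain above; so the parameters are being read correctly, and the suspect value is the doc value (-6.20093664E13), not the function. A quick check in Python:

```python
import math

def gauss_score(value, origin=5.0, scale=5.0, offset=2.0, decay=0.1):
    """Gauss decay as documented for function_score:
    score = exp(-(max(0, |v - origin| - offset))^2 / (2 * sigma^2)),
    where sigma^2 = -scale^2 / (2 * ln(decay))."""
    sigma2 = -(scale ** 2) / (2.0 * math.log(decay))  # ~5.4287 for scale=5, decay=0.1
    dist = max(0.0, abs(value - origin) - offset)
    return math.exp(-(dist ** 2) / (2.0 * sigma2))

print(gauss_score(5))   # 1.0: at the origin
print(gauss_score(3))   # 1.0: still inside the offset window
print(gauss_score(12))  # ~0.1 (the decay value): offset + scale away from origin
```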
Re: PUT gzipped data into elasticsearch
Logstash has both Java and HTTP outputs, but I assume you want to use HTTP. Set the http.compression parameter to true in the ES configuration; then clients can send gzip-compressed request bodies (marked with the Content-Encoding: gzip header; Accept-Encoding requests compressed responses). http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-http.html Jörg On Tue, Mar 24, 2015 at 2:30 PM, Marcel Matus matusmar...@gmail.com wrote: Hi Joe, no, we post data into Elasticsearch using Logstash. Actually, the complete path is: APP - Logstash - RMQ - Logstash - ES. And when I include the F5 VIPs, there are 6 services on the way processing this data... Marcel On Tue, Mar 24, 2015 at 1:50 PM, joergpra...@gmail.com wrote: Are you using the Java API? Jörg On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus matusmar...@gmail.com wrote: Hi, some of our documents are big (1 - 10 MB), and when there are millions of them it causes trouble on our internal network. We would like to compress the data at generation time and send it over the network to Elasticsearch in compressed form. I tried to find a solution, but usually I only found information about retrieving gzipped data. Is there a way to put compressed data into Elasticsearch? Thanks, Marcel -- Marcel Matus
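To make the sending side concrete: assuming http.compression: true is set on the ES side, a client gzips the request body itself and labels it with Content-Encoding: gzip. A standard-library sketch (host, index, and header values are placeholders; the actual send is left commented out since it needs a running cluster):

```python
import gzip
import json

# Build a small bulk payload: one action line plus one source line per document.
docs = [{"msg": "x" * 1024} for _ in range(100)]
lines = []
for doc in docs:
    lines.append(json.dumps({"index": {}}))
    lines.append(json.dumps(doc))
body = ("\n".join(lines) + "\n").encode("utf-8")

# Compress before the payload goes on the wire.
compressed = gzip.compress(body)
print(len(body), len(compressed))  # repetitive text shrinks dramatically

# Sending it would look something like this:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:9200/test/_bulk",   # placeholder host/index
#     data=compressed,
#     headers={"Content-Encoding": "gzip",
#              "Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)
```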