Re: PayloadTermQuery in ElasticSearch

2015-03-20 Thread Devaraja Swami
Thanks for your help, Jorg. I'd already gotten the similarity and analysis part to work, without using a plugin. For analysis, I used the builtin delimited_payload_tokenizer. For similarity, I used a custom similarity (from my Lucene code) and just stuck it as a jar file into the lib folder of

Re: Index alias problem

2015-03-20 Thread Mark Walkom
Can you provide the errors you are seeing? On 19 March 2015 at 17:20, Mathias Adler mathias.ad...@rebtel.com wrote: Hi, I just upgraded my four-node ES cluster to 1.4.2. After the restart I can't retrieve data using Kibana 3 unless I manually create an alias for each index. I still have one

Re: Deleting old indices??

2015-03-20 Thread Mark Walkom
You should really be setting the event timestamp to the one from the log file. If you ask over on https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users you will get some guidance on that. On 19 March 2015 at 22:09, Siddharth Trikha siddharthtrik...@gmail.com wrote: I am using the ELK

Extract information from a index folder when es not running.

2015-03-20 Thread Pedro Joro
Hello. Is it possible to extract information from an index folder without Elasticsearch running? I'm learning different ways to troubleshoot Elasticsearch, and moving index folders from one server to another using scp does not work. Is there any command line tool that could dump the

Re: Limit large number of threads

2015-03-20 Thread Lukáš Vlček
Hi, please note that Bigdesk does not show all internal thread pools in ES now. Regards, Lukas On Fri, Mar 20, 2015 at 10:56 AM, Abid Hussain huss...@novacom.mygbiz.com wrote: Hi all, I know I'm not the first one wondering about the number of threads but I didn't find anything really

Re: Special Characters not indexed and hence not searchable

2015-03-20 Thread Muddadi Hemaanusha
I am using the pattern [^@:\/\.\!\=\-\\w\\p{L}\\d]+ in Sense for settings. It shows a "Bad string syntax" error. Do I need to add anything to this pattern in order for Sense to accept the string? I would like to add special characters like '-', '/', '(', ')' to my pattern. Here is

Re: Index alias problem

2015-03-20 Thread Mathias Adler
Hi Mark, I don't get any errors in the logs. From an ES 1.4 node Kibana returns: *No results* There were no results because no indices were found that match your selected time span. So for some reason Kibana can't find the index if I don't set an index alias. _aliases?pretty output from ES 1.4

Remember some information in a kind of query context in native script filter

2015-03-20 Thread Loïc Wenkin
Hi all, I have searched the Internet multiple times, but I cannot find any valuable clue. I would like to remember some data across multiple hits when I execute a filter (written in Java) over a result. Note that I want this set of data to be forgotten when the search is finished. Is

Limit large number of threads

2015-03-20 Thread Abid Hussain
Hi all, I know I'm not the first one wondering about the number of threads but I didn't find anything really appropriate to my question. We use ES with the default values for the thread pool sizes, that is actually (according to what bigdesk says): * Search 72 * Index 24 * Bulk 24 * Refresh 10

Difference between index

2015-03-20 Thread Sagar Shah
Hello everyone, I am looking at the stats of my index created for an application. The index has been created with 5 shards and 0 replicas. I am not sure what the difference is between the following. What does *count: 1260748* represent, and what does *index_total: 498044* represent? I see total records

Getting port 9200 open for integration tests?

2015-03-20 Thread Kevin Burton
What's the best way to embed Elasticsearch in my tests? I want integration tests for HTTP and want to do PUT/GET directly. I looked at ElasticsearchIntegrationTest but it only seems to use the node or transport client directly. Is there a way to get it to expose a port? or maybe the port is

Re: Very slow queries for parent-child index

2015-03-20 Thread Chen Wang
Vlad, I tried a similar thing a while back. Of course the query performance depends on your ES configuration as well. But I finally ended up giving up on parent/child and flattened the parent/child documents into nested objects, as parent/child never gave me the performance I need. Chen On

Re: AutoCompletion Suggester - Duplicate record in suggestion return

2015-03-20 Thread Xavier TROMP
Hello everyone, I have the same problem using ES 1.4.4. Did anyone come up with a better solution? Any help would be greatly appreciated. Thank you, Xavier On Monday, December 1, 2014 at 11:21:14 AM UTC+1, Tom wrote: Hi, I still have the same problems with completion suggest duplicates of

What is the best configuration to run on linux VPS server without crashing?

2015-03-20 Thread Yashin Soraballee
Hello guys, I am trying to run elasticsearch on a VPS server running CentOS with 4GB of RAM. It starts successfully but with the following errors and warning message below. # sudo service elasticsearch start error: permission denied on key 'vm.max_map_count' Starting elasticsearch: [ OK ]

Data Node Allocation Weight

2015-03-20 Thread smonasco
Hi, I'm in a bit of a crunch. We have some servers we added to our cluster whose processors were not as good as the older servers. I'd rather not get into how we got swindled into accepting them, but this leaves me in a place with imbalanced hardware between my nodes. The crunch bit comes

Re: PayloadTermQuery in ElasticSearch

2015-03-20 Thread joergpra...@gmail.com
Thanks for the hint that similarity class should be in the ES lib folder. I will try this to see if that enables my plugin code to have per-field custom similarity. Payloads are a broad subject. For example, in my plugin, payload filters are missing. Let's assume you use UIMA or some NLP tagging.

Re: filter bitsets

2015-03-20 Thread joergpra...@gmail.com
Caching filters are implemented in ES, not in Lucene. E.g. org.elasticsearch.common.lucene.search.CachedFilter is a class that implements cached filters on the basis of the Lucene filter class. The format is not only bitsets. The Lucene filter instance is cached, no matter if it is doc sets or bit

Re: Limit large number of threads

2015-03-20 Thread Abid Hussain
Thanks for the clarification. Still I wonder why such a huge number of threads is created and if this can lead to issues, especially as it seems to me that they are never released. On Friday, 20 March 2015 11:12:42 UTC+1, Lukáš Vlček wrote: Hi, please note that Bigdesk does not show all

Re: Limit large number of threads

2015-03-20 Thread Abid Hussain
We're using 1.4.2 currently. Are there any known issues with this version? If 400 threads are nothing to worry about, that's OK for me. I just wonder why they never seem to be released. On Friday, 20 March 2015 16:20:03 UTC+1, Jörg Prante wrote: If thread counts go out of bounds, it may

Re: Limit large number of threads

2015-03-20 Thread Mark Walkom
Each segment, each network connection and a whole bunch of other things will add to this. 280 is a very low number and I wouldn't even worry about it. On 20 March 2015 at 06:08, Abid Hussain huss...@novacom.mygbiz.com wrote: Thanks for clarification. Still I wonder why such a huge amount of

Re: What is the best configuration to run on linux VPS server without crashing?

2015-03-20 Thread Mark Walkom
By the looks of things, you should look for a new provider, as they are doing some things on the underlying hypervisor that restrict ES from locking memory access. However, your heap size is very small; how much data is in your cluster? On 20 March 2015 at 07:13, Yashin Soraballee

Re: Limit large number of threads

2015-03-20 Thread joergpra...@gmail.com
If thread counts go out of bounds, it may be a lockup somewhere. What version of ES do you use? Jörg On Fri, Mar 20, 2015 at 2:08 PM, Abid Hussain huss...@novacom.mygbiz.com wrote: Thanks for the clarification. Still I wonder why such a huge number of threads is created and if this can lead to

Re: Limit large number of threads

2015-03-20 Thread joergpra...@gmail.com
Hm, I doubt it is ok if a 1.4.0 node has 195 threads in state BLOCKED: Thread 19374: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=175 (Compiled frame)

Parent/Child relationship considerations

2015-03-20 Thread Joshua P
I've been reading up on parent/child relationships and had a couple questions: On the Practical Considerations http://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-performance.html page, it's suggested that you should Use parent-child relationships sparingly, and only

Re: Organizing data for Elasticsearch and Kibana

2015-03-20 Thread Karthik Sharma
How will moving to KB4 help in this case? Will it unlock the option of plotting dual axis graphs? Regards Karthik. On Friday, 20 March 2015 05:52:46 UTC+13, Mark Walkom wrote: KB reads data from Elasticsearch, so yeah an index is the same thing for both. Basically you either need a

Cluster issue - raiseTimeoutFailure

2015-03-20 Thread Alex
Hi, I have a Java application that is indexing data in an Elasticsearch cluster (*3* *nodes*). ES is well configured and is working OK (indexing the data received from Java). Cluster configuration for each node from /etc/elasticsearch/elasticsearch.yml ES_MAX_MEM: 2g ES_MIN_MEM: 2g

Re: Limit large number of threads

2015-03-20 Thread joergpra...@gmail.com
I think you should check a thread dump created by tools like jstack if you have a high JVM thread count in state BLOCKED. This might be a pointer that something unusual is going on, but I'm not sure. Jörg On Fri, Mar 20, 2015 at 4:41 PM, Abid Hussain huss...@novacom.mygbiz.com wrote: We're

Re: standalone ES configuration

2015-03-20 Thread Mark Walkom
If you disable multicast you don't need to set unicast as well. On 20 March 2015 at 09:39, tbc tbchamb...@gmail.com wrote: I want to run standalone. I understand that 'es.discovery.zen.ping.multicast.enabled: false' is the essential setting, but I'm confused by advice given in the logstash

Re: Saved scripted metric aggregation in Elasticsearch and Kibana 4

2015-03-20 Thread Krzysztof Zarzycki
Hi, I'm also very interested in an answer to Anna's question. I'll be grateful if anyone can help! On Wednesday, 26 November 2014 17:33:07 UTC+1, Anna wrote: Hi Hendrik, thanks for your interest. I would like to approach the following use case: In Kibana 4, I would like to

Re: How does Elasticsearch convert dates to JSON string representations?

2015-03-20 Thread Mark Walkom
It's Elasticsearch, Elastic is the company :) We convert dates to unix epoch, which is why you should insert them as UTC. On 20 March 2015 at 10:22, Roger de Cordova Farias roger.far...@fontec.inf.br wrote: Elastic won't edit your source. The long type is used internally 2015-03-20 14:16
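To make the point about epoch conversion and UTC concrete, here is a minimal sketch of the arithmetic in Python. It is not Elasticsearch's own date parsing code; the function name to_epoch_millis is hypothetical, and the sketch assumes the timestamp carries an explicit UTC offset.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_timestamp: str) -> int:
    """Convert an ISO 8601 timestamp with an explicit offset to epoch milliseconds.

    Hypothetical helper for illustration only; it mirrors the idea of storing
    a date as a long, not Elasticsearch's actual implementation.
    """
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, which is why UTC is advised.
        raise ValueError("timestamp needs an explicit UTC offset")
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


# The same instant written with two different offsets maps to one long value.
print(to_epoch_millis("2015-03-20T14:16:00-03:00"))
print(to_epoch_millis("2015-03-20T17:16:00+00:00"))
```

Both calls describe the same instant, so they yield the same number; the original string in the document source is left untouched, as noted in the thread.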

Re: Multiple indices vs. multiple shards approach

2015-03-20 Thread Vladi Feigin
Thank you, Mark. Can you please elaborate regarding the routing? Do you mean using the customer ID as a routing value? Can you give an example? A link? Should I override the shard calculation function? On 20 March 2015 at 19:43, Mark Walkom markwal...@gmail.com wrote: This is where you use routing and

Re: Multiple indices vs. multiple shards approach

2015-03-20 Thread Mark Walkom
This is where you use routing and aliases. Use routing to send each customer's documents to a specific shard; you can then query using the same routing value and reduce your exposure. Then use aliases so you can easily move larger customers out to their own index if need be. On 20 March 2015 at
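A rough sketch of the routing-plus-alias idea, assuming a per-customer alias that carries both a routing value and a term filter. The alias naming scheme, the index name "customers" and the field name "customer_id" are hypothetical; the body shape follows the index aliases API (an "add" action with "routing" and "filter"), and it would be sent to the _aliases endpoint.

```python
import json


def customer_alias_action(index: str, customer_id: str) -> dict:
    """Build an _aliases request body for one customer.

    Assumptions for illustration: aliases are named "customer-<id>" and
    documents carry a "customer_id" field.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": f"customer-{customer_id}",
                    # Searches and writes through the alias use this routing value.
                    "routing": customer_id,
                    # The filter hides other customers that share the same shard.
                    "filter": {"term": {"customer_id": customer_id}},
                }
            }
        ]
    }


print(json.dumps(customer_alias_action("customers", "42"), indent=2))
```

If a customer later outgrows the shared index, the alias can be pointed at a dedicated index while clients keep using the same alias name.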

Re: How does Elasticsearch convert dates to JSON string representations?

2015-03-20 Thread Roger de Cordova Farias
Elastic won't edit your source. The long type is used internally 2015-03-20 14:16 GMT-03:00 Erik Iverson erikriver...@gmail.com: Hello everyone, I have a question on how Elasticsearch returns JSON representations of fields with the date type. My confusion comes from the fact that the page

Re: Multiple indices vs. multiple shards approach

2015-03-20 Thread christian . dahlqvist
Hi, You could get around this by using routing based on customer ID when indexing and searching. This will ensure that all documents belonging to a single customer will be located in the same shard, which means that each search for a specific customer can hit a single shard instead of all 9,
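The reason a routed search can hit a single shard is that the shard is chosen by hashing the routing value modulo the number of primary shards. A minimal sketch, assuming 9 primary shards as in this thread; zlib.crc32 is only a stand-in for Elasticsearch's internal hash function, and shard_for is a hypothetical name.

```python
import zlib


def shard_for(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number.

    crc32 is a stand-in hash (assumption); Elasticsearch uses its own
    internal hash, so real shard numbers will differ.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# The same customer ID always lands on the same shard, so a search that
# passes that ID as the routing value only needs to visit one shard.
for customer_id in ("customer-1", "customer-2", "customer-1"):
    print(customer_id, shard_for(customer_id, 9))
```

This also shows why the number of primary shards cannot change without reindexing: a different modulus would send existing routing values to different shards.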

standalone ES configuration

2015-03-20 Thread tbc
I want to run standalone. I understand that 'es.discovery.zen.ping.multicast.enabled: false' is the essential setting, but I'm confused by advice given in the logstash group. In a two year-old post [1], Untergeek also implies that discovery.zen.ping.unicast.hosts should be set. That can't be

Re: Multiple indices vs. multiple shards approach

2015-03-20 Thread Matías Waisgold
I had a project with the same context. We decided to increase the number of shards as it was impossible to have one index for each customer. Another approach is to have only some customers (hardcoded) separated from the rest. If you can detect these users in advance it might be a good idea and then

Re: What is the best configuration to run on linux VPS server without crashing?

2015-03-20 Thread Yashin Soraballee
Hello Mark, Thanks for the information. There is not much data in the cluster. My website is still in pre-production, and it crashes with the smallest amount of data. I've made only about ten entries! Even if I don't use the service (no web interaction), it crashes by itself after a while :( If you

Multiple indices vs. multiple shards approach

2015-03-20 Thread Vladi Feigin
Hello, Please share your thoughts. We have one big ES index with 18 shards (9 primaries and 9 replicas). We have thousands of customers, and each customer could have millions of documents or, on the contrary, a very small number. We never search across all customers, only within a specific customer. In other

Re: Cluster issue - raiseTimeoutFailure

2015-03-20 Thread Mark Walkom
You shouldn't set MIN and MAX; just use ES_HEAP_SIZE and it will set both. You also shouldn't change the thread pools unless you understand and are aware of what they entail. I think the problem here would most likely be write consistency -

How does Elasticsearch convert dates to JSON string representations?

2015-03-20 Thread Erik Iverson
Hello everyone, I have a question on how Elasticsearch returns JSON representations of fields with the date type. My confusion comes from the fact that the page http://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html says: The date type is a special type which

Growing old-gen size

2015-03-20 Thread mjdude5
Hello, I'm trying to better understand our ES memory usage with relation to our workload. Right now 2 of our nodes have above-average heap usage. Looking at _stats their old-gen is large, ~6gb. Here's the _cat usage I've been trying to make sense of:

Re: Index alias problem

2015-03-20 Thread Mark Walkom
What is the index pattern you are using in Kibana? It shouldn't need an alias unless you explicitly define it as the pattern. On 20 March 2015 at 03:24, Mathias Adler mathias.ad...@rebtel.com wrote: Hi Mark, I don't get any errors in the logs. From an ES 1.4 node Kibana returns: *No results*

Re: How does Elasticsearch convert dates to JSON string representations?

2015-03-20 Thread Roger de Cordova Farias
Well, the company won't edit his source anyway :p (but I get your point, I'm used to referring to Elasticsearch as Elastic, I have to fix that). I think his question is: he posts a document with a date in string format and retrieves it in the same format. He was expecting to retrieve it as long type

Re: What is the best configuration to run on linux VPS server without crashing?

2015-03-20 Thread Mark Walkom
Personally, I'd change hosting providers as a default, empty install of Elasticsearch will not OOM like that. You can try AWS, DigitalOcean, Linode, I don't really have any recommendations there though. On 20 March 2015 at 09:16, Yashin Soraballee yashin.sorabal...@gmail.com wrote: Hello Mark,

Re: Data Node Allocation Weight

2015-03-20 Thread Mark Walkom
If you have hot and cold data, you could allocate the cold data to those nodes (see http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-cluster.html ). Or you could allocate those nodes to be master only, but that may be a resource waste. On 20 March 2015 at 07:30, smonasco

Are documents of different types indexed via one index?

2015-03-20 Thread Александр Свиридов
From the ES documentation I understood that an index is like a database and a document type is like a table. Every table has its own indexes. Let's suppose we have one ES index with two document types (posts and books) and we have 10 posts and 30 books. If I search for some text expression in ONLY

Re: Index alias problem

2015-03-20 Thread Mathias Adler
Index pattern is default [logstash-].MM.DD, same on all nodes Thanks for reaching out...! //MA 2015-03-20 20:34 GMT+01:00 Mark Walkom markwal...@gmail.com: What is the index pattern you are using in Kibana? It shouldn't need an alias unless you explicitly define it as the pattern. On 20

Re: Multiple indices vs. multiple shards approach

2015-03-20 Thread Mark Walkom
There is a whole bunch of good stuff in the docs so I'd suggest you start there - http://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-routing Don't play with the hashing/sharding algorithm unless you know exactly what you are doing! On 20 March 2015 at 11:34,

Re: How does Elasticsearch convert dates to JSON string representations?

2015-03-20 Thread Mark Walkom
Ah ok, we convert between these formats so it's completely transparent to the end user and easier to store under the hood. On 20 March 2015 at 12:39, Roger de Cordova Farias roger.far...@fontec.inf.br wrote: Well, the company won't edit his source anyway :p (but I get your point, I'm used to

Re: Are documents of different types indexed via one index?

2015-03-20 Thread Mark Walkom
You can mix types in a single index, but we recommend you separate them out. Obviously a search against 30 docs is a lot faster than one against 1 billion. On 20 March 2015 at 13:08, Александр Свиридов ooo_satu...@mail.ru wrote: From the ES documentation I understood that an index is like a database and

Aggregations of phrases

2015-03-20 Thread Bruno Kamiche
I have a field that contains text from different sources (it could be a Facebook post, a tweet, a blog article, etc.), so it varies in length. I need to find common phrases in that field to determine conversation subjects. The terms aggregation works fine for words, but I need to find

Re[2]: Are documents of different types indexed via one index?

2015-03-20 Thread Александр Свиридов
So, if I understand you right, all documents of all types in one index are indexed together, not as in databases, where one table (one type) has its own index. Friday, 20 March 2015, 14:19 -07:00, from Mark Walkom markwal...@gmail.com: You can mix types in a single index, but we recommend you separate them out.