Linking of query/search

2014-09-09 Thread matej . zerovnik
I'm currently evaluating ES and logstash for central log management. Aside from storing logs in ES, I will need to do some querying and reporting as well. This is where I run into trouble, because I'm not sure how to solve the following problem. I have a service where a certain user

Elasticsearch in Local mode is slow

2014-09-09 Thread sarathrs
Hi, I have populated an Elasticsearch index in local mode using the node client, with the number of shards set to 1 and the number of replicas set to 0. When I run a boolean query, it takes 200ms to execute and fetch results. The same code using the node client in local mode pointing to the cluster running on my

Re: Disabling automatic index creation

2014-09-09 Thread Yevhen Bobrov
Ye, this is it. Thanx! On 9/8/2014 6:03 PM, David Pilato wrote: I think you are looking for this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation -- *David Pilato* | /Technical Advocate/ | *Elasticsearch.com* @dadoonet

Re: ES Plugin to extend Lucene's Standard Tokenizer

2014-09-09 Thread joergpra...@gmail.com
Also congrats for writing a plugin! For alternative punctuation tokenizing, you can also look at the classic tokenizer, which is the behavior of Lucene standard tokenization before 3.1, when it switched to Unicode text segmentation. Best, Jörg On Tue, Sep 9, 2014 at 5:36 AM, vineeth mohan

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread joergpra...@gmail.com
Your question is not very clear, but maybe you refer to JSR 236 for Java EE. Elasticsearch was never designed to run as a container managed Java EE component in a Java EE container, so the question about thread creation is unrelated. If you want to use an ES client from within a Java EE

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread Tasha CARL
Elasticsearch was never designed to run as a container managed Java EE component in a Java EE container, so the question about thread creation is unrelated. That's exactly my point and that's why I ask this question. If you want to use an ES client from within a Java EE container, start a

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread joergpra...@gmail.com
I think you confuse two things: - the Java platform (Java Language, Java Virtual Machine) - the Java EE specification These two things are different. As the Java EE specification is community-driven and defined, it does not mean to put any burden or restrictions on the Java platform, or on any

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread Tasha CARL
Let me ask a different question then: Do you agree that if you include ES in your Java EE application (JAR file) and create a singleton to manage the ES connection to the cluster, then the ES engine runs inside your Java EE application container? Tasha On 9

Error while reading elasticsearch data in hadoop program

2014-09-09 Thread gaurav redkar
Hi, I was following the example given in the official Elasticsearch documentation to read data from Elasticsearch using Hadoop, but I am getting the following error. java.lang.Exception: java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.MapWritable, received

Re: How do I help the users understand some unexpected search hits (Or how can I do highlighting on _all)

2014-09-09 Thread mooky
Is (2) expected? Is there a buggette? Anyone familiar with highlighting have any insight? On Monday, 8 September 2014 10:16:02 UTC+1, mooky wrote: I have looked at doing highlighting on _all. I set store: true, and I am getting results. I expect the contents of _all to be gobbledigook -

Re: Aggregation framework, Java API

2014-09-09 Thread Emanuel Buzek
Thanks Ivan. Yes, it was the post filter which was ignored. We use filtered query only when the user sends a query string, otherwise (when only exact filters for specific columns are specified) we use the post filter. It seems strange to me to use the FilteredQuery when the query string is

Re: Cleaning up red Shards

2014-09-09 Thread Johnathan Phan
Hey mate, sorry, I solved this in the IRC channel. For future reference, I created the following command to delete all red indices. curl -s localhost:9200/_cat/indices 2>&1 | grep ^red | awk '{print $2}' | while read red; do curl -XDELETE "http://localhost:9200/$red"; done Regards John On Mon, Sep
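
The one-liner above can also be sketched in Python. This is a minimal illustration, not what was run in the IRC session: it assumes the health value is the first column of the `_cat/indices` output and the index name the second (matching the awk '{print $2}' above), and it only builds the DELETE URLs rather than sending them. `red_indices` and `delete_urls` are hypothetical helper names, and the sample output is made up.

```python
def red_indices(cat_output, name_column=1):
    """Return index names whose health column (first field) is 'red'.

    name_column is the 0-based position of the index name; 1 matches
    the awk '{print $2}' in the command above (an assumption about the
    column layout of this Elasticsearch version).
    """
    names = []
    for line in cat_output.splitlines():
        fields = line.split()
        if len(fields) > name_column and fields[0] == "red":
            names.append(fields[name_column])
    return names


def delete_urls(names, base="http://localhost:9200"):
    """Build one DELETE URL per index name; nothing is sent."""
    return [f"{base}/{name}" for name in names]


# Made-up sample output, for illustration only.
sample = """\
green logstash-2014.09.08 5 1 1200 0 1mb 512kb
red   logstash-2014.09.04 5 1
yellow kibana-int 5 1 3 0 20kb 20kb
"""

urls = delete_urls(red_indices(sample))
print(urls)
```

Printing the URLs first and deleting only after a manual review is a deliberate choice here, since deleting an index cannot be undone.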

Simple Array Example

2014-09-09 Thread EzCocos
Dear All, I am trying to upload via a river to Elasticsearch some MySQL data which look like this:
product_id  product_info_id
'1'         1
'1'         2
'2'         3
'2'         4
I am looking to

Re: Cluster health yellow on one node even when setting node.local to true

2014-09-09 Thread w . pink
I've set index.number_of_replicas: 0 Is there something else I need to set? Thanks On Monday, 8 September 2014 17:56:56 UTC+2, Jörg Prante wrote: You have to set the replica level from 1 to 0 on single node to switch health from yellow to green. Jörg On Mon, Sep 8, 2014 at 5:41 PM,
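
A point that may explain the question above: as far as I know, `index.number_of_replicas` in `elasticsearch.yml` only applies to indices created afterwards, while existing indices keep their replica count until it is changed through the update settings API. The sketch below builds (but does not send) such a request with the Python standard library. The host, the `_all` target, and the helper name `replica_update_request` are assumptions.

```python
import json
import urllib.request


def replica_update_request(index="_all", replicas=0,
                           base="http://localhost:9200"):
    """Build a PUT <index>/_settings request that sets the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        url=f"{base}/{index}/_settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = replica_update_request()
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually apply it against a running node:
# urllib.request.urlopen(req)
```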

Re: Indexing is becoming slow, what to look for?

2014-09-09 Thread Thomas
By setting this parameter, some additional questions of mine have been generated: By setting indices.memory.index_buffer_size to a specific node and not to all nodes of the cluster, will this configuration be taken into account from all nodes? Is it going to be cluster wide or only for index

Re: How do I help the users understand some unexpected search hits (Or how can I do highlighting on _all)

2014-09-09 Thread Nikolas Everett
I imagine it's caused by your analysis configuration. Use the analyze API and check what is output for all those terms. On Sep 9, 2014 5:15 AM, mooky nick.minute...@gmail.com wrote: Is (2) expected? Is there a buggette? Anyone familiar with highlighting have any insight? On Monday, 8

Re: Simple Array Example

2014-09-09 Thread vineeth mohan
Hello, is product_id unique? It seems to me it's not. In that case, why have you made it the _id? _id should be unique, and in case of duplicates the document would be overwritten. So removing the definition of _id should solve the issue. Thanks Vineeth On Tue, Sep 9, 2014 at 4:09 PM, EzCocos

Trying to install elasticsearch as a service on Centos 6 with cPanel

2014-09-09 Thread Ali Samii
I am trying to install elasticsearch on a cPanel server with the following specs: *OS:* CENTOS 6.5 x86_64 *Virtual Machine:* virtuoso *WHM Version:* 11.44.1 (build 18) I have used yum to install elasticsearch using the install guide provided here:

Terms Query OR Terms Filter with Match All

2014-09-09 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi, I have a list of terms (type:string), and I want to retrieve all documents that have those terms. The term field is like a unique incremental sequence field. Which would be more performant? A Terms Query OR a Terms Filter with a Match All. Or some totally other approach Thanks,
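
For reference, the two request bodies being compared might look like the following in the 1.x query DSL. The field name `seq_id` and the values are placeholders; the sketch only builds the JSON and makes no claim about which variant is faster.

```python
import json

terms = ["1001", "1002", "1003"]  # placeholder values

# Variant 1: a terms query (takes part in scoring).
terms_query = {"query": {"terms": {"seq_id": terms}}}

# Variant 2: match_all wrapped in a filtered query with a terms filter
# (no scoring is needed for the filter part).
terms_filter = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"terms": {"seq_id": terms}},
        }
    }
}

print(json.dumps(terms_query))
print(json.dumps(terms_filter))
```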

Re: ES Plugin to extend Lucene's Standard Tokenizer

2014-09-09 Thread Raj Gupta
Hi Vineeth, I haven't looked at the plugin Bryan has created. However, creating a plugin for special characters gives better performance than the pattern tokenizer or custom filters. Regards, Raj On Tuesday, September 9, 2014 9:06:08 AM UTC+5:30, vineeth mohan wrote: Hello Bryan , Congrats on

Re: Error while reading elasticsearch data in hadoop program

2014-09-09 Thread gaurav redkar
Hi Costin, Thanks for the heads up regarding gist. I will try to follow the guidelines in the future. As for my program, I am using Elasticsearch Hadoop v2.1.0.Beta1 . I tried your suggestion and changed the output value class to LinkedMapWritable. but now I am getting the following error.

Re: Simple Array Example

2014-09-09 Thread EzCocos
Many thanks Vineeth for your help. product_id is indeed unique. If I remove _id I get 4 documents rather than 2 documents containing each an array of related product_info_id. I guess I want _id to be the key under which same product_id got aggregated. In other word if I remove _id, how

How to fix IndexMissingException

2014-09-09 Thread Brian Wilkins
I recently ran into an issue where my cluster is reporting an IndexMissingException. I tried deleting the faulty index, but I keep getting the same error returned. How do I fix this problem? $ curl -XDELETE 'http://localhost:9200/logstash-2014.09.04.11'

Re: Error while reading elasticsearch data in hadoop program

2014-09-09 Thread Costin Leau
Most likely you have a classpath conflict caused by multiple versions of es-hadoop. Can you double check you only have one version (2.1.0.Beta1) available? Based on the error, I'm guessing you have some 1.3 Mx or the RC somewhere in there... On 9/9/14 4:06 PM, gaurav redkar wrote: Hi Costin,

COUCHBASE + ELASTIC Parent/child mapping

2014-09-09 Thread Santosh B
Does the Couchbase/Elasticsearch plugin support parent/child relationships using the XDCR architecture?

Re: Simple Array Example

2014-09-09 Thread vineeth mohan
Hello , In that case you should make product_id as _type rather than _id. Thanks Vineeth On Tue, Sep 9, 2014 at 6:38 PM, EzCocos jmdepi...@gmail.com wrote: Many thanks Vineeth for your help. product_id is indeed unique. If I remove _id I get 4 documents rather than 2 documents

Changing response from update from a script

2014-09-09 Thread Jennifer Cumming
I have an update script that updates the contents of a nested object array and it works fine when everything is fine but sometimes we want to be able to reject an update due to invalid parameters (such as a missing mandatory field) and return an error. I can set the ctx.op to none so that the

Re: NullPointerException during snapshot

2014-09-09 Thread Alastair James
Wondering if anyone has any input on this? On Monday, 8 September 2014 10:53:24 UTC+1, Alastair James wrote: Hi there. I recently had a snapshot (ec2 plugin to s3) fail on one shard with a NullPointerException. Unfortunately I did not keep the exact error message. However, I tried the

Re: Simple Array Example

2014-09-09 Thread EzCocos
Sorry! but I don't understand. I tried to replace _id by _type and I get 4 documents. Would you be able to give me the related short mapping that would work? For instance for the data: product_id product_info_id '1' 1 '1'

Re: Trying to install elasticsearch as a service on Centos 6 with cPanel

2014-09-09 Thread joergpra...@gmail.com
This is not an ES issue. As said on IRC, because of shell interpretation of ? or *, you should always use quotes around curl requests, like this curl 'http://localhost:9200/_nodes/process?pretty' Jörg On Tue, Sep 9, 2014 at 2:17 PM, Ali Samii aalisa...@gmail.com wrote: I am trying to install
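
To illustrate the quoting advice: when a curl command is assembled by a script, Python's `shlex.quote` adds the single quotes whenever the URL contains shell-special characters such as `?` or `*`. A small sketch, not part of the original reply:

```python
import shlex

url = "http://localhost:9200/_nodes/process?pretty"

# shlex.quote wraps the argument in single quotes because '?' is a
# shell glob character and therefore not safe to leave bare.
command = "curl " + shlex.quote(url)
print(command)
```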

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread joergpra...@gmail.com
It depends. With ES engine you mean the machinery to search and index I presume. If you use embedded single node in a JEE container: yes. If you use NodeClient in JEE container with data, but being part of an ES cluster: yes and no. If you use TransportClient: no, you just connect a client to

Bulk Indexing Problems

2014-09-09 Thread Joshua P
Hi there! I'm trying to do a one-time index of about 800,000 records into an instance of elasticsearch. But I'm having a bit of trouble. It continually fails around 200,000 records. Looking at in the Elasticsearch Head Plugin, my index goes offline and becomes unrecoverable. For now, I have

Some Bulk Indexing Problems

2014-09-09 Thread Joshua P
Hi there! Sorry I posted two topics. I've somehow managed to post an incomplete post. I'm trying to do a one-time index of about 800,000 records into an instance of elasticsearch. But I'm having a bit of trouble. It continually fails around 200,000 records. Looking at in the Elasticsearch

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread Tasha CARL
If you use NodeClient in JEE container with data, but being part of an ES cluster: yes and no. (think you mean without data) That's what I do. Why yes and no? IMHO it's yes and if this is the case, it should respect the Java EE specs :) On 9 September 2014 16:14, joergpra...@gmail.com

Re: ES Plugin to extend Lucene's Standard Tokenizer

2014-09-09 Thread Bryan Warner
You can definitely use the Pattern Tokenizer to define your own token separators (i.e. the word boundary breaks), but you will add complexity and lose some of the benefits of the StandardTokenizer. First, regarding complexity, if you want certain characters to not become token separators,

An extended serializer for the Elasticsearch sink for Flume

2014-09-09 Thread Rotem
We use Flume at Gigya to ship logs and application events into Elasticsearch (and kibana). We're using the flume-ng elasticsearch sink, but we've encountered a few issues with it. So we created an extended version of the logstash serializer that the sink uses, that fixes some bugs contained

Re: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread joergpra...@gmail.com
Again: why do you want ES to respect the JEE specs? It is not a JEE component at all and will never be. You can run anything in a JEE container, also software which is not a JEE component. Running a NodeClient (with or without data) means the node is becoming a part of the ES cluster, i.e. the JVM

Re: Bulk Indexing Problems

2014-09-09 Thread vineeth mohan
Hello Joshua, I have a feeling this has something to do with the threadpool. There is a limit on the number of feeds that can be queued for indexing. Try increasing the threadpool queue size for index and bulk to a large number. Also, through the cluster node API on threadpool, you can see if any request

Re: Some Bulk Indexing Problems

2014-09-09 Thread vineeth mohan
Hello Joshua , Please refrain from posting the same question twice. If you need to add additional information , just reply to the original thread. Thanks Vineeth On Tue, Sep 9, 2014 at 7:54 PM, Joshua P jpetersen...@gmail.com wrote: Hi there! Sorry I posted two topics. I've somehow

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
Thanks for the reply, Vineeth! What's a practical heap size? I've seen some people saying they set it to 30gb but this confuses me because in the /etc/default/elasticsearch file, the comment suggests the max is only 1gb? I'll look into the threadpool issue. Is there a Java API for monitoring

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
You also said you wouldn't recommend indexing that much information at once. How would you suggest breaking it up and what status should I look for before doing another batch? I have to come up with some process that is repeatable and mostly automated. On Tuesday, September 9, 2014 11:12:59

Re: Bulk Indexing Problems

2014-09-09 Thread vineeth mohan
Hello Joshua, I am not sure which variable you are referring to in the memory settings of the config file; please paste the comment and config. I usually change the config from the init.d script. The best approach would be to bulk index, say, 10,000 feeds in sync mode, wait until everything is
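
The batching approach suggested above could be sketched like this. The batch size of 10,000 comes from the message; `batches` is a hypothetical helper, and the step that sends each batch and waits for the response is left as a comment because it depends on the client in use.

```python
from itertools import islice


def batches(records, size=10_000):
    """Yield consecutive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 800,000 records (the figure from this thread) in batches of 10,000.
sizes = [len(batch) for batch in batches(range(800_000))]
print(len(sizes), sizes[0], sizes[-1])

# In a real indexer each batch would be sent as one synchronous bulk
# request, and the next batch started only after the response has been
# checked for per-item failures.
```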

RE: ElasticSearch multi-threading and the Java EE specifications

2014-09-09 Thread Tasha CARL
You can run anything in a JEE container, also software which is not a JEE component. That's exactly the point of the question :) From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On behalf of joergpra...@gmail.com Sent: Tuesday, 9 September 2014 16:51 To

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
Here is /etc/default/elasticsearch
# Run Elasticsearch as this user ID and group ID
#ES_USER=elasticsearch
#ES_GROUP=elasticsearch
# Heap Size (defaults to 256m min, 1g max)
ES_HEAP_SIZE=512m
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Maximum number of open

Search by equivalent words

2014-09-09 Thread Teodor Pripoae
Hi, I have an ES index where I store companies indexed by name. Is there any way I can search by similar words? For example, I have the name Discount Pets & Supplies and I also want to be able to query for Discount Pets and Supplies, so basically I'm asking if there is a way to mark as

Re: Search by equivalent words

2014-09-09 Thread vineeth mohan
Hello, the direction you are heading in is the synonym analyzer. You can declare your synonym set as & => and, and set it up as a synonym token filter. What this means is that, while indexing, if it sees &, it will also register and against that position. SYNONYM TOKEN FILTER -
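
A minimal sketch of index settings for the synonym token filter mentioned above, assuming the two forms to be treated as equivalent are `&` and `and` (the archive appears to have dropped a character from the message, so this is a guess). The filter and analyzer names are made up. A whitespace tokenizer is used because the standard tokenizer would discard `&` before the filter sees it.

```python
import json

settings = {
    "settings": {
        "analysis": {
            "filter": {
                "amp_synonym": {          # hypothetical filter name
                    "type": "synonym",
                    "synonyms": ["& => and"],
                }
            },
            "analyzer": {
                "company_name": {         # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["lowercase", "amp_synonym"],
                }
            },
        }
    }
}

print(json.dumps(settings, indent=2))
```

The resulting JSON would be passed as the body when creating the index, and the name field mapped to the `company_name` analyzer.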

Re: Aggregation framework, Java API

2014-09-09 Thread Ivan Brusic
A filtered query with no explicit query will ultimately be translated into a match-all/constant-score query at the Lucene level. I prefer to explicitly define all my match all queries and use the specific post filter name, and not the old filter name, which was deprecated due to its ambiguity.
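
To make the distinction concrete, here is a sketch of a search body with an explicit match_all query, an aggregation, and a `post_filter`. The field names are invented. As I understand the 1.x behaviour, the post filter narrows the returned hits only, so the aggregation is still computed over everything the query matched, which is consistent with the "ignored" filter reported earlier in the thread.

```python
import json

search_body = {
    "query": {"match_all": {}},                      # explicit match all
    "aggs": {
        "by_status": {"terms": {"field": "status"}}  # hypothetical field
    },
    # Applied to the hits after the aggregations have been computed.
    "post_filter": {"term": {"status": "open"}},
}

print(json.dumps(search_body, indent=2))
```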

Re: Bulk Indexing Problems

2014-09-09 Thread joergpra...@gmail.com
Set ES_HEAP_SIZE to at least 1 GB. For smaller heaps like 512m and indexing around 1 million docs, you need some more fine tuning, which is complicated. On your machine it is fine to set the heap to 4 GB, which is 50% of 8 GB RAM. Jörg On Tue, Sep 9, 2014 at 5:39 PM, Joshua P jpetersen...@gmail.com

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
Hi Jörg, Can you elaborate on what you mean by I still need more fine tuning? I've upped the heap size to 4g (in both places I mentioned before because it's not clear to me which one ES actually uses). I haven't tried to index again yet. Other than throttling my indexing, what are some

Boosting a list of field for queries

2014-09-09 Thread Amish Asthana
Hi Folks, we have a bunch of fields for a document. Let's call them field1, field2, ..., FIELD1, FIELD2, ... When we search for a particular text, we want to see those records first which have that text in FIELD1...FIELDn, before records from field1, field2, etc. Now the complications: a) We have

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
Just reran the indexer and found this error coming up. I'm running out of disk space on the partition ES wants to write to. F38KqHhnRDWtiJCss5Wz0g -- INTERNAL_SERVER_ERROR -- TranslogException[[index_type][0] Failed to write operation

Re: Bulk Indexing Problems

2014-09-09 Thread joergpra...@gmail.com
You mentioned problems around 200,000 docs. What are these problems and how do you think you can fix them? What does your bulk indexing procedure look like? By fine tuning I mean slimming down all ES settings to the absolute minimum to slow down indexing and allocate fewer resources. But in your

Re: Bulk Indexing Problems

2014-09-09 Thread joergpra...@gmail.com
Check the path.data setting in config/elasticsearch.yml Jörg On Tue, Sep 9, 2014 at 7:50 PM, Joshua P jpetersen...@gmail.com wrote: Just reran the indexer and found this error coming up. I'm running out of disk space on the partition ES wants to write to. F38KqHhnRDWtiJCss5Wz0g --

Phrase wildcard search

2014-09-09 Thread Amish Asthana
We have a requirement to support phrase wildcard search in Elasticsearch. The requirement is to search for, let's say, Barce* Me*i and get back any document which has Barcelona Messi as a phrase in ANY field. We have been able to do it using span near with slop 0, and it works fine. The

Re: Faster sloppy phrase queries

2014-09-09 Thread Ivan Brusic
Hopefully Mike McCandless will get some of the new Lucene features into Elasticsearch: http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html I suspect it will come soon. -- Ivan On Mon, Sep 8, 2014 at 2:11 PM, Nikolas Everett nik9...@gmail.com wrote: On Mon,

_timestamp default not used when path is specified

2014-09-09 Thread Gastón Tonietti
Hi guys, I have a problem setting up my _timestamp strategy. I have an index with _timestamp enabled with the following mapping: {"my_index": {"mappings": {"_default_": {"_timestamp": {"enabled": true, "store": true, "default": "now", "path": "my_timestamp"}, "properties": {}}}}} And here is my problem: when

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
This is the code I've been using to index: I'm going to try to fix the running out of space issue and then try slimming down settings. Thank you. public class Indexer { private static final Logger logger = LogManager.getLogger( ESBulkUploader); public static void main(String[] args)

Re: Error while reading elasticsearch data in hadoop program

2014-09-09 Thread gaurav redkar
Hi Costin, I had downloaded the elasticsearch-hadoop-2.1.0.Beta1.zip file and used all the jars from that for the program. Later I even tried replacing all the jars in my program with jars from with elasticsearch-hadoop-2.0.0.zip file, but still facing the same error. On Tue, Sep 9, 2014 at

Re: Error while reading elasticsearch data in hadoop program

2014-09-09 Thread Costin Leau
If by error you mean the ClassNotFoundException, you need to check again your classpath. Also be sure to add es-hadoop to your job classpath (typically pack it with the jar) - the documentation describes some of the options available [1] [1]

Possible JVM issue with missing filter

2014-09-09 Thread Les Barstow
(Apologies in advance for including sample code - having issues with gist at the second...) When running some tests against ElasticSearch 1.3.2, we noticed what appears to be a problem with the missing filter. This problem manifests with JVM 1.7.0_60 (recommended in the ES documentation), but

Re: Phrase wildcard search

2014-09-09 Thread Amish Asthana
My query is something like this : query: { span_near: {clauses: [ {span_multi:{match:{wildcard:{name:{wildcard:comp*} ,{span_multi:{match:{wildcard:{name:{wildcard:engaged} ] ,slop:0
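
With the quotes and braces restored, a query of this shape would presumably look like the sketch below. The archive stripped the punctuation and cut the message short, so the closing part is a reconstruction; `in_order` is an addition of mine, not something visible in the original, and `span_wildcard` is a hypothetical helper.

```python
import json


def span_wildcard(field, pattern):
    """Wrap a wildcard query in span_multi so span_near can use it."""
    return {
        "span_multi": {
            "match": {"wildcard": {field: {"wildcard": pattern}}}
        }
    }


query = {
    "query": {
        "span_near": {
            "clauses": [
                span_wildcard("name", "comp*"),
                span_wildcard("name", "engaged"),
            ],
            "slop": 0,
            "in_order": True,   # assumption: phrase order matters
        }
    }
}

print(json.dumps(query, indent=2))
```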

Re: Possible JVM issue with missing filter

2014-09-09 Thread joergpra...@gmail.com
I can not reproduce this on Mac OS X. The behavior of missing/exists filter for empty fields has slightly changed, see https://github.com/elasticsearch/elasticsearch/issues/5659 Jörg On Tue, Sep 9, 2014 at 11:18 PM, Les Barstow les.bars...@returnpath.com wrote: (Apologies in advance for

Re: Bulk Indexing Problems

2014-09-09 Thread joergpra...@gmail.com
Code looks okay, so it might be just the full volume that is in the way Jörg On Tue, Sep 9, 2014 at 8:44 PM, Joshua P jpetersen...@gmail.com wrote: This is the code I've been using to index: I'm going to try to fix the running out of space issue and then try slimming down settings. Thank

Re: Possible JVM issue with missing filter

2014-09-09 Thread Les Barstow
Thanks. https://github.com/elasticsearch/elasticsearch/issues/7348 is a recently opened bug that addresses my problem. Don't know why 1.7.0_67 fixes it on my Linux box (and others who have replicated it...) -- Les On Tue, Sep 9, 2014 at 4:43 PM, joergpra...@gmail.com joergpra...@gmail.com

Specifying bucket values ordering in Range aggregation

2014-09-09 Thread Raul, Jr. Martinez
Hello, I'd like to add an order parameter to a range aggregation ( use case: sorting the age of records from latest added to oldest) but adding order : {_terms:asc} or _count causes a search error. I'm trying to confirm if range aggregation doesn't really support order parameter at this point

Re: Bulk Indexing Problems

2014-09-09 Thread Joshua P
Thanks! Turns out I was using less space on the VM than I thought; that with a lack of decent error checking and I didn't catch the out-of-space problem. As soon as I added more space, I was able to index everything without a problem. Thanks again. On Tuesday, September 9, 2014 6:49:35 PM
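
A pre-flight check along these lines might have caught the problem earlier. It is only a sketch: the threshold is arbitrary, `has_free_space` is a hypothetical helper, and the path should be whatever `path.data` points to (the current directory is used here so the snippet runs anywhere).

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= min_free_bytes


data_path = "."                 # stand-in for the path.data directory
required = 1 * 1024 ** 3        # 1 GiB, an arbitrary example threshold

if has_free_space(data_path, required):
    print("enough free space, starting bulk indexing")
else:
    print("not enough free space on", data_path)
```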

Re: elasticsearch Java API for function_score query

2014-09-09 Thread Ivan Brusic
Malini, I would suggest starting a new thread instead of adding to an old one. I find the Java API for the boost functions to be confusing, or at least, not as clean as the rest of the Java API. I wonder if the Elasticsearch team would accept a PR. Jörg's example above could be used as a skeleton

Re: stuck thread problem?

2014-09-09 Thread Hieu Nguyen
Thanks for the response, Martijn! We'll consider upgrading, but it'd be great to root cause the issue. On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote: Patrick: I have never seen this, but this means the openjdk on FreeBSD doesn't support cpu sampling of threads.

Re: Boosting a list of field for queries

2014-09-09 Thread Amish Asthana
Hi Folks, I have created test data like this in Sense.
PUT /myccindexallanalyzed/testobject/_bulk
{"index":{"_id":1}}
{"name":"do not","description":"single test word"}
{"index":{"_id":2}}
{"name":"single test word","description":"do not"}
I am trying to search for sin* test and want the document which has this phrase
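
The bulk body above is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. A sketch that produces an equivalent payload from the two documents in the message (`bulk_body` is a hypothetical helper):

```python
import json


def bulk_body(docs):
    """Build a _bulk payload from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"name": "do not", "description": "single test word"}),
    (2, {"name": "single test word", "description": "do not"}),
]

print(bulk_body(docs), end="")
```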