I'm currently evaluating ES and Logstash as a central log management solution.
Aside from storing logs in ES, I will need to do some querying and reporting
as well. This is where I run into trouble, because I'm not sure how to
solve the following problem.
I have a service, where a certain user
Hi,
I have populated an Elasticsearch index in local mode using the node client
with the number of shards set to 1 and the number of replicas set to 0. When I
run a boolean query, it takes 200ms to execute and fetch results. The same
code using the node client in local mode pointing to the cluster running on my
Yes, this is it.
Thanks!
On 9/8/2014 6:03 PM, David Pilato wrote:
I think you are looking for this:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation
--
*David Pilato* | /Technical Advocate/ | *Elasticsearch.com*
@dadoonet
Also congrats for writing a plugin!
For alternative punctuation tokenizing, you can also look at the classic
tokenizer, which is the behavior of Lucene standard tokenization before
3.1, when it switched to Unicode text segmentation.
Best,
Jörg
On Tue, Sep 9, 2014 at 5:36 AM, vineeth mohan
Your question is not very clear, but maybe you refer to JSR 236 for Java
EE.
Elasticsearch was never designed to run as a container managed Java EE
component in a Java EE container, so the question about thread creation is
unrelated.
If you want to use an ES client from within a Java EE
Elasticsearch was never designed to run as a container managed Java EE
component in a Java EE container, so the question about thread creation
is unrelated.
That's exactly my point and that's why I ask this question.
If you want to use an ES client from within a Java EE container,
start a
I think you confuse two things:
- the Java platform (Java Language, Java Virtual Machine)
- the Java EE specification
These two things are different.
As the Java EE specification is community-driven and community-defined, it is
not meant to put any burden or restrictions on the Java platform, or on any
Let me ask a different question then:
Do you agree that if you include ES in your Java EE application
(JAR file) and you create a singleton to manage the ES connection to the
cluster, then in that case the ES engine runs inside your Java EE
application container?
Tasha
On 9
Hi, I was following the example given in the official Elasticsearch
documentation to read data from Elasticsearch using Hadoop, but I am getting
the following error.
java.lang.Exception: java.io.IOException: Type mismatch in value from map:
expected org.apache.hadoop.io.MapWritable, received
Is (2) expected? Is there a buggette?
Anyone familiar with highlighting have any insight?
On Monday, 8 September 2014 10:16:02 UTC+1, mooky wrote:
I have looked at doing highlighting on _all.
I set store: true, and I am getting results.
I expect the contents of _all to be gobbledygook -
Thanks Ivan.
Yes, it was the post filter which was ignored. We use the filtered query only
when the user sends a query string; otherwise (when only exact filters for
specific columns are specified) we use the post filter. It seems strange to
me to use the FilteredQuery when the query string is
Hey mate, sorry solved this in the irc channel.
For future reference, I created the following command to delete all red
indices.
curl -s localhost:9200/_cat/indices 2>&1 | grep ^red | awk '{print $2}' |
while read red; do curl -XDELETE "http://localhost:9200/$red"; done
Regards
John
On Mon, Sep
Dear All,
I am trying to upload to Elasticsearch, via a river, some MySQL data which
looks like this:
product_id product_info_id
'1' 1
'1' 2
'2' 3
'2' 4
I am looking to
I've set
index.number_of_replicas: 0
Is there something else I need to set?
Thanks
On Monday, 8 September 2014 17:56:56 UTC+2, Jörg Prante wrote:
You have to set the replica level from 1 to 0 on a single node to switch
health from yellow to green.
Jörg
On Mon, Sep 8, 2014 at 5:41 PM,
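For reference, the settings update Jörg describes can be sketched as a request body like this (a minimal sketch built in Python; it targets the generic `_settings` endpoint, not any particular index from this thread):

```python
import json

# Request body for PUT /_settings (or PUT /{index}/_settings) that
# drops the replica count to 0, so a single-node cluster goes green.
body = {"index": {"number_of_replicas": 0}}

print(json.dumps(body))
```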
Setting this parameter has raised some additional questions for me:
If indices.memory.index_buffer_size is set on a specific node and not on
all nodes of the cluster, will this configuration be taken into account
by all nodes? Is it going to be cluster-wide or only for index
I imagine it's caused by your analysis configuration. Use the analyze API
and check what is output for all those terms.
On Sep 9, 2014 5:15 AM, mooky nick.minute...@gmail.com wrote:
Is (2) expected? Is there a buggette?
Anyone familiar with highlighting have any insight?
On Monday, 8
Hello,
Is product_id unique? It seems to me it's not.
In that case, why have you made it the _id?
_id should be unique, and in the face of a duplicate it would be overwritten.
So removing the definition of _id should solve the issue.
Thanks
Vineeth
On Tue, Sep 9, 2014 at 4:09 PM, EzCocos
I am trying to install elasticsearch on a cPanel server with the following
specs:
*OS:* CENTOS 6.5 x86_64
*Virtual Machine:* virtuoso
*WHM Version:* 11.44.1 (build 18)
I have used yum to install elasticsearch using the install guide provided
here:
Hi,
I have a list of terms (type: string), and I want to retrieve all documents
that have those terms. The term field is like a unique incremental sequence
field.
Which would be more performant: a terms query, or a terms filter with a
match_all? Or some totally different approach?
Thanks,
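For comparison, the two request bodies would look roughly like this (1.x-era syntax; the field name `seq` and the term values are made-up placeholders):

```python
import json

terms = ["1001", "1002", "1003"]  # placeholder term list

# Option 1: a terms query -- scored, runs in the query phase.
terms_query = {"query": {"terms": {"seq": terms}}}

# Option 2: a match_all query filtered by a terms filter -- no
# scoring, and the filter result is cacheable.
terms_filter = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"terms": {"seq": terms}},
        }
    }
}

print(json.dumps(terms_query))
print(json.dumps(terms_filter))
```

Since an exact membership check needs no scoring, the filter form is usually the cheaper of the two.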
Hi Vineeth,
I haven't looked at the plugin Bryan has created.
However, creating a plugin for special characters gives better performance
over the pattern tokenizer or custom filters.
Regards,
Raj
On Tuesday, September 9, 2014 9:06:08 AM UTC+5:30, vineeth mohan wrote:
Hello Bryan ,
Congrats on
Hi Costin,
Thanks for the heads up regarding gist. I will try to follow the guidelines
in the future. As for my program, I am using Elasticsearch Hadoop
v2.1.0.Beta1. I tried your suggestion and changed the output value class
to LinkedMapWritable, but now I am getting the following error.
Many thanks Vineeth for your help.
product_id is indeed unique. If I remove _id I get 4 documents, rather than
2 documents each containing an array of related product_info_id values.
I guess I want _id to be the key under which the same product_id gets
aggregated. In other words, if I remove _id, how
I recently ran into an issue where my cluster is reporting an
IndexMissingException. I tried deleting the faulty index, but I keep
getting the same error returned. How do I fix this problem?
$ curl -XDELETE 'http://localhost:9200/logstash-2014.09.04.11'
Most likely you have a classpath conflict caused by multiple versions of es-hadoop. Can you double check you only have
one version (2.1.0.Beta1) available?
Based on the error, I'm guessing you have some 1.3 Mx or the RC somewhere in
there...
On 9/9/14 4:06 PM, gaurav redkar wrote:
Hi Costin,
Does the Couchbase/Elasticsearch plugin support parent/child relationships
using the XDCR architecture?
Hello ,
In that case you should make product_id the _type rather than the _id.
Thanks
Vineeth
On Tue, Sep 9, 2014 at 6:38 PM, EzCocos jmdepi...@gmail.com wrote:
Many thanks Vineeth for your help.
product_id is indeed unique. If I remove _id I get 4 documents rather than
2 documents
I have an update script that updates the contents of a nested object array,
and it works fine in the normal case, but sometimes we want to be able
to reject an update due to invalid parameters (such as a missing mandatory
field) and return an error.
I can set the ctx.op to none so that the
Wondering if anyone has any input on this?
On Monday, 8 September 2014 10:53:24 UTC+1, Alastair James wrote:
Hi there.
I recently had a snapshot (ec2 plugin to s3) fail on one shard with a
NullPointerException. Unfortunately I did not keep the exact error message.
However, I tried the
Sorry, but I don't understand. I tried replacing _id with _type and I get 4
documents.
Would you be able to give me the related short mapping that would work? For
instance for the data:
product_id product_info_id
'1' 1
'1'
This is not an ES issue. As said on IRC, because of shell interpretation of
? or *, you should always use quotes around curl requests, like this
curl 'http://localhost:9200/_nodes/process?pretty'
Jörg
On Tue, Sep 9, 2014 at 2:17 PM, Ali Samii aalisa...@gmail.com wrote:
I am trying to install
It depends.
By the ES engine I presume you mean the machinery to search and index.
If you use an embedded single node in a JEE container: yes.
If you use a NodeClient in a JEE container with data, but being part of an ES
cluster: yes and no.
If you use a TransportClient: no, you just connect a client to
Hi there!
I'm trying to do a one-time index of about 800,000 records into an instance
of Elasticsearch, but I'm having a bit of trouble. It continually fails
around 200,000 records. Looking at it in the Elasticsearch Head plugin, my
index goes offline and becomes unrecoverable.
For now, I have
Hi there! Sorry I posted two topics. I somehow managed to post an
incomplete post.
I'm trying to do a one-time index of about 800,000 records into an instance
of Elasticsearch, but I'm having a bit of trouble. It continually fails
around 200,000 records. Looking at it in the Elasticsearch
If you use NodeClient in JEE container with data, but being part of an ES
cluster: yes and no.
(think you mean without data)
That's what I do. Why yes and no? IMHO it's yes and if this is the case, it
should respect the Java EE specs :)
On 9 September 2014 16:14, joergpra...@gmail.com
You can definitely use the Pattern Tokenizer to define your own token
separators (i.e. the word boundary breaks), but you will add complexity and
lose some of the benefits of the StandardTokenizer.
First, regarding complexity, if you want certain characters to not become
token separators,
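As a sketch of the pattern tokenizer route (the whitespace-only pattern below is purely an illustrative assumption, chosen so that punctuation does not become a token separator):

```python
import json

# Index-settings fragment defining a custom analyzer backed by the
# pattern tokenizer; the pattern "\s+" splits on whitespace only,
# so punctuation stays attached to the neighboring token.
settings = {
    "analysis": {
        "tokenizer": {
            "my_pattern_tokenizer": {"type": "pattern", "pattern": "\\s+"}
        },
        "analyzer": {
            "my_analyzer": {
                "type": "custom",
                "tokenizer": "my_pattern_tokenizer",
                "filter": ["lowercase"],
            }
        },
    }
}

print(json.dumps(settings))
```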
We use Flume at Gigya to ship logs and application events into
Elasticsearch (and kibana).
We're using the flume-ng elasticsearch sink, but we've encountered a few
issues with it, so we created an extended version of the logstash
serializer that the sink uses, which fixes some bugs contained
Again: why do you want ES to respect the JEE specs? It is not a JEE component
at all and never will be.
You can run anything in a JEE container, including software which is not a JEE
component.
Running a NodeClient (with or without data) means the node becomes
part of the ES cluster, i.e. the JVM
Hello Joshua,
I have a feeling this has something to do with the thread pool.
There is a limit on the number of feeds that can be queued for indexing.
Try increasing the size of the thread pool queue for index and bulk to a large
number.
Also, through the cluster node API on the thread pool, you can see if any request
Hello Joshua ,
Please refrain from posting the same question twice.
If you need to add additional information , just reply to the original
thread.
Thanks
Vineeth
On Tue, Sep 9, 2014 at 7:54 PM, Joshua P jpetersen...@gmail.com wrote:
Hi there! Sorry I posted two topics. I've somehow
Thanks for the reply, Vineeth!
What's a practical heap size? I've seen some people saying they set it to
30gb but this confuses me because in the /etc/default/elasticsearch file,
the comment suggests the max is only 1gb?
I'll look into the threadpool issue. Is there a Java API for monitoring
You also said you wouldn't recommend indexing that much information at
once. How would you suggest breaking it up and what status should I look
for before doing another batch? I have to come up with some process that is
repeatable and mostly automated.
On Tuesday, September 9, 2014 11:12:59
Hello Joshua ,
I am not sure which variable you are referring to in the memory settings in
the config file; please paste the comment and config.
I usually change the config from the init.d script.
The best approach would be to bulk index, say, 10,000 feeds in sync mode, wait
until everything is
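The batching approach described above can be sketched as follows; the batch size of 10,000 and the `bulk_index` call are assumptions for illustration, not a real API:

```python
def chunks(items, size=10000):
    """Yield successive fixed-size batches from a list of documents."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Usage sketch: index each batch synchronously and wait for the bulk
# response before sending the next one (bulk_index is hypothetical):
#
#     for batch in chunks(docs, 10000):
#         bulk_index(batch)  # block until ES acknowledges the batch
```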
You can run anything in a JEE container, also software which is not a JEE
component.
That's exactly the point of the question :)
From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On
Behalf Of joergpra...@gmail.com
Sent: Tuesday, September 9, 2014 16:51
To
Here is /etc/default/elasticsearch
# Run Elasticsearch as this user ID and group ID
#ES_USER=elasticsearch
#ES_GROUP=elasticsearch
# Heap Size (defaults to 256m min, 1g max)
ES_HEAP_SIZE=512m
# Heap new generation
#ES_HEAP_NEWSIZE=
# max direct memory
#ES_DIRECT_SIZE=
# Maximum number of open
Hi,
I have an ES index where I store companies indexed by name. Is there any
way I can search by similar words?
For example, I have the name "Discount Pets Supplies" and I want to also
be able to query like "Discount Pets and Supplies", so basically I'm asking
if there is a way to mark as
Hello,
The direction you are heading is that of the synonym analyzer.
You can declare your synonym set and configure it as a synonym token
filter.
What this means is that, while indexing, if it sees one of the synonyms, it
will register the others against that same position.
SYNONYM TOKEN FILTER -
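A sketch of what such a synonym token filter might look like in the index settings (the rule "&, and" and the analyzer names are illustrative assumptions, not taken from this thread):

```python
import json

# Index-settings fragment declaring a synonym token filter. The rule
# "&, and" registers both tokens at the same position while indexing,
# so a query for one matches the other.
settings = {
    "analysis": {
        "filter": {
            "my_synonyms": {"type": "synonym", "synonyms": ["&, and"]}
        },
        "analyzer": {
            "name_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "my_synonyms"],
            }
        },
    }
}

print(json.dumps(settings))
```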
A filtered query with no explicit query will ultimately be translated into
a match-all/constant-score query at the Lucene level. I prefer to
explicitly define all my match all queries and use the specific post filter
name, and not the old filter name, which was deprecated due to its
ambiguity.
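The two forms discussed in this thread would look roughly like this (1.x syntax; the `status`/`active` field and value are made-up placeholders):

```python
import json

# Exact filters only -> post_filter, applied after the query phase
# with an explicit match_all as the query:
post_filter_body = {
    "query": {"match_all": {}},
    "post_filter": {"term": {"status": "active"}},
}

# Query string present -> filtered query, with the match_all replaced
# by the user's query_string query:
filtered_body = {
    "query": {
        "filtered": {
            "query": {"query_string": {"query": "discount pets"}},
            "filter": {"term": {"status": "active"}},
        }
    }
}

print(json.dumps(post_filter_body))
print(json.dumps(filtered_body))
```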
Set ES_HEAP_SIZE to at least 1 GB. For smaller heaps like 512m and indexing
around 1 million docs, you need some more fine tuning, which is
complicated. Your machine is OK to set the heap to 4 GB, which is 50% of 8
GB RAM.
Jörg
On Tue, Sep 9, 2014 at 5:39 PM, Joshua P jpetersen...@gmail.com
Hi Jörg,
Can you elaborate on what you mean by "I still need more fine tuning"?
I've upped the heap size to 4g (in both places I mentioned before because
it's not clear to me which one ES actually uses). I haven't tried to index
again yet.
Other than throttling my indexing, what are some
Hi Folks
We have a bunch of fields for a document.
Let's call them field1, field2, ..., FIELD1, FIELD2, ...
When we search for a particular text, we want to see those records first
which have that text in FIELD1...FIELDn, before records with it in field1,
field2, etc.
Now the complications :
a) We have
Just reran the indexer and found this error coming up. I'm running out of
disk space on the partition ES wants to write to.
F38KqHhnRDWtiJCss5Wz0g -- INTERNAL_SERVER_ERROR --
TranslogException[[index_type][0] Failed to write operation
You mentioned problems around 200,000 docs. What are these problems and how
do you think you can fix them? What does your bulk indexing procedure look
like?
By finetuning I mean slimming down all ES settings to the absolute minimum
to slow down indexing and allocate less resources. But in your
Check the path.data setting in config/elasticsearch.yml
Jörg
On Tue, Sep 9, 2014 at 7:50 PM, Joshua P jpetersen...@gmail.com wrote:
Just reran the indexer and found this error coming up. I'm running out of
disk space on the partition ES wants to write to.
F38KqHhnRDWtiJCss5Wz0g --
We have a requirement for phrase wildcard search in Elasticsearch.
The requirement is to search for, let's say, "Barce* Me*i", and it
should give any document which has "Barcelona Messi" as a phrase in ANY
field.
We have been able to do it using span_near with slop 0, and it works fine.
The
Hopefully Mike McCandless will get some of the new Lucene features into
Elasticsearch:
http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html
I suspect it will come soon.
--
Ivan
On Mon, Sep 8, 2014 at 2:11 PM, Nikolas Everett nik9...@gmail.com wrote:
On Mon,
Hi guys,
I have a problem setting up my _timestamp strategy. I have an index with
_timestamp enabled with the following mapping:
"my_index": {
  "mappings": {
    "_default_": {
      "_timestamp": {
        "enabled": true,
        "store": true,
        "default": "now",
        "path": "my_timestamp"
      },
      "properties": {}
    }
  }
}
And here is my problem: when
This is the code I've been using to index:
I'm going to try to fix the running out of space issue and then try
slimming down settings. Thank you.
public class Indexer {
private static final Logger logger = LogManager.getLogger("ESBulkUploader");
public static void main(String[] args)
Hi Costin,
I had downloaded the elasticsearch-hadoop-2.1.0.Beta1.zip file and used all
the jars from it for the program. Later I even tried replacing all the
jars in my program with the jars from the elasticsearch-hadoop-2.0.0.zip file,
but I am still facing the same error.
On Tue, Sep 9, 2014 at
If by error you mean the ClassNotFoundException, you need to check your
classpath again. Also be sure to add es-hadoop to your job classpath
(typically by packing it with the jar); the documentation describes some of
the options available [1]
[1]
(Apologies in advance for including sample code; having issues with gist
at the moment...)
When running some tests against ElasticSearch 1.3.2, we noticed what
appears to be a problem with the missing filter. This problem manifests
with JVM 1.7.0_60 (recommended in the ES documentation), but
My query is something like this:
"query": {
  "span_near": {
    "clauses": [
      {"span_multi": {"match": {"wildcard": {"name": {"wildcard": "comp*"}}}}},
      {"span_multi": {"match": {"wildcard": {"name": {"wildcard": "engaged"}}}}}
    ],
    "slop": 0
  }
}
I can not reproduce this on Mac OS X.
The behavior of missing/exists filter for empty fields has slightly
changed, see https://github.com/elasticsearch/elasticsearch/issues/5659
Jörg
On Tue, Sep 9, 2014 at 11:18 PM, Les Barstow les.bars...@returnpath.com
wrote:
(Apologies in advance for
Code looks okay, so it might be just the full volume that is in the way
Jörg
On Tue, Sep 9, 2014 at 8:44 PM, Joshua P jpetersen...@gmail.com wrote:
This is the code I've been using to index:
I'm going to try to fix the running out of space issue and then try
slimming down settings. Thank
Thanks. https://github.com/elasticsearch/elasticsearch/issues/7348 is a
recently opened bug that addresses my problem. Don't know why 1.7.0_67
fixes it on my Linux box (and others who have replicated it...)
--
Les
On Tue, Sep 9, 2014 at 4:43 PM, joergpra...@gmail.com joergpra...@gmail.com
Hello,
I'd like to add an order parameter to a range aggregation (use case:
sorting the age of records from latest added to oldest), but adding
"order": {"_terms": "asc"} or "_count" causes a search error.
I'm trying to confirm whether the range aggregation really doesn't support an
order parameter at this point
Thanks!
Turns out I was using less space on the VM than I thought; that, combined
with a lack of decent error checking, meant I didn't catch the out-of-space
problem. As soon as I added more space, I was able to index everything
without a problem.
Thanks again.
On Tuesday, September 9, 2014 6:49:35 PM
Malini, I would suggest starting a new thread instead of adding to an old
one.
I find the Java API for the boost functions to be confusing, or at least,
not as clean as the rest of the Java API. I wonder if the Elasticsearch
team would accept a PR. Jörg's example above could be used as a skeleton
Thanks for the response, Martijn! We'll consider upgrading, but it'd be
great to root cause the issue.
On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
Patrick: I have never seen this, but this means the openjdk on FreeBSD
doesn't support cpu sampling of threads.
Hi Folks
I have created test data like this in Sense.
PUT /myccindexallanalyzed/testobject/_bulk
{"index":{"_id":"1"}}
{"name":"do not","description":"single test word"}
{"index":{"_id":"2"}}
{"name":"single test word","description":"do not"}
I am trying to search for "sin* test" and want the document which has this
phrase
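One way to express a leading-wildcard phrase like "sin* test" is a span_near whose first clause wraps a prefix query via span_multi; this is a sketch under the assumption that the field is `name`, not a verified answer to this thread:

```python
import json

# span_multi lets a multi-term query (here a prefix query for "sin")
# act as a span clause; span_near with slop 0 and in_order then
# requires the exact term "test" to follow it immediately.
body = {
    "query": {
        "span_near": {
            "clauses": [
                {"span_multi": {"match": {"prefix": {"name": {"value": "sin"}}}}},
                {"span_term": {"name": "test"}},
            ],
            "slop": 0,
            "in_order": True,
        }
    }
}

print(json.dumps(body))
```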