date:20140418

Re: Can we perform the text search presnet in the images or pdf files through elasticsearch

2014-04-18 Thread Rafał Kuć

Hello!

Please look at the attachment plugin for Elasticsearch: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-attachment-type.html

It uses Apache Tika under the hood. The list of supported formats is
available here: http://tika.apache.org/0.10/formats.html

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


 Hi ES users,

 Is there any way to search the text contained in images or PDF
 files through Elasticsearch?

 Suppose I have a PDF or image file indexed in ES (stored in base64
 format), and that file contains the text "prashant". Is there a way
 to search for "prashant" and get the record for that file back?



 --
 View this message in context:
 http://elasticsearch-users.115913.n3.nabble.com/Can-we-perform-the-text-search-presnet-in-the-images-or-pdf-files-through-elasticsearch-tp4054367.html
 Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/588849345.20140418080555%40alud.com.pl.
For more options, visit https://groups.google.com/d/optout.

Re: Can we perform the text search presnet in the images or pdf files through elasticsearch

2014-04-18 Thread Prashant Agrawal

Hi,

If I am not mistaken, you are talking about
https://github.com/elasticsearch/elasticsearch-mapper-attachments

With this plugin I can index attachments (say, a PDF file), and they will be
stored base64-encoded. Does the plugin also make the text inside the PDF file
searchable?

If so, what is returned when I search for a keyword in an attachment:
the extracted text or the base64-encoded data?

~Prashant




Re: Can we perform the text search presnet in the images or pdf files through elasticsearch

2014-04-18 Thread Rafał Kuć

Hello!

You'll need to send the file contents to Elasticsearch in base64 form,
and Elasticsearch will use Tika to extract data from the file.

However, in a typical case you would store not the whole contents
of the binary file (which can be quite big), but rather a path to the
file, so that the application querying Elasticsearch knows where
to look for the original file.
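
As a rough illustration of the two points above, the Python sketch below builds a document body that carries the base64-encoded content (for Tika to extract text from) together with the path to the original file. The field names `file` and `path` and the sample bytes are assumptions made for this example; the real field name depends on how the attachment field is mapped in your index.

```python
import base64
import json


def attachment_document(raw_bytes, source_path):
    """Build a document with base64 content plus the original file's path."""
    return {
        # Hypothetical field names; use whatever your own mapping defines.
        "file": base64.b64encode(raw_bytes).decode("ascii"),
        "path": source_path,
    }


# Sample bytes stand in for the contents of a real PDF read from disk.
doc = attachment_document(b"%PDF-1.4 sample bytes", "/data/docs/report.pdf")
print(json.dumps(doc))
```

The application that queries Elasticsearch can then read `path` from a hit and open the original file itself.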

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: Can we perform the text search presnet in the images or pdf files through elasticsearch

2014-04-18 Thread Prashant Agrawal

So can I say that the mapper-attachments plugin works as follows:
whether I send a text file, a PDF file, or an image file to ES, the plugin
will extract the *text content* in all three scenarios and store it
in ES, where it will then be available for search as well?




Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Srinivasan Ramaswamy

I would like to influence the ranking with a few fields that are not stored
in the index (e.g. click data for keyword-document pairs). I have used a custom
SearchComponent in Solr to implement similar functionality in the past, and I
am wondering how I can achieve the same in Elasticsearch.

I know this is a very old thread, but I didn't find much information
on how to do custom scoring (in Elasticsearch) with data that's not stored
in the index. This thread looked very relevant to my requirement, so I am trying
to see whether you have solved similar requirements with Elasticsearch.

Thanks
Srini

On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:

Hi Otis,

So if I understand it correctly (providing my knowledge is quite limited
here) you are asking if
1) it is possible to hook into query processing flow and inject or extend
custom handlers for individual flow phases and
2) if we can find in ES the same functionality which is currently provided
by components listed here: http://wiki.apache.org/solr/SearchComponent (or
here:
http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html
).

As for #1, frankly, I do not know. I have been playing with plugins a bit
but have not had a chance to explore their full potential yet. I remember
that Shay mentioned that not every aspect of ES is pluggable now, but that
is all I know about it (personally, I have not hit the limits myself yet;
maybe I would if I wanted to employ Carrot2 clustering or something like
that).

As for #2, if you are after one-to-one comparison of Solr SearchComponents
and ES then I think we would find some matches and also some misses. Still
it could be an interesting exercise to do (although we should be careful to
include only those features that do work well in distributed environment).
We could probably end up identifying new feature requests, so this can be
useful.

Regards,
Lukas

On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic otis.gos...@gmail.com wrote:

Hi Lukas,

Yes, SearchComponents are about extensibility, but specifically about
extending how queries are handled within Solr once Solr gets them. I
know ES has other types of plugins, and you've listed several of them,
but I'm wondering about which of them is SearchComponent-like.
I've looked at
http://www.elasticsearch.org/guide/reference/modules/plugins.html
, but couldn't find the answer to my Q there. Maybe I'm looking at
the wrong place?

Thanks,
Otis
--
Sematext is hiring Search Engineers --
http://sematext.com/about/jobs.html

On Sep 6, 2:57 pm, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi,

I am not a Solr expert, but to me it seems that SearchComponents in Solr are
about extensibility of out-of-the-box functionality. If that is the case,
then I would say that we can talk about plugins in the ES world. Although there
is no official doc about how to implement custom plugins yet, it is really
not difficult. Apart from that, there are several plugins that are part of the
distribution (river plugins, attachments mapper, ICU analysis, scripting
languages ... to name a few) and they can be used as an inspiration if a new
plugin implementation is needed.

My 2 cents.

Lukas

On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hello,

A long time Solr user posted a good question about ES over on Sematext
Blog, about an equivalent of Solr's SearchComponents in ES:

http://blog.sematext.com/2010/05/03/elastic-search-distributed-lucene...

I'm curious, too. Thanks.

Otis
--
Sematext is hiring Search Engineers --
http://sematext.com/about/jobs.html


Re: Kibana-auth install under RHEL6 server ?

2014-04-18 Thread Andrea Martines



 No one ?


:( I keep trying but there's always a tool that does not work :/ 


Re: Can we perform the text search presnet in the images or pdf files through elasticsearch

2014-04-18 Thread Rafał Kuć

Hello!

The attachment plugin will use Tika to extract the text from the binary
file content that you send in base64. Tika does a good job with
text extraction; however, you have to test for yourself whether your files
are parsed well enough for your use case.

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: ELK stack needs tuning

2014-04-18 Thread R. Toma

Hi Jörg,

Thank you for pointing me to this article. I needed to read it twice, but I 
think I understand it now.

I believe shard overallocation works for use cases where you want to store and
search 'users' or 'products'. Such data allows you to divide all
documents into groups to be stored in different shards using routing. All
shards get indexed and searched.

But how does this work for logstash indices? I could create 1 index with 
365 shards (if I want 1 year of retention) and use alias routing (alias per 
date with routing to a shard) to index into a different shard every day, 
but after 1 year I need to purge a shard. And purging a shard is not easy. 
It would require a delete of every document in the shard.
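
The alias-per-date setup described above could be created with a single call to the aliases API. The Python sketch below only builds the request body and sends nothing. The index name `logstash`, the alias naming scheme, and the use of the date string as the routing value are assumptions made for illustration.

```python
import json
from datetime import date, timedelta


def daily_alias_actions(index, first_day, days):
    """Build an _aliases request body with one routed alias per day."""
    actions = []
    for offset in range(days):
        stamp = (first_day + timedelta(days=offset)).strftime("%Y.%m.%d")
        actions.append({
            "add": {
                "index": index,
                "alias": f"{index}-{stamp}",  # hypothetical naming scheme
                "routing": stamp,             # routing value used for this day
            }
        })
    return {"actions": actions}


body = daily_alias_actions("logstash", date(2014, 4, 18), 3)
print(json.dumps(body, indent=2))
```

Note that Elasticsearch hashes the routing value to pick a shard, so two dates can end up on the same shard. The sketch therefore does not by itself guarantee one shard per day.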

Or am I missing something?

Regards,
Renzp


On Thursday, 17 April 2014 16:15:43 UTC+2, Jörg Prante wrote:

 17 new indices every day - whew. Why don't you use shard overallocating?


 https://groups.google.com/forum/#!msg/elasticsearch/49q-_AgQCp8/MRol0t9asEcJ

 Jörg




[ANN] Elasticsearch AWS cloud plugin 2.1.1 released

2014-04-18 Thread David Pilato

Heya,


We are pleased to announce the release of the Elasticsearch AWS cloud plugin, 
version 2.1.1.

The Amazon Web Services (AWS) Cloud plugin allows you to use the AWS API for the
unicast discovery mechanism and to add S3 repositories.

https://github.com/elasticsearch/elasticsearch-cloud-aws/

Release Notes - elasticsearch-cloud-aws - Version 2.1.1



Update:
 * [74] - cloud-aws 2.1.0 doesn't support elasticsearch 1.1.1 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/74)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-cloud-aws project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-aws/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team


Re: Wildcard query is not working.

2014-04-18 Thread Dan Tuffery

You're setting the size parameter to 0 in your queries, so no hits will be
returned. Also, you need a copy of the URL value in your index
that is not analyzed, which you can use for your wildcard query. In your
mapping you need to specify that you want to index the URL value verbatim:

"URL": {
    "type": "string",
    "fields": {
        "untouched": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}

Using the mapping above, the URL value will be indexed using the default
standard analyzer. A verbatim copy of the value will also be indexed, as
specified by the 'untouched' field, which you would use in the wildcard
query:

curl -XGET 'http://localhost:9200/message_index/message_indext/_search' -d \
'{"query":{"wildcard":{"URL.untouched":"http://www.mohit-kumar-yadav.com*"}}}'
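
To see why the wildcard has to run against the verbatim copy, here is a small Python sketch. It is only an approximation: `crude_tokens` is a hypothetical stand-in that splits on non-alphanumeric characters and is not the real standard analyzer, `fnmatch` merely imitates term-level wildcard matching, and the sample URL is made up for the example.

```python
import re
from fnmatch import fnmatchcase


def crude_tokens(text):
    """Very rough stand-in for analysis: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def wildcard_hits(terms, pattern):
    """A wildcard pattern is compared against each indexed term separately."""
    return [t for t in terms if fnmatchcase(t, pattern)]


url = "http://www.mohit-kumar-yadav.com/home_user.html"

analyzed_terms = crude_tokens(url)  # roughly what an analyzed field holds
verbatim_terms = [url]              # what a not_analyzed field holds

# No single token contains "mohit-kumar-yadav", so nothing matches.
print(wildcard_hits(analyzed_terms, "*mohit-kumar-yadav*"))
# The verbatim term starts with the prefix, so it matches.
print(wildcard_hits(verbatim_terms, "http://www.mohit-kumar-yadav.com*"))
```

The analyzed field holds several short terms, none of which contains the hyphenated host name, while the verbatim field holds the whole URL as one term.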

Dan

On Thursday, April 17, 2014 8:55:20 PM UTC+1, Mohit Kumar Yadav wrote:

 Hi folks,
 In my documents there is a field which contains only a URL as its value,
 for example:
 {"URL" : "http://www.mohit-kumar-yadav.com\123124343\login_user.html"}
 {"URL" : "http://www.mohit-kumar-yadav.com\home_user.html"}
 How can I search these documents?
 I am using the following queries:

 1. curl -XGET 'http://localhost:9200/message_index/message_indext/_search?size=0' \
    -d '{"query":{"wildcard":{"URL":"*mohit-kumar-yadav*"}}}'

 No result; the query returns zero hits.

 2. curl -XGET 'http://localhost:9200/message_index/message_indext/_search?size=0' \
    -d '{"query":{"field":{"URL":"http://www.mohit-kumar-yadav.com"}}}'

 No result; the query returns zero hits.

 3. curl -XGET 'http://localhost:9200/message_index/message_indext/_search?size=0' \
    -d '{"fuzzy_like_this_field" : {"URL" : {"like_text" : "www.mohit-kumar-yadav.com", "max_query_terms" : 25}}}'

 No result; the query returns zero hits.


 Please tell me what I am doing wrong.

 Thanks in advance!

 Regards
 Mohit
  


Re: Is ElasticSearch the Right Tool for This

2014-04-18 Thread Clinton Gormley

Hiya

It's a bit more verbose, but yes you can do queries like that easily.  I've
assumed that all of your fields are exact value not_analyzed string
fields, rather than full text fields:

GET /_search
{
  "_source": [ "col1", "col2" ],
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  { "terms": { "col3": [ "some", "value" ]}},
                  { "missing": { "field": "col3" }}
                ]
              }
            },
            {
              "bool": {
                "should": [
                  { "terms": { "col4": [ "another", "set", "values" ]}},
                  { "missing": { "field": "col4" }}
                ]
              }
            },
            { "term": { "col5": "hello" }}
          ],
          "must_not": [
            { "term": { "col6": "world" }}
          ]
        }
      }
    }
  },
  "sort": "col7"
}

All of those lookups use filters, so would be cached, making all future
executions very fast indeed.
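
Since the plan is to issue thousands of such queries, the request body could be generated rather than written by hand. The Python sketch below rebuilds the same request using the same `filtered`/`missing` syntax. The helper names `in_or_missing` and `build_search` are hypothetical, and treating an empty string in the SQL list as "field is missing" is the interpretation assumed here.

```python
import json


def in_or_missing(field, values):
    """Filter for `field IN values`; an empty string means 'field is missing'."""
    present = [v for v in values if v != ""]
    clauses = []
    if present:
        clauses.append({"terms": {field: present}})
    if len(present) != len(values):
        clauses.append({"missing": {"field": field}})
    return {"bool": {"should": clauses}}


def build_search(select, must, must_not, sort):
    """Wrap the filter clauses in a filtered query with _source and sort."""
    return {
        "_source": select,
        "query": {"filtered": {"filter": {"bool": {
            "must": must,
            "must_not": must_not,
        }}}},
        "sort": sort,
    }


body = build_search(
    select=["col1", "col2"],
    must=[
        in_or_missing("col3", ["", "some", "value"]),
        in_or_missing("col4", ["another", "set", "", "values"]),
        {"term": {"col5": "hello"}},
    ],
    must_not=[{"term": {"col6": "world"}}],
    sort="col7",
)
print(json.dumps(body, indent=2))
```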


On 18 April 2014 08:37, Paul paulj3...@gmail.com wrote:

 Hi,

 We're looking to move our infrastructure to ElasticSearch and I have some
 concerns.  We plan on using it more as a database and less as a search
 engine.  I know there are some companies out there that are doing this, but
 I have some queries where, with one SQL command, I can get the results I
 need, whereas with ElasticSearch I would need to do filters of queries, etc.


 An example, using SQL parlance: how would I do the following statement?

 select col1, col2 from mytable where col3 in ["", "some", "value"] and col4
 in ["another", "set", "", "values"] and col5 = "hello" and col6 not in
 "world" order by col7.

 This is an example of some data I would be querying, and I would be
 performing 1000's of queries at a time.



 So my question:  Can ElasticSearch do this and, if so, how can I do the
 above query?


Setting Node ID

2014-04-18 Thread Michael Salmon

I'm planning on trying out multiple nodes on one host and I'd like to be able
to control the node id, but as far as I can see it is set in NodeEnvironment
to the first unused value. The reason for setting the id is that I would
like to include it in the node name, which I currently set to the hostname.

How do others handle this?


Word count per document

2014-04-18 Thread Aharon Twizer

Hi,

I'm new to ElasticSearch.

What I want to do is to upload a few hundred documents and then look for 
words in those documents.

The most important part is to get the count of each word per document,
e.g. if I look for the word "boy", the answer I'll get is that it appears 3
times in document A and 5 times in document B.

Can I do that with ElasticSearch?

Thanks in advance!

Cheers,
Aharon.


Re: Word count per document

2014-04-18 Thread Itamar Syn-Hershko

Yes, take a look here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-termvectors.html
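
A term vectors response reports a `term_freq` for each term in a field, which is the per-document count asked about. The Python sketch below pulls that number out of already-parsed responses. The field name `text` and the two sample responses are hand-written assumptions for illustration, not output captured from a cluster.

```python
def term_frequency(termvector_response, field, term):
    """Return how often `term` occurs in `field` of one document, or 0."""
    terms = (
        termvector_response.get("term_vectors", {})
        .get(field, {})
        .get("terms", {})
    )
    return terms.get(term, {}).get("term_freq", 0)


# Hand-written sample responses keyed by document id (assumed data).
responses = {
    "A": {"term_vectors": {"text": {"terms": {
        "boy": {"term_freq": 3},
        "dog": {"term_freq": 1},
    }}}},
    "B": {"term_vectors": {"text": {"terms": {
        "boy": {"term_freq": 5},
    }}}},
}

counts = {doc_id: term_frequency(resp, "text", "boy")
          for doc_id, resp in responses.items()}
print(counts)  # {'A': 3, 'B': 5}
```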

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer Consultant
Author of RavenDB in Action http://manning.com/synhershko/


Re: Word count per document

2014-04-18 Thread Aharon Twizer

Thanks Itamar.

But with the Term Vector I'll have to make a separate call for each 
document (I can have up to 20K documents).

I want to be able to make a single call with the word I'm looking for and 
to get the statistics for each document.



Re: Word count per document

2014-04-18 Thread Itamar Syn-Hershko

You should be able to do this using the aggregations framework:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html

The idea is that you bucket on document ID, and then on terms, then do a count.

But I'm not sure it was designed to handle this scenario, where you have
tens of thousands of buckets and then many unique terms in each bucket.
Maybe someone from ES core can chime in on that.

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer Consultant
Author of RavenDB in Action http://manning.com/synhershko/


Elasticsearch on java7u55 ?

2014-04-18 Thread Lukáš Vlček

Hi,

Is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is it
safe and recommended?

I found that Robert and Uwe discussed this Java version here:
http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
I found a couple of failed builds in
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16 that
might be related to this version of Java, but all seemed to be rather Solr
related.

Regards,
Lukas


Re: Need some help for creating my model

2014-04-18 Thread Stefan Kruse

OK, new try. Is it generally possible to do this with the PHP API? I can't find
anything in the documentation; maybe I'm overlooking it. Regards, Stefan


Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Jason Wee

will these two links help?
https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
http://people.apache.org/~mikemccand/lucenebench/indexing.html

The Lucene performance test is using Java 1.7.0 u40. That's the same version I'm
using for Lucene 4.6.0.

jason


Re: logstash 1.4.0 debian package init script not working

2014-04-18 Thread Goofy03

Have you checked the permissions on /opt/logstash, /var/log/logstash and
/etc/logstash? Do they belong to the same user as in the init script?

That solved it for me on Debian, but I still get no events when the Apache log is
updated, whereas if I run it as root (from the console) everything works.
Oh, and I have added the logstash user to the adm group.

On Friday, 18 April 2014 06:36:51 UTC+2, OJ LaBoeuf wrote:

 The upstart job also doesn't seem to work, it just keeps dying over and 
 over again never logging anything to the logfile.  

 If i manually start logstash everything works normally.

 On Thursday, April 17, 2014 6:12:38 PM UTC-7, OJ LaBoeuf wrote:

 Running Ubuntu 12.04 64bit, the logstash init script does not work.

 here's the script that came with logstash deb

 In particular I don't understand how the script is trying to parse 
 something from the logstash pid, before it even starts the program..?

   log_daemon_msg "Starting $DESC"

   # Parse the actual JAVACMD from the process' environment, we don't care about errors.
   JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
   if start-stop-daemon --test --start --pidfile "$PID_FILE" \
      --user "$LS_USER" --exec "$JAVA" \
      >/dev/null; then
      # Prepare environment

 I checked and JAVA is empty at this location, so what the heck is this 
 trying to do?


 running this bit:
 sudo start-stop-daemon --test --start --pidfile /var/run/logstash.pid 
 --user logstash --exec 

 results in the same message i get at the commandline when trying to 
 /etc/init.d/logstash start
 start-stop-daemon: unable to stat  (No such file or directory)
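
A minimal Python sketch of what the JAVA=$(...) line in the excerpt above appears to do: read the pid from the pid file, then look for JAVACMD in /proc/<pid>/environ, swallowing every error. The function name and the proc_root parameter are hypothetical, added only so the logic can be tried without a real process. When no pid file exists yet (a first start), the result is an empty string, which would hand --exec an empty path and is consistent with the "unable to stat" message.

```python
import os
import tempfile


def javacmd_from_pidfile(pid_file, proc_root="/proc"):
    """Mimic the init script's JAVA=$(...) line; return '' on any error."""
    try:
        with open(pid_file) as f:
            pid = f.read().strip()
        with open(os.path.join(proc_root, pid, "environ"), "rb") as f:
            entries = f.read().split(b"\0")  # environ is NUL-separated
    except OSError:
        return ""  # the shell version hides errors with 2>/dev/null
    for entry in entries:
        if entry.startswith(b"JAVACMD="):
            # Everything after the first '=' (the shell's cut keeps field 2 only).
            return entry.split(b"=", 1)[1].decode()
    return ""


# On a first start the pid file does not exist, so the result is empty.
missing_pid_file = os.path.join(tempfile.mkdtemp(), "logstash.pid")
print(repr(javacmd_from_pidfile(missing_pid_file)))
```

If that reading is right, the test before the first start can only succeed when JAVA is set some other way, so checking the value in $DEFAULT and the pid file path would be a reasonable next step.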


 Please advise.



 Full init script pasted below 


 #!/bin/bash
 #
 # /etc/init.d/logstash -- startup script for LogStash.
 #
 ### BEGIN INIT INFO
 # Provides:  logstash
 # Required-Start:$all
 # Required-Stop: $all
 # Default-Start: 2 3 4 5
 # Default-Stop:  0 1 6
 # Short-Description: Starts logstash
 # Description:   Starts logstash using start-stop-daemon
 ### END INIT INFO

 set -e

 NAME=logstash
 DESC="Logstash Daemon"
 DEFAULT=/etc/default/$NAME

 if [ `id -u` -ne 0 ]; then
    echo "You need root privileges to run this script"
    exit 1
 fi

 . /lib/lsb/init-functions

 if [ -r /etc/default/rcS ]; then
    . /etc/default/rcS
 fi

 # The following variables can be overwritten in $DEFAULT
 PATH=/bin:/usr/bin:/sbin:/usr/sbin

 # See contents of file named in $DEFAULT for comments
 LS_USER=logstash
 LS_GROUP=logstash
 LS_HOME=/var/lib/logstash
 LS_HEAP_SIZE=500m
 LS_JAVA_OPTS=-Djava.io.tmpdir=${LS_HOME}
 LS_LOG_FILE=/var/log/logstash/$NAME.log
 LS_CONF_DIR=/etc/logstash/conf.d
 LS_OPEN_FILES=16384
 LS_NICE=19
 LS_OPTS=
 LS_PIDFILE=/var/run/$NAME.pid

 # End of variables that can be overwritten in $DEFAULT

 # overwrite settings from default file
 if [ -f "$DEFAULT" ]; then
    . "$DEFAULT"
 fi

 # Define other required variables
 PID_FILE=${LS_PIDFILE}
 DAEMON=/opt/logstash/bin/logstash
 DAEMON_OPTS="agent -f ${LS_CONF_DIR} -l ${LS_LOG_FILE} ${LS_OPTS}"

 # Check DAEMON exists
 if ! test -e $DAEMON; then
    log_failure_msg "Script $DAEMON doesn't exist"
    exit 1
 fi

 case "$1" in
    start)
       if [ -z "$DAEMON" ]; then
          log_failure_msg "no logstash script found - $DAEMON"
          exit 1
       fi

       # Check if a config file exists
       if [ ! "$(ls -A $LS_CONF_DIR/*.conf 2> /dev/null)" ]; then
          log_failure_msg "There aren't any configuration files in $LS_CONF_DIR"
          exit 1
       fi

       log_daemon_msg "Starting $DESC"

       # Parse the actual JAVACMD from the process' environment, we don't care about errors.
       JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
       if start-stop-daemon --test --start --pidfile "$PID_FILE" \
          --user "$LS_USER" --exec "$JAVA" \
       >/dev/null; then
          # Prepare environment
          HOME="${HOME:-$LS_HOME}"
          JAVA_OPTS="${LS_JAVA_OPTS}"
          ulimit -n ${LS_OPEN_FILES}
          cd "${LS_HOME}"
          export PATH HOME JAVACMD JAVA_OPTS LS_HEAP_SIZE LS_JAVA_OPTS LS_USE_GC_LOGGING

          # Start Daemon
          start-stop-daemon --start -b --user "$LS_USER" -c "$LS_USER":"$LS_GROUP" \
             -d "$LS_HOME" --nicelevel "$LS_NICE" --pidfile "$PID_FILE" --make-pidfile \
             --exec $DAEMON -- $DAEMON_OPTS

          sleep 1

          # Parse the actual JAVACMD from the process' environment, we don't care about errors.
          JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
          if start-stop-daemon --test --start --pidfile "$PID_FILE" \
             --user "$LS_USER" --exec "$JAVA" \
          >/dev/null; then

             if [ -f "$PID_FILE" ]; then
                rm -f "$PID_FILE"
             fi

             log_end_msg 1
          else
             log_end_msg 0
          fi
       else
          log_progress_msg "(already running)"
          log_end_msg

Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber

Yes, you can use the Function Score Query [1] in combination with a native
script written in Java [2]. With the native script you can basically do
whatever you want, but be careful you can significantly impact your query
performance if you are not careful.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts
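
As a rough illustration of the combination described above, here is a Python sketch that only builds the search request body. It assumes the 1.x-era function_score syntax with a script_score function and "lang": "native"; the script name click_boost, its params, and the title field are hypothetical, and the native script itself would still have to be written in Java and registered on every node.

```python
import json


def function_score_body(user_query, script_name, params=None):
    """Build a search body that scores matching docs with a native script."""
    return {
        "query": {
            "function_score": {
                "query": user_query,
                "script_score": {
                    "script": script_name,  # hypothetical registered script name
                    "lang": "native",
                    "params": params or {},
                },
                # Use the script's value as the final score.
                "boost_mode": "replace",
            }
        }
    }


body = function_score_body(
    {"match": {"title": "lion"}},  # hypothetical field and text
    "click_boost",
    {"keyword": "lion"},
)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to the _search endpoint; whether "replace" or another boost_mode fits depends on how the external data should combine with the text score.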

Thanks,
Matt Weber

On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy ursva...@gmail.com wrote:

Thanks
Srini

On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:

Hi Otis,

So if I understand it correctly (providing my knowledge is quite limited
here) you are asking if
1) it is possible to hook into query processing flow and inject or extend
custom handlers for individual flow phases and
2) if we can find in ES the same functionality which is currently
provided by components listed here: http://wiki.apache.org/solr/SearchComponent
(or here: http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).

As for #2, if you are after one-to-one comparison of Solr
SearchComponents and ES then I think we would find some matches and also
some misses. Still it could be an interesting exercise to do (although we
should be careful to include only those features that do work well in
distributed environment). We could probably end up identifying new feature
requests, so this can be useful.

Regards,
Lukas

On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic otis.gos...@gmail.com wrote:

Hi Lukas,

Yes, SearchComponents are about extensibility, but specifically about
extending how queries are handled within Solr once Solr gets them. I
know ES has other types of plugins, and you've listed several of them,
but I'm wondering about which of them is SearchComponent-like.
I've looked at http://www.elasticsearch.org/guide/reference/modules/plugins.html,
but couldn't find the answer to my Q there. Maybe I'm looking at
the wrong place?

Thanks,
Otis
--
Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html

On Sep 6, 2:57 pm, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi,

I am not a Solr expert, but to me it seems that SearchComponents in Solr are
about extensibility of out-of-the-box functionality. If that is the case,
then I would say that we can talk about plugins in the ES world. Although there
is no official doc about how to implement custom plugins yet, it is really
not difficult. Apart from that, there are several plugins that are part of the
distribution (river plugins, attachments mapper, ICU analysis, scripting
languages ... to name a few) and they can be used as an inspiration if a new
plugin implementation is needed.

My 2 cents.

Lukas

On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hello,

A long-time Solr user posted a good question about ES over on the Sematext
Blog, about an equivalent of Solr's SearchComponents in ES:

http://blog.sematext.com/2010/05/03/elastic-search-distributed-lucene...

I'm curious, too. Thanks.

Otis
--
Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html

Getting phrase count for each document separately.

2014-04-18 Thread Amit

I would like to get a phrase count for every document.
I do not wish to run a query for every document; I would rather run one
single query.

For example, if I have the following documents:
{
   "name" : "John",
   "Message" : "The lion is *very* fast"
}

{
   "name" : "Ben",
   "Message" : "The lion is *very very* fast"
}

I would like to query my documents for the word *very* and get back
something like this:
{
   "name" : "John",
   "Message" : "The lion is *very* fast",
   *"score" : 1*
}

{
   "name" : "Ben",
   "Message" : "The lion is *very very* fast",
   *"score" : 2*
}

I have failed to find out how to do this so far. I only found queries that give
the sum of the counts over all documents together (in my example, 3).
How can I do this using an Elasticsearch query?
Thanks in advance.

Getting phrase count for each document separately.

2014-04-18 Thread Amit



I would like to get a phrase count for each document separately.
I do not wish to run a query for every document; I would rather run one
single query.

For example, if I have the following documents:
{
   "name" : "John",
   "message" : "The lion is *very fast*"
}

{
   "name" : "Ben",
   "message" : "The lion is *very fast* and the bardelas is *very fast*"
}

I would like to query my documents for the phrase *very fast* and get
back something like this:
{
   "name" : "John",
   "message" : "The lion is *very fast*",
   *"count" : 1*
}

{
   "name" : "Ben",
   "message" : "The lion is *very fast* and the bardelas is *very fast*",
   *"count" : 2*
}

I have failed to find out how to do this so far. I only found queries that give
the total number of documents that contain the phrase (in my example, 2
documents).
How can I do this using an Elasticsearch query?
Thanks in advance.
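
To pin down the result being asked for, here is a small Python sketch that computes the per-document phrase count on the client side. It is not an Elasticsearch query and does not answer the question; the helper name phrase_count and the whole-word, case-insensitive matching rule are assumptions made for illustration.

```python
import re


def phrase_count(text, phrase):
    """Count non-overlapping, case-insensitive whole-word occurrences of phrase."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


docs = [
    {"name": "John", "message": "The lion is very fast"},
    {"name": "Ben", "message": "The lion is very fast and the bardelas is very fast"},
]

# Attach the count to a copy of each document, as in the desired output.
counted = [dict(doc, count=phrase_count(doc["message"], "very fast")) for doc in docs]
for doc in counted:
    print(doc["name"], doc["count"])
```

For the two example documents this yields a count of 1 for John and 2 for Ben, which is the shape of response the question is after.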

Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Michael McCandless

1.7u55 should be safe for ElasticSearch; we just put out a blog post about
this:

http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/

And I'll fix the nightly Lucene benchmarks to use u55 too! I should NOT
have been using u40: it's not safe.

Mike

http://blog.mikemccandless.com

On Fri, Apr 18, 2014 at 9:52 AM, Jason Wee peich...@gmail.com wrote:

will these two links help?

https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
http://people.apache.org/~mikemccand/lucenebench/indexing.html

lucene performance test is using java 1.70 u40. that's the same version
i'm using for lucene 4.6.0.

jason

On Fri, Apr 18, 2014 at 8:54 PM, Lukáš Vlček lukas.vl...@gmail.com wrote:

Hi,

is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is it
safe and recommended?

I found Robert and Uwe discussed this Java version here:
http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
I found couple of failed builds in
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16
that might be related to this version of Java but all seemed to be rather
Solr related.

Regards,
Lukas

Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Srinivasan Ramaswamy

That's great, thanks for your reply. This looks like a good solution for my
requirement! Is this script applied in each shard? I want to apply this
function to all the documents so that the top N picked from each shard are
picked by my custom score.

Also, can you elaborate a little bit on "be careful you can significantly
impact your query performance if you are not careful"? I would like to
understand the best practices there.

On Friday, April 18, 2014 8:14:54 AM UTC-7, Matt Weber wrote:

Thanks,
Matt Weber

On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy ursv...@gmail.com wrote:

I know this thread is a very old thread, but I didn't find much
information on how to do custom scoring (in Elasticsearch) with data that's
not stored in the index. This thread looked very relevant to my
requirement, so I am trying to see whether you guys have solved similar
requirements with Elasticsearch.

Thanks
Srini

On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:

Hi Otis,

So if I understand it correctly (providing my knowledge is quite limited
here) you are asking if
1) it is possible to hook into query processing flow and inject or
extend custom handlers for individual flow phases and
2) if we can find in ES the same functionality which is currently
provided by components listed here: http://wiki.apache.org/solr/SearchComponent
(or here: http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).

As for #1, frankly, I do not know. I have been playing with plugins a
bit but have not had a chance to explore their full potential yet. I
remember that Shay mentioned that not every aspect of ES is pluggable now,
but that is all I know about it (personally, I have not hit the limits
myself yet; maybe I would if I wanted to employ Carrot2 clustering or
something like that).

As for #2, if you are after one-to-one comparison of Solr
SearchComponents and ES then I think we would find some matches and also
some misses. Still it could be an interesting exercise to do (although we
should be careful to include only those features that do work well in
distributed environment). We could probably end up identifying new feature
requests, so this can be useful.

Regards,
Lukas

On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic otis.gos...@gmail.com wrote:

Hi Lukas,

Yes, SearchComponents are about extensibility, but specifically about
extending how queries are handled within Solr once Solr gets them. I
know ES has other types of plugins, and you've listed several of them,
but I'm wondering about which of them is SearchComponent-like.
I've looked at http://www.elasticsearch.org/guide/reference/modules/plugins.html,
but couldn't find the answer to my Q there. Maybe I'm looking at
the wrong place?

Thanks,
Otis
--
Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html

On Sep 6, 2:57 pm, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi,

I am not a Solr expert, but to me it seems that SearchComponents in Solr are
about extensibility of out-of-the-box functionality. If that is the case,
then I would say that we can talk about plugins in the ES world. Although there
is no official doc about how to implement custom plugins yet, it is really
not difficult. Apart from that, there are several plugins that are part of the
distribution (river plugins, attachments mapper, ICU analysis, scripting
languages ... to name a few) and they can be used as an inspiration if a new
plugin implementation is needed.

My 2 cents.

Lukas

On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hello,

A long-time Solr user posted a good question about ES over on the Sematext
Blog, about an equivalent of Solr's SearchComponents in ES:

http://blog.sematext.com/2010/05/03/elastic-search-distributed-lucene...

I'm curious, too. Thanks.

Otis
--
Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html

Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Lukáš Vlček

Excellent, thanks Michael.
On 18.4.2014 18:18, Michael McCandless m...@mikemccandless.com wrote:

1.7u55 should be safe for ElasticSearch; we just put out a blog post about
this:

http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/

And I'll fix the nightly Lucene benchmarks to use u55 too! I should NOT
have been using u40: it's not safe.

Mike

http://blog.mikemccandless.com

On Fri, Apr 18, 2014 at 9:52 AM, Jason Wee peich...@gmail.com wrote:

will these two links help?

https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
http://people.apache.org/~mikemccand/lucenebench/indexing.html

lucene performance test is using java 1.70 u40. that's the same version
i'm using for lucene 4.6.0.

jason

On Fri, Apr 18, 2014 at 8:54 PM, Lukáš Vlček lukas.vl...@gmail.com wrote:

Hi,

is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is
it safe and recommended?

I found Robert and Uwe discussed this Java version here:
http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
I found couple of failed builds in
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16
that might be related to this version of Java but all seemed to be rather
Solr related.

Regards,
Lukas

Error installing ldap river plugin

2014-04-18 Thread Tom Wilson

I'm completely new to elasticsearch and am trying to put together a
proof-of-concept using LDAP as a data store.

However, I came across a problem right out of the starting gate, attempting
to install the ldap river plugin, according to the instructions here:

https://github.com/tlrx/elasticsearch-river-ldap

I got this output. What went wrong, and how do I fix it?

-tom

C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\bin>plugin -install tlrx/elasticsearch-river-ldap/0.0.2
-> Installing tlrx/elasticsearch-river-ldap/0.0.2...
Trying http://download.elasticsearch.org/tlrx/elasticsearch-river-ldap/elasticsearch-river-ldap-0.0.2.zip...
Trying http://search.maven.org/remotecontent?filepath=tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/v0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/master.zip...
Downloading DONE
Installed tlrx/elasticsearch-river-ldap/0.0.2 into C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\plugins\river-ldap
Usage:
    -u, --url     [plugin location]   : Set exact URL to download the plugin from
    -i, --install [plugin name]       : Downloads and installs listed plugins [*]
    -t, --timeout [duration]          : Timeout setting: 30s, 1m, 1h... (infinite by default)
    -r, --remove  [plugin name]       : Removes listed plugins
    -l, --list                        : List installed plugins
    -v, --verbose                     : Prints verbose messages
    -s, --silent                      : Run in silent mode
    -h, --help                        : Prints this help message

 [*] Plugin name could be:
     elasticsearch/plugin/version for official elasticsearch plugins (download from download.elasticsearch.org)
     groupId/artifactId/version   for community plugins (download from maven central or oss sonatype)
     username/repository          for site plugins (download from github master)

Message:
   Error while installing plugin, reason: IllegalArgumentException: Plugin installation assumed to be site plugin, but contains source code, aborting installation.

Re: searching most recent objects

2014-04-18 Thread Phil Greenberg

Oh, awesome, thank you so much for the help, I'll give that a try!

On Thursday, April 17, 2014 2:51:23 PM UTC-7, Itamar Syn-Hershko wrote:

 For recent X just sort on the _timestamp field and specify X as the page 
 size 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html
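
A minimal Python sketch of the request body this suggests: keep the original query, sort on _timestamp in descending order, and use X as the size. The helper name and the example query are hypothetical, and the sketch assumes _timestamp has been enabled in the type mapping, since per the linked page it is not enabled by default.

```python
import json


def most_recent_body(query, x):
    """Build a search body returning the x newest documents matching query."""
    return {
        "query": query,
        "sort": [{"_timestamp": {"order": "desc"}}],  # newest first
        "size": x,  # page size = number of recent objects wanted
    }


body = most_recent_body({"match": {"message": "error"}}, 100)
print(json.dumps(body, indent=2))
```

Because the sort is applied to all matching documents before the page is cut, the first page holds the X newest matches rather than an arbitrary X that were sorted afterwards.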

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Fri, Apr 18, 2014 at 12:43 AM, Phil Greenberg philip.n@gmail.com wrote:

 Thanks Itamar.

 So are you saying it's not possible to ask ES for the most recent X
 objects that match the given query? Only to say "give me the last 30 days
 of objects"?


 On Thursday, April 17, 2014 2:39:43 PM UTC-7, Itamar Syn-Hershko wrote:

 Filter (range filter on the date/time field) is exactly the way to do 
 this.

 Another possibility is using rolling indexes (e.g. an index per day, 
 like the logstash indexes are defined) but that obviously depends on a lot 
 of other business concerns and isn't really viable for most applications
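
Both options can be sketched in Python. The first helper builds a 1.x-style filtered query with a range filter using date math; the second lists daily index names in the logstash naming style. The helper names, the field name, and the example query are hypothetical, and the sketch only constructs the request pieces.

```python
from datetime import date, timedelta


def last_days_body(query, field, days):
    """Wrap query in a range filter that keeps only the last `days` days."""
    return {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"range": {field: {"gte": "now-%dd" % days}}},
            }
        }
    }


def daily_indices(prefix, today, days):
    """Names of the per-day indices covering the last `days` days, newest first."""
    return [
        prefix + (today - timedelta(days=n)).strftime("%Y.%m.%d")
        for n in range(days)
    ]


body = last_days_body({"match": {"message": "error"}}, "_timestamp", 30)
print(body["query"]["filtered"]["filter"])
print(daily_indices("logstash-", date(2014, 4, 18), 3))
```

With rolling indices the time restriction moves into the list of indices searched, which is why that approach depends so much on how the data is written in the first place.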

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer & Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Fri, Apr 18, 2014 at 12:36 AM, Phil Greenberg philip.n@gmail.com wrote:

 I am also facing the same issue.

 Right now, I am just doing a filter myself, but I would assume this is
 a common use case, and ES must have a way to deal with it?


 On Tuesday, April 15, 2014 6:24:52 PM UTC-7, Joris Bolsens wrote:

 I am using the JavaScript API and want a search to run over the most
 recent objects. That is, if I call a search with size 100, I want the
 most recent 100 objects returned to me. How would I go about doing that?

 I tried using sort, but it seems that it just sorts the results after
 the search has completed.

Cache cleaner in hot threads

2014-04-18 Thread Nikolas Everett

I'm still doing performance work and I keep seeing the CacheCleaner pop up
[1].  I don't know how much of an effect it's actually having, but I imagine
it's something.

It looks like entries in the cache get queued for deletion both by cache
clear commands and by readers closing.  Would it make sense to forgo
removing entries when the readers close and let the LRU policy clean it up?

Nik

[1]:
   100.4% (502.1ms out of 500ms) cpu usage by thread 'elasticsearch[elastic1001][generic][T#73]'
     2/10 snapshots sharing following 5 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.remove(LocalCache.java:4353)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)
     4/10 snapshots sharing following 8 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextInTable(LocalCache.java:4306)
       org.elasticsearch.common.cache.LocalCache$HashIterator.advance(LocalCache.java:4271)
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextEntry(LocalCache.java:4346)
       org.elasticsearch.common.cache.LocalCache$KeyIterator.next(LocalCache.java:4362)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:183)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)
     4/10 snapshots sharing following 7 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.advance(LocalCache.java:4271)
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextEntry(LocalCache.java:4346)
       org.elasticsearch.common.cache.LocalCache$KeyIterator.next(LocalCache.java:4362)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:183)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)

Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber

Well, the script runs against all documents matching the query, so you
can do a match_all query [1] to have the logic applied to all your
documents. This is going to be expensive though, so try to filter out as
many documents as possible before applying the custom scoring. Maybe even
perform a rescore [2] on the top X docs. It really all depends on your
requirements though. Run some tests and tune based on those results.

When I said to be careful, I meant: don't do a lot of blocking IO or
long-running calculations, as the script is run against each matching document.
Cache results and make the script return as quickly as possible.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-rescore.html
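
A rough Python sketch of the "rescore the top X docs" idea, again only building the request body. It assumes the 1.x rescore syntax (window_size plus a rescore_query with weights); the helper name, the field, and the click_boost script name are hypothetical.

```python
import json


def rescore_body(query, rescore_query, window_size):
    """Run a cheap query first, then rescore only the top window_size hits per shard."""
    return {
        "query": query,
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": rescore_query,
                "query_weight": 1.0,
                "rescore_query_weight": 1.0,
            },
        },
    }


expensive = {
    "function_score": {
        "query": {"match_all": {}},
        # Hypothetical native script doing the custom scoring.
        "script_score": {"script": "click_boost", "lang": "native"},
    }
}
body = rescore_body({"match": {"title": "lion"}}, expensive, 50)
print(json.dumps(body, indent=2))
```

The point of the design is that the expensive script then runs on at most window_size documents per shard instead of on every match, at the cost of never promoting documents that fall outside that window.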

Thanks,
Matt Weber

On Fri, Apr 18, 2014 at 9:46 AM, Srinivasan Ramaswamy ursva...@gmail.com wrote:

Also, can you elaborate a little bit on be careful you can significantly
impact your query performance if you are not careful. I would like to
understand the best practices there.

On Friday, April 18, 2014 8:14:54 AM UTC-7, Matt Weber wrote:

Yes, you can use the Function Score Query [1] in combination with a
native script written in java [2]. With the native script you can
basically do whatever you want, but be careful you can significantly impact
your query performance if you are not careful.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
[2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts

Thanks,
Matt Weber

On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy ursv...@gmail.com wrote:

I would like to influence the ranking with a few fields that are not
stored in the index (e.g. click data for keyword-documents). I have used a
custom SearchComponent in Solr to implement similar functionality in the
past. I am wondering how I can achieve the same in Elasticsearch.

I know this thread is a very old thread, but I didn't find much
information on how to do custom scoring (in Elasticsearch) with data that's
not stored in the index. This thread looked very relevant to my
requirement, so I am trying to see whether you guys have solved similar
requirements with Elasticsearch.

Thanks
Srini

On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:

Hi Otis,

So if I understand it correctly (granted, my knowledge is quite
limited here) you are asking if
1) it is possible to hook into the query processing flow and inject or
extend custom handlers for individual flow phases and
2) if we can find in ES the same functionality which is currently
provided by the components listed here:
http://wiki.apache.org/solr/SearchComponent (or here:
http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).

As for #1, frankly, I do not know. I have been playing with plugins a
bit but have not had a chance to explore their full potential yet. I
remember that Shay mentioned that not every aspect of ES is pluggable now,
but that is all I know about it (personally, I have not hit the limits
myself yet; maybe I would if I wanted to employ Carrot2 clustering or
something like that).

As for #2, if you are after a one-to-one comparison of Solr
SearchComponents and ES then I think we would find some matches and also
some misses. Still, it could be an interesting exercise to do (although we
should be careful to include only those features that work well in a
distributed environment). We could probably end up identifying new feature
requests, so this can be useful.

Regards,
Lukas

On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic otis.gos...@gmail.com
wrote:

Hi Lukas,

Yes, SearchComponents are about extensibility, but specifically about
extending how queries are handled within Solr once Solr gets them. I
know ES has other types of plugins, and you've listed several of them,
but I'm wondering about which of them is SearchComponent-like.
I've looked at
http://www.elasticsearch.org/guide/reference/modules/plugins.html,
but couldn't find the answer to my Q there. Maybe I'm looking in
the wrong place?

Thanks,
Otis
--
Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html

On Sep 6, 2:57 pm, Lukáš Vlček lukas.vl...@gmail.com wrote:
Hi,

Switching back to ConcurrentMergeScheduler

2014-04-18 Thread David Smith

I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 because
the change in 1.1.0 was affecting indexing performance.
https://github.com/elasticsearch/elasticsearch/issues/5817

We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a 
way to switch it back using the API? I tried the following command, but it 
seems to not take.

curl -i -XPUT localhost:9200/_cluster/settings -d '{
  "persistent": {
    "index.merge.scheduler.type": "org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider"
  }
}'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 52

{"acknowledged":true,"persistent":{},"transient":{}}


It does not seem to be set when I try to re-GET it (and no errors in logs 
at DEBUG level or above).

curl -i -XGET localhost:9200/_cluster/settings
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 66

{"persistent":{"threadpool":{"bulk":{"size":"8"}}},"transient":{}}


Am I specifying the scheduler the wrong way? I also tried just
specifying ConcurrentMergeSchedulerProvider instead of the full class name,
but that didn't work.

Any ideas?
David


Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith

Yes, function score query works with native scripts. We use it with them. 

I'm not sure whether native scripts are automatically cached.

On Saturday, April 12, 2014 1:49:32 PM UTC-4, Eric T wrote:

 Hi,

 The function score documentation doesn't mention any support for native 
 scripts, does it still work for the Function Score Query, if so is it the 
 same syntax? 
 I'm using the custom_filters_score query with a native script but the 
 query is deprecated in the latest ES version. I'm still using 0.90.3 but I 
 plan to upgrade to the latest version. 

 It says that the script_score function for function_score is cached. Does 
 this provide the same performance as the Native script? I'm wondering if 
 it's necessary to still use a native script or convert it to the 
 script_score function

 thanks
 Eric



Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith

You can use a function score query with a native script in this manner.
 

{
  "function_score" : {
    "query" : {
      "match_all" : { }
    },
    "functions" : [ {
      "filter" : {
        "terms" : {
          "myfield" : [ 103, 104, 134, 180 ],
          "_cache" : true
        }
      },
      "script_score" : {
        "script" : "myscriptname",
        "lang" : "native",
        "params" : {
          "myparam1" : "something",
          "myparam2" : "somethingElse"
        }
      }
    } ],
    "score_mode" : "sum"
  }
}


Query and Filter

2014-04-18 Thread Matt Hughes

Trying to compose a query and filter combination to no avail:

{
   "from":0,
   "size":200,
   "query":{
      "filtered":{
         "query":{
            "query_string":{
               "fields":[
                  "_all"
               ],
               "query":"\"Test message\""
            }
         },
         "filter":{
            "and":[
               {
                  "term":{
                     "appId":"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
                  }
               },
               {
                  "term":{
                     "processId":"754311ef-d807-4bb4-8c5e-1b480fb7034f"
                  }
               }
            ]
         }
      }
   }
}

ES parses that fine, but it never returns any results.  I know the two 
fields are correct and in my index.  If I take off the 'filter', I get the 
expected results, but I need the filter to narrow the results.  When I 
compose the same query using Kibana, it uses an 'fquery' filter, 
which I don't see documented anywhere:

"filter": {
  "bool": {
    "must": [
      {
        "terms": {
          "_type": [
            "event"
          ]
        }
      },
      {
        "fquery": {
          "query": {
            "query_string": {
              "query": "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
            }
          },
          "_cache": true
        }
      }
    ]
  }
}


Any pointers would be most appreciated.  Pulling my hair out here.


Re: Filter first then search

2014-04-18 Thread David Smith

I'm also curious to know if there is a way to do the opposite of 
FilteredQuery... basically a QueriedFilter: filter first and then run a 
query on the filtered results.


Re: Query and Filter

2014-04-18 Thread Matt Weber

Chances are your appId and processId fields are analyzed, so the analyzer
is breaking up the IDs.  Update the mapping of these fields so they are not
analyzed [1].  Also, you should not use an "and" filter to combine term
filters.  Use a bool filter [2] with must clauses for better performance.
Read why at
http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/.


[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
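
To make the reasoning concrete, here is a small sketch of why a term filter
can miss an analyzed ID. The tokenizer below is only a rough stand-in for the
standard analyzer (lowercase, split on non-alphanumerics), and the "event"
type name in the mapping is an assumption taken from the Kibana filter in the
question; treat the mapping and filter as illustrative 1.x-style bodies
rather than a verified fix.

```python
import json
import re


def approx_standard_analyzer(text):
    """Rough approximation: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


app_id = "a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"

# With an analyzed field, the index holds the pieces, not the whole ID.
tokens = approx_standard_analyzer(app_id)

# A term filter looks up the exact value, which is not one of the pieces.
term_filter_matches = app_id in tokens

# Mapping that keeps each ID as a single term (type name is an assumption).
mapping = {
    "event": {
        "properties": {
            "appId": {"type": "string", "index": "not_analyzed"},
            "processId": {"type": "string", "index": "not_analyzed"},
        }
    }
}

# bool/must filter in place of the "and" filter.
bool_filter = {
    "bool": {
        "must": [
            {"term": {"appId": app_id}},
            {"term": {"processId": "754311ef-d807-4bb4-8c5e-1b480fb7034f"}},
        ]
    }
}

print(tokens)
print(term_filter_matches)
print(json.dumps(mapping, indent=2))
print(json.dumps(bool_filter, indent=2))
```

A mapping change of this kind only affects documents indexed after it, so
existing documents would need to be reindexed before the filter can match.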

Thanks,
Matt Weber




ANN Elastisch 2.0.0-beta4 is released

2014-04-18 Thread Michael Klishin

Elastisch [1] is a small, feature complete Clojure client for ElasticSearch.

Release notes:
http://blog.clojurewerkz.org/blog/2014/04/11/elastisch-2-dot-0-0-beta4-is-released/

1. http://clojureelasticsearch.info
-- 
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin


Continuous async replication

2014-04-18 Thread Mohit Anchlia

As I understand there is currently no feature that does async replication
between 2 clusters or even within the same cluster, but we have a need to
write one. What would be the best way to do it in elasticsearch? I was
thinking of leveraging Scroll for this.
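
One possible shape for the Scroll idea, sketched under assumptions: each page
of scroll hits is turned into a bulk request body for the target cluster. The
helper name and the sample hits are hypothetical, and the HTTP calls are left
out on purpose. Note that a scroll is a point-in-time view of the index, so
this copies a snapshot; picking up later changes would need something extra,
for example a last-modified field to query on.

```python
import json


def hits_to_bulk(hits, target_index):
    """Convert one page of scroll hits into a newline-delimited bulk body."""
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],  # same ID, so a re-run overwrites
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    # The bulk API expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical page of hits, shaped like a search response's hits.hits array.
page = [
    {"_index": "src", "_type": "event", "_id": "1", "_source": {"msg": "a"}},
    {"_index": "src", "_type": "event", "_id": "2", "_source": {"msg": "b"}},
]
bulk_body = hits_to_bulk(page, "replica")
print(bulk_body)
```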


Testing for an Empty String

2014-04-18 Thread Paul

Hi,

Thanks for everyone's patience while I learn the elasticsearch query DSL. 
 I'm trying to get used to its verbosity.


How would I do a query like this, again in SQL parlance:  select col1 from 
mysource where col2 = ""?


Splunk vs. Elastic search performance?

2014-04-18 Thread Frank Flynn

We have a large Splunk instance.  We load about 1.25 TB of logs a day.  We 
have about 1,300 loaders (servers that collect and load logs - they may do 
other things too).

As I look at Elasticsearch / Logstash / Kibana does anyone know of a 
performance comparison guide?  Should I expect to run on very similar 
hardware?  More? or Less?

Sure it depends on exactly what we're doing, the exact queries and the 
frequency we'd run them but I'm trying to get any kind of idea before we 
start.

Are there any white papers or other documents about switching?  It seems an 
obvious choice, but I can find very few performance comparisons (I 
did see that Elasticsearch just hired the former VP of Products at Splunk, 
Gaurav Gupta, but there were few numbers in that article either).

Thanks,
Frank


Re: Query and Filter

2014-04-18 Thread Matt Hughes

Thanks for the quick reply!

I updated the mappings and confirmed both fields read not_analyzed.  I also 
updated the query to use bool/must:

{
   "from":0,
   "size":200,
   "query":{
      "filtered":{
         "query":{
            "query_string":{
               "fields":[
                  "_all"
               ],
               "query":"\"Test message from AT by user admin was generated\""
            }
         },
         "filter":{
            "bool":{
               "must":[
                  {
                     "term":{
                        "where.appId":"12229ac6-8e9a-43ff-ab67-e80f3c585a69"
                     }
                  },
                  {
                     "term":{
                        "where.processId":"bd13dbe5-0a4c-4469-a645-44cb3fde280a"
                     }
                  }
               ]
            }
         }
      }
   }
}

Still not getting any hits though.  Tried escaping the terms.  Is there 
anything special about having nested field names like that 
'where.processId'?


Testing for an Empty String With the Following

2014-04-18 Thread Paul

Hi,

Thanks for everyone's patience while I learn the elasticsearch query DSL. 
 I'm trying to get used to its verbosity.


How would I do a query like this, again in SQL parlance:  select col1 from 
mysource where col2 = "" and col3 in ["", "one", "two"] and col4 = "foo"
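
One way this could translate, as a sketch rather than a confirmed answer:
each equality becomes a term filter, the IN list becomes a terms filter, and
everything is combined in a bool/must filter inside a 1.x-style filtered
query. It assumes col2, col3 and col4 are mapped as not_analyzed strings (so
that the empty string is indexed as a term at all, which is worth verifying
on your version) and that "mysource" is the index being searched.

```python
import json


def sql_like_search(select_fields, equals, any_of):
    """Build a search body for: SELECT fields WHERE a = x AND b IN [...]."""
    must = [{"term": {field: value}} for field, value in sorted(equals.items())]
    must += [{"terms": {field: values}} for field, values in sorted(any_of.items())]
    return {
        "fields": select_fields,  # the "select col1" part
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": {"must": must}},
            }
        },
    }


body = sql_like_search(
    ["col1"],
    {"col2": "", "col4": "foo"},
    {"col3": ["", "one", "two"]},
)
# Would be sent as the body of a search against the (assumed) index "mysource".
print(json.dumps(body, indent=2))
```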


Re: Splunk vs. Elastic search performance?

2014-04-18 Thread Mark Walkom

That's a lot of data! I don't know of any installations that big but
someone else might.

What sort of infrastructure are you running splunk on now, what's your
current and expected retention?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


LDAP plugin not populating

2014-04-18 Thread Tom Wilson

I'm trying to set up search of LDAP objects  using the ldap river plugin. I 
managed to install the plugin and set up my new river, but all searches are 
coming up empty. The elasticsearch stdout says:

[2014-04-18 15:00:16,904][INFO ][river.ldap   ] [Silver Scorpion] [ldap][hpd] now, ldap river null waiting for 1m ms

Why is my ldap river null? Maybe someone can look at this and tell me 
what I'm doing wrong.

I am trying to index one LDAP object (objectClass=HCProfessional), which 
resides in the container ou=HCProfessional,o=testhie,dc=hpdtest

I included a list of basic attributes, and am authenticating using the 
default admin account. Here is the REST payload I sent the server:

PUT http://localhost:9200/_river/hpd/_meta
{
    "type" : "ldap",
    "ldap" : {
        "host" : "localhost",
        "port" : 10389,
        "ssl"  : false,
        "userDn" : "uid=admin,ou=users,ou=system",
        "credentials" : "secret",
        "baseDn" : "ou=HCProfessional,o=testhie,dc=hpdtest",
        "filter" : "(objectClass=HCProfessional)",
        "scope" : "subtree",
        "attributes" : [
            "uid",
            "sn",
            "cn",
            "description",
            "facsimileTelephoneNumber",
            "gender",
            "givenName",
            "hcSpecialization",
            "hpdMedicalRecordsDeliveryEmail",
            "hpdProviderLanguageSupported",
            "hpdProviderMailingAddress",
            "mail",
            "telephoneNumber"
        ],
        "fields" : [
            "_id",
            "sn",
            "cn",
            "description",
            "facsimileTelephoneNumber",
            "gender",
            "givenName",
            "hcSpecialization",
            "hpdMedicalRecordsDeliveryEmail",
            "hpdProviderLanguageSupported",
            "hpdProviderMailingAddress",
            "mail",
            "telephoneNumber"
        ],
        "poll" : 6
    },
    "index" : {
        "index" : "hpd",
        "type" : "HCProfessional"
    }
}


Now, when I send what I think is a simple search command:

GET http://localhost:9200/hpd/_search

I get back this:


{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 0,
      "max_score": null,
      "hits": []
   }
}


Re: ELK stack needs tuning

2014-04-18 Thread Mark Walkom

If you want unlimited retention you're going to have to keep adding more
nodes to the cluster to deal with it.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 17 April 2014 22:48, R. Toma renzo.t...@gmail.com wrote:

 Hi Mark,

 Thank you for your comments.

 Regarding the monitoring. We use the Diamond ES collector which saves
 metrics every 30 seconds in Graphite. ElasticHQ is nice, but does
 diagnostics calculations for the whole runtime of the cluster instead of
 last X minutes. It does have nice diagnostics rules, so I created Graphite
 dashboards for them. Marvel is surely nice, but with exception of Sense it
 does not offer me anything I do not already have with Graphite.

 New finds:
 * Setting index.codec.bloom.load=false on yesterday's/older indices frees
 up memory from the fielddata pool. This stays released even when searching.
 * Closing older indices speeds up indexing & refreshing.

 Regarding the closing benefit: the impact on refreshing is great! But from
 a functional point of view it's bad. I know about the 'overhead per index',
 but cannot find a solution to this.

 Does anyone know how to get an ELK stack with unlimited retention?

 Regards,
 Renzo



 Op woensdag 16 april 2014 11:15:32 UTC+2 schreef Mark Walkom:

 Well, once you go over 31-32GB of heap you lose pointer compression, which
 can actually slow you down. You might be better off reducing that and
 running multiple instances per physical machine.

 Since 0.90.4 or so, compression is on by default, so no need to specify
 that. You might also want to change shards to a multiple of your nodes,
 e.g. 3, 6, 9, for more even allocation.
 Also try moving to Java 1.7u25, as that is the generally agreed version to
 run. We run u51 with no issues though, so that might be worth trialling if
 you can.

 Finally, what are you using to monitor the actual cluster? Something like
 ElasticHQ or Marvel will probably provide greater insights into what is
 happening and what you can do to improve performance.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 16 April 2014 19:06, R. Toma renzo...@gmail.com wrote:

 Hi all,

 At bol.com we use ELK for a logsearch platform, using 3 machines.

 We need fast indexing (to not lose events) and want fast & near
 realtime search. The search is currently not fast enough. Simple "give me
 the last 50 events from the last 15 minutes, from any type, from today's
 indices, without any terms" search queries may take 1.0 sec, sometimes even
 passing 30 seconds.

 It currently adds 3k docs per second, but we expect 8k/sec by the end of
 this year.

 I have included lots of specs/config at bottom of this e-mail.


 We found 2 reliable knobs to turn:

    1. index.refresh_interval. At 1 sec, fast search seems impossible.
       When upping the refresh to 5 sec, search gets faster. At 10 sec it's
       even faster. But when you search during the refresh (wouldn't a splay
       be nice?) it's slow again. And a refresh every 10 seconds is not near
       realtime anymore. No obvious bottlenecks present: CPU, network,
       memory, disk I/O all OK.
    2. Deleting old indices. No clue why this improves things. And we
       really do not want to delete old data, since we want to keep at least
       60 days of data online. But after deleting old data the search speed
       slowly crawls back up again...
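
 For reference, both settings discussed in this thread (refresh_interval and
 index.codec.bloom.load) can be changed per index through the update settings
 API. The sketch below only builds the request paths and bodies; the
 "logstash-" daily index prefix and the helper names are assumptions, not
 details taken from this cluster.

```python
from datetime import date, timedelta


def daily_index(day, prefix="logstash-"):
    """Name of the daily index for a given date (prefix is an assumption)."""
    return prefix + day.strftime("%Y.%m.%d")


def settings_updates(today, refresh_interval="5s"):
    """Return (path, body) pairs for PUT requests to the update settings API."""
    yesterday = today - timedelta(days=1)
    return [
        # Slower refresh on the index that is being written to today.
        ("/" + daily_index(today) + "/_settings",
         {"index": {"refresh_interval": refresh_interval}}),
        # Stop loading bloom filters for yesterday's index.
        ("/" + daily_index(yesterday) + "/_settings",
         {"index": {"codec": {"bloom": {"load": False}}}}),
    ]


updates = settings_updates(date(2014, 4, 16))
for path, body in updates:
    print("PUT", path, body)
```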


 We have zillions of metrics (measure everything) of OS, ES and JVM
 using Diamond and Graphite. Too much to include here.
 We use a Nagios check that simulates Kibana queries to monitor the search
 speed every 5 minutes.


 When comparing behaviour at refresh_interval 1s vs 5s we see:

    - system% CPU load: depends per server: 150 vs 80, 100 vs 50, 40 vs 25
      ==> lower
    - ParNew GC run frequency: 1 vs 0.6 (per second) ==> less
    - CMS GC run frequency: 1 vs 4 (per hour) ==> more
    - avg index time: 8 vs 2.5 (ms) ==> lower
    - refresh frequency: 22 vs 12 (per second) -- still high numbers at
      5 sec because we have 17 active indices every day ==> less
    - merge frequency: 12 vs 7 (per second) ==> less
    - flush frequency: no difference
    - search speed: at 1s way too slow, at 5s (in tests timed between
      the refresh bursts) search calls take ~50ms.


 We already looked at the threadpools:

- we increased the bulk pool
- we currently do not have any rejects in any pools
- only pool that has queueing (a spike per 1 or 2 hours) is the
'management' pool (but thats probably Diamond)


 We have a feeling something blocks/locks upon high index and high search
 frequency. But what? I have looked at nearly all metrics and _cat output.


 Our current list of untested/wild ideas:

- Is the index.codec.bloom.load=false on yesterday's indices really
the magic bullet? We haven't tried it.
- Adding a 2nd JVM per machine is an option, but as long as we do
not know the real

Re: Error installing ldap river plugin

2014-04-18 Thread Tom Wilson

I was able to install the plugin by building it from source locally and
specifying the JAR file.

-tom

On Friday, April 18, 2014 10:50:54 AM UTC-7, Tom Wilson wrote:

I'm completely new to elasticsearch and am trying to put together a
proof-of-concept using LDAP as a data store.

However, I came across a problem right out of the starting gate,
attempting to install the ldap river plugin, according to the instructions
here:

https://github.com/tlrx/elasticsearch-river-ldap

I got this output. What went wrong, and how do I fix it?

-tom

C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\bin>plugin -install tlrx/elasticsearch-river-ldap/0.0.2
-> Installing tlrx/elasticsearch-river-ldap/0.0.2...
Trying http://download.elasticsearch.org/tlrx/elasticsearch-river-ldap/elasticsearch-river-ldap-0.0.2.zip...
Trying http://search.maven.org/remotecontent?filepath=tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/v0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/master.zip...
Downloading DONE
Installed tlrx/elasticsearch-river-ldap/0.0.2 into C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\plugins\river-ldap
Usage:
    -u, --url [plugin location] : Set exact URL to download the plugin from
    -i, --install [plugin name] : Downloads and installs listed plugins [*]
    -t, --timeout [duration]    : Timeout setting: 30s, 1m, 1h... (infinite by default)
    -r, --remove [plugin name]  : Removes listed plugins
    -l, --list                  : List installed plugins
    -v, --verbose               : Prints verbose messages
    -s, --silent                : Run in silent mode
    -h, --help                  : Prints this help message

Message:
    Error while installing plugin, reason: IllegalArgumentException: Plugin installation assumed to be site plugin, but contains source code, aborting installation.


Re: Query and Filter

2014-04-18 Thread Matt Weber

Did you reindex your docs after updating the mapping?  Can you post your
mapping and original docs?



  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to 
 elasticsearch+unsubscr...@googlegroups.comjavascript:_e(%7B%7D,'cvml','elasticsearch%2bunsubscr...@googlegroups.com');
 .
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/58feafb8-1110-4630-8cbd-ebfd5fef0809%40googlegroups.comhttps://groups.google.com/d/msgid/elasticsearch/58feafb8-1110-4630-8cbd-ebfd5fef0809%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.



-- 
Thanks,
Matt Weber


Re: Query and Filter

2014-04-18 Thread Matt Hughes

Never mind.  It was an error on my part; these changes worked.  Thanks again!
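
For anyone landing on this thread later, here is a minimal sketch of the request body that ended up working, built in Python so the quoting is unambiguous. It assumes the 1.x-era "filtered" query used throughout this thread; the helper names (term_filter, filtered_query) are made up for illustration and are not part of any client library.

```python
import json


def term_filter(field, value):
    """One exact-match filter clause; the field should be not_analyzed."""
    return {"term": {field: value}}


def filtered_query(text, term_filters, size=200):
    """Full-text query_string on _all, narrowed by a bool filter whose
    must clauses all have to match."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "filtered": {
                "query": {"query_string": {"fields": ["_all"], "query": text}},
                "filter": {"bool": {"must": list(term_filters)}},
            }
        },
    }


body = filtered_query(
    '"Test message from AT by user admin was generated"',
    [
        term_filter("where.appId", "12229ac6-8e9a-43ff-ab67-e80f3c585a69"),
        term_filter("where.processId", "bd13dbe5-0a4c-4469-a645-44cb3fde280a"),
    ],
)

if __name__ == "__main__":
    # json.dumps takes care of escaping the inner double quotes of the phrase.
    print(json.dumps(body, indent=2))
```

Building the body as a dict and serializing it avoids hand-escaping the quoted phrase inside the query string.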

On Friday, April 18, 2014 5:51:31 PM UTC-4, Matt Hughes wrote:

 Thanks for the quick reply!

 I updated the mappings and confirmed both types read not_analyzed.   I 
 also updated the query to use bool/must:

 {
   "from": 0,
   "size": 200,
   "query": {
     "filtered": {
       "query": {
         "query_string": {
           "fields": ["_all"],
           "query": "\"Test message from AT by user admin was generated\""
         }
       },
       "filter": {
         "bool": {
           "must": [
             { "term": { "where.appId": "12229ac6-8e9a-43ff-ab67-e80f3c585a69" } },
             { "term": { "where.processId": "bd13dbe5-0a4c-4469-a645-44cb3fde280a" } }
           ]
         }
       }
     }
   }
 }

 Still not getting any hits though.  Tried escaping the terms.  Is there 
 anything special about having nested field names like that 
 'where.processId'?

 On Friday, April 18, 2014 4:07:31 PM UTC-4, Matt Weber wrote:

 Chances are your appId and processId fields are analyzed, so the analyzer is
 breaking up the IDs.  Update your mapping of these fields so they are not
 analyzed [1].  Also, you should not use an "and" filter to combine term
 filters.  Use a bool filter [2] with must clauses for better
 performance.  Read why at
 http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/


 [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
 [2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html

 Thanks,
 Matt Weber



 On Fri, Apr 18, 2014 at 12:52 PM, Matt Hughes hughe...@gmail.com wrote:

 Trying to compose a query and filter combination to no avail:

 {
   "from": 0,
   "size": 200,
   "query": {
     "filtered": {
       "query": {
         "query_string": {
           "fields": ["_all"],
           "query": "\"Test message\""
         }
       },
       "filter": {
         "and": [
           { "term": { "appId": "a32b782c-3c51-4d76-9b01-c4c1ffe53d8b" } },
           { "term": { "processId": "754311ef-d807-4bb4-8c5e-1b480fb7034f" } }
         ]
       }
     }
   }
 }

 ES parses that fine, but it never returns any results.  I know the two
 fields are correct and in my index.  If I take off the 'filter', I get the
 expected results, but I need the filter to narrow the results.  When I
 compose the same query using Kibana, it uses an 'fquery' filter,
 which I don't see documented anywhere:

 "filter": {
   "bool": {
     "must": [
       {
         "terms": {
           "_type": ["event"]
         }
       },
       {
         "fquery": {
           "query": {
             "query_string": {
               "query": "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
             }
           },
           "_cache": true
         }
       }
     ]
   }
 }


 Any pointers would be most appreciated.  Pulling my hair out here.






Re: elasticsearch 1.1.1 initialization failed

2014-04-18 Thread Eric Jain

This issue has been resolved with cloud-aws 2.1.1:

  https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/74


On Thursday, April 17, 2014 6:32:05 PM UTC-7, Eric Jain wrote:

 Just tried to upgrade elasticsearch 1.1.0 to 1.1.1 (with the cloud-aws 
 plugin 2.1.0), and am no longer able to start any nodes:

 2014-04-18 01:19:42,754 [INFO] node - [Skywalker] version[1.1.1], pid[22901], build[f1585f0/2014-04-16T14:27:12Z]
 2014-04-18 01:19:42,767 [INFO] node - [Skywalker] initializing ...
 2014-04-18 01:19:42,802 [INFO] plugins - [Skywalker] loaded [cloud-aws], sites []
 2014-04-18 01:19:50,019 [ERROR] bootstrap - {1.1.1}: Initialization Failed ...
 1) NoSuchMethodError[org.elasticsearch.gateway.blobstore.BlobStoreGateway.<init>(Lorg/elasticsearch/common/settings/Settings;Lorg/elasticsearch/threadpool/ThreadPool;Lorg/elasticsearch/cluster/ClusterService;)V]

 Anyone else see this issue?




Problem of Term Suggester

2014-04-18 Thread le trung Trung

I have a problem with the term suggester and I don't understand what is
happening. Could someone please help explain it?

I have 3 documents: [doc1: {content: "Anh yêu ta"}, doc2: {content: "Anh
yêu ta"}, doc3: "Anh yêu tí"] (content was indexed with vi_analyzer).

I am using the term suggester as follows:
SuggestionBuilder text = SuggestBuilder
.termSuggestion(Suggestion.DEFAULT_NAME).field("c").text("tí").minWordLength(2).size(1).suggestMode("missing");

The result I received from the term suggester is: {text: "ta", freq: 2,
score: 0.5}

=> Why is the suggested term "ta"? I expected no suggestion to be
returned. Please help me understand what is wrong and how to fix
it. Thanks, everyone!


This is the vi_analyzer config:
index:
  analysis:
    analyzer:
      vi_analyzer:
        type: custom
        tokenizer: whitespace
        filter: [trim, lowercase, hunspell_vi]
    filter:
      hunspell_vi:
        type: hunspell
        locale: vi_VN
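
To make the setup easier to reproduce, here is a rough sketch of the same suggestion expressed as a REST "suggest" section, plus a small edit-distance check. The suggestion name "content-suggestion" is made up for illustration, and the field name "c" is copied as-is from the Java call above. The distance function is a plain Levenshtein implementation, not the suggester's own scoring; it only shows that "ta" is one edit away from "tí", so it is a plausible candidate. Whether "tí" counts as missing depends on which tokens the hunspell filter actually put in the index, which this sketch does not model.

```python
import json
import unicodedata


def edit_distance(a, b):
    """Plain Levenshtein distance over NFC-normalized code points."""
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# REST equivalent of the Java builder call; "content-suggestion" is a
# hypothetical name, the other keys mirror the builder settings.
body = {
    "suggest": {
        "content-suggestion": {
            "text": "tí",
            "term": {
                "field": "c",
                "min_word_length": 2,
                "size": 1,
                "suggest_mode": "missing",
            },
        }
    }
}

if __name__ == "__main__":
    print(edit_distance("tí", "ta"))
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Running the analyze API on "tí" with vi_analyzer would show which token was really indexed for doc3.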




Re: Splunk vs. Elastic search performance?

2014-04-18 Thread Greg Murnane

I'm running elasticsearch at a much smaller scale than this, but with a
PowerEdge R900 with 2 X7350 CPUs and 64 GB of RAM (24GB heap for
elasticsearch) I'm able to sustain something like 80GB per day (1/16 of your
volume). Some of the latest Intel CPUs are about 4 times as powerful as the
X7350, so extrapolating from my results, with very new hardware you can
probably do 1.25TB per day on around 5 nodes with 2 CPUs, 256GB RAM, and 8
disks each. I haven't had an opportunity to test this yet, and even if it is
possible, you should probably have more nodes than this: hardware
failure, growth, or a sudden increase in logging volume from a problem can
take down a cluster that's running at full capacity all the time.

I'd encourage you to put elasticsearch on some of your systems to generate
some benchmarks. I've never tried clustering elasticsearch with more than 5
hosts. At 1300 systems, each would be doing around 15 KB/s, which is
essentially trivial. You might try taking splunk off 2 dozen systems or so,
and committing them to elasticsearch, then see how well they keep up with
the load you're generating. Data from your particular setup will almost
always be the best sort to have.
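
The extrapolation above can be written out as a back-of-the-envelope calculation. This is only a sketch under the assumptions stated in this message (80GB per day on the old box, a 4x CPU speedup, 16x the volume, 1300 source systems) plus one more that is mine: that throughput scales linearly with CPU power. Disk and memory limits are ignored, and decimal units are used throughout.

```python
import math

GB = 10**9                   # decimal gigabyte, an assumption about units
SECONDS_PER_DAY = 86400

observed_gb_per_day = 80     # sustained on the single older server
volume_ratio = 16            # the target volume is 16x the observed one
cpu_speedup = 4              # newer CPUs assumed ~4x as powerful
source_systems = 1300

# Target daily volume implied by the 1/16 ratio.
target_gb_per_day = observed_gb_per_day * volume_ratio

# Assumed per-node capacity on new hardware (linear scaling with CPU).
per_node_gb_per_day = observed_gb_per_day * cpu_speedup

# Bare minimum node count with no headroom at all.
min_nodes = math.ceil(target_gb_per_day / per_node_gb_per_day)

# Average log rate each source system would have to produce.
per_system_kb_per_s = target_gb_per_day * GB / source_systems / SECONDS_PER_DAY / 1000

if __name__ == "__main__":
    print("target GB/day:", target_gb_per_day)
    print("minimum nodes without headroom:", min_nodes)
    print("per-system KB/s: %.1f" % per_system_kb_per_s)
```

The bare minimum comes out a little below the 5 nodes suggested above, which is consistent with leaving some headroom, and the per-system rate lands in the same small range as the rough figure quoted in the next paragraph.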


Re: Splunk vs. Elastic search performance?

2014-04-18 Thread 熊贻青

We have a cluster with 10 nodes and a 48 GB heap for each ES process. The total
indexing rate is about 25,000 docs per second, with about 20 indices actively
receiving new data. I'm really curious to compare and evaluate the
indexing performance numbers.

Thanks!
