Re: Splunk vs. Elastic search performance?

2014-04-18 Thread 熊贻青
We have a cluster with 10 nodes, with a 48 GB heap for each ES process. The total
indexing rate is about 25,000 docs per second, with about 20 indices actively
receiving new data. I'm really curious to compare and evaluate the
indexing performance numbers.

Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAP0hgQ34ZwY6Or0PUFZn_Ciu_iyZZJjyXfz%3DNBu64Ge9uN3hxQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Splunk vs. Elastic search performance?

2014-04-18 Thread Greg Murnane
I'm running elasticsearch at a much smaller scale than this, but with a 
PowerEdge R900 with 2 X7350 CPUs and 64 GB of RAM (24GB heap for 
elasticsearch) I'm able to sustain something like 80GB per day (1/16 your 
volume). Some of the latest Intel CPUs are about 4 times as powerful as the 
X7350, so extrapolating from my results, with very new hardware you can 
probably do 1.25TB per day on around 5 nodes with 2 CPUs, 256GB RAM, and 8 
disks each. I haven't had an opportunity to test this yet, and even if this 
is possible, you should probably have more nodes than this; hardware 
failure, growth, or a sudden increase in logging volume from a problem can 
take down a cluster that's running at full capacity all the time.

I'd encourage you to put elasticsearch on some of your systems to generate 
some benchmarks. I've never tried clustering elasticsearch with more than 5 
hosts. At 1300 systems, each would be doing around 15 KB/s, which is 
essentially trivial. You might try taking splunk off 2 dozen systems or so, 
and committing them to elasticsearch, then see how well they keep up with 
the load you're generating. Data from your particular setup will almost 
always be the best sort to have.
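
A rough sanity check of the figures in this message can be sketched as follows. It assumes decimal units (1 TB = 10^12 bytes) and a perfectly even load across all systems, neither of which is stated in the thread. Under those assumptions the volume ratio comes out near 16 and the average per-system rate near 11 KB/s, the same order of magnitude as the estimate above.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# Assumption: decimal units (1 TB = 10**12 bytes) and an even load per system.

SECONDS_PER_DAY = 24 * 60 * 60


def per_source_rate_kb_s(total_bytes_per_day: float, sources: int) -> float:
    """Average ingest rate per source system, in KB/s."""
    return total_bytes_per_day / SECONDS_PER_DAY / sources / 1_000


daily_volume = 1.25e12     # 1.25 TB/day, from the original question
loaders = 1300             # systems sending logs
single_node_daily = 80e9   # 80 GB/day sustained on the single R900

volume_ratio = daily_volume / single_node_daily
rate = per_source_rate_kb_s(daily_volume, loaders)

print(f"target volume is {volume_ratio:.1f}x the single-node volume")
print(f"average per-system rate: {rate:.1f} KB/s")
```

Real log traffic is bursty, so peak rates per system will be well above this average.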




Problem of Term Suggester

2014-04-18 Thread le trung Trung
I have a problem with the term suggester and don't understand what is 
happening. Could anyone help explain it?

I have 3 documents: [doc1:{content: "Anh yêu ta"}, doc2:{content: "Anh 
yêu ta"}, doc3:{content: "Anh yêu tí"}] (content was indexed with vi_analyzer)

I am using the term suggester as follows: SuggestionBuilder text = SuggestBuilder
.termSuggestion(Suggestion.DEFAULT_NAME).field("c").text("tí").minWordLength(2).size(1).suggestMode("missing");

The result I received from the term suggester is: {text: "ta" , freq:2 , 
score:0.5} 

=> Why is the suggestion "ta"? I expected no suggestion to be returned. 
Please help me understand what is wrong and how to fix it. Thanks, 
everyone!


This is the vi_analyzer config:
index:
  analysis:
    analyzer:
      vi_analyzer:
        type: custom
        tokenizer: whitespace
        filter: [trim, lowercase, hunspell_vi]
    filter:
      hunspell_vi:
        type: hunspell
        locale: vi_VN





Re: elasticsearch 1.1.1 initialization failed

2014-04-18 Thread Eric Jain
This issue has been resolved with cloud-aws 2.1.1:

  https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/74


On Thursday, April 17, 2014 6:32:05 PM UTC-7, Eric Jain wrote:
>
> Just tried to upgrade elasticsearch 1.1.0 to 1.1.1 (with the cloud-aws 
> plugin 2.1.0), and am no longer able to start any nodes:
>
> 2014-04-18 01:19:42,754 [INFO] node - [Skywalker] version[1.1.1], 
> pid[22901], build[f1585f0/2014-04-16T14:27:12Z]
> 2014-04-18 01:19:42,767 [INFO] node - [Skywalker] initializing ...
> 2014-04-18 01:19:42,802 [INFO] plugins - [Skywalker] loaded [cloud-aws], 
> sites []
> 2014-04-18 01:19:50,019 [ERROR] bootstrap - {1.1.1}: Initialization Failed 
> ...
> 1) 
> NoSuchMethodError[org.elasticsearch.gateway.blobstore.BlobStoreGateway.<init>(Lorg/elasticsearch/common/settings/Settings;Lorg/elasticsearch/threadpool/ThreadPool;Lorg/elasticsearch/cluster/ClusterService;)V]
>
> Anyone else see this issue?
>
>



Re: Query and Filter

2014-04-18 Thread Matt Hughes
Nevermind.  It was an error on my part; these changes worked.  Thanks again!



Re: Query and Filter

2014-04-18 Thread Matt Weber
Did you reindex your docs after updating the mapping?  Can you post your
mapping and original docs?
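
For reference, a minimal sketch of what a not_analyzed mapping for the two ID fields in this thread might look like, written as a Python dict that serializes to the JSON mapping body. The type name "event" is an assumption borrowed from the Kibana filter quoted in this thread; the "where" object and its field names come from the query. Documents indexed before the mapping change keep the terms produced at index time, which is why reindexing matters.

```python
import json

# Assumption: the documents are of type "event". The "where" object and the
# two ID fields are taken from the query discussed in this thread.
mapping = {
    "event": {
        "properties": {
            "where": {
                "properties": {
                    "appId": {"type": "string", "index": "not_analyzed"},
                    "processId": {"type": "string", "index": "not_analyzed"},
                }
            }
        }
    }
}


def not_analyzed_paths(properties, prefix=""):
    """Return dotted field paths whose mapping sets index to not_analyzed."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse into its sub-fields.
            paths += not_analyzed_paths(spec["properties"], path + ".")
        elif spec.get("index") == "not_analyzed":
            paths.append(path)
    return paths


print(json.dumps(mapping, indent=2))
print(not_analyzed_paths(mapping["event"]["properties"]))
```

The helper only walks the dict to show the dotted paths ("where.appId", "where.processId") that a term filter would address; it does not talk to a cluster.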



-- 
Thanks,
Matt Weber


Re: Error installing ldap river plugin

2014-04-18 Thread Tom Wilson
I was able to install the plugin by building it from source locally and 
specifying the JAR file.

-tom


On Friday, April 18, 2014 10:50:54 AM UTC-7, Tom Wilson wrote:
>
> I'm completely new to elasticsearch and am trying to put together a 
> proof-of-concept using LDAP as a data store.
>
> However, I came across a problem right out of the starting gate, 
> attempting to install the ldap river plugin, according to the instructions 
> here:
>
> https://github.com/tlrx/elasticsearch-river-ldap
>
> I got this output. What went wrong, and how do I fix it?
>
> -tom
>
> C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\bin>plugin -install tlrx/elasticsearch-river-ldap/0.0.2
> -> Installing tlrx/elasticsearch-river-ldap/0.0.2...
> Trying http://download.elasticsearch.org/tlrx/elasticsearch-river-ldap/elasticsearch-river-ldap-0.0.2.zip...
> Trying http://search.maven.org/remotecontent?filepath=tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
> Trying https://oss.sonatype.org/service/local/repositories/releases/content/tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
> Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/v0.0.2.zip...
> Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/master.zip...
> Downloading DONE
> Installed tlrx/elasticsearch-river-ldap/0.0.2 into C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\plugins\river-ldap
> Usage:
>     -u, --url     [plugin location]   : Set exact URL to download the plugin from
>     -i, --install [plugin name]       : Downloads and installs listed plugins [*]
>     -t, --timeout [duration]          : Timeout setting: 30s, 1m, 1h... (infinite by default)
>     -r, --remove  [plugin name]       : Removes listed plugins
>     -l, --list                        : List installed plugins
>     -v, --verbose                     : Prints verbose messages
>     -s, --silent                      : Run in silent mode
>     -h, --help                        : Prints this help message
>
>  [*] Plugin name could be:
>      elasticsearch/plugin/version for official elasticsearch plugins (download from download.elasticsearch.org)
>      groupId/artifactId/version   for community plugins (download from maven central or oss sonatype)
>      username/repository          for site plugins (download from github master)
>
> Message:
>    Error while installing plugin, reason: IllegalArgumentException: Plugin installation assumed to be site plugin, but contains source code, aborting installation.
>
>
>
>



Re: ELK stack needs tuning

2014-04-18 Thread Mark Walkom
If you want unlimited retention you're going to have to keep adding more
nodes to the cluster to deal with it.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 17 April 2014 22:48, R. Toma  wrote:

> Hi Mark,
>
> Thank you for your comments.
>
> Regarding the monitoring. We use the Diamond ES collector which saves
> metrics every 30 seconds in Graphite. ElasticHQ is nice, but does its
> diagnostics calculations for the whole runtime of the cluster instead of
> the last X minutes. It does have nice diagnostics rules, so I created
> Graphite dashboards for them. Marvel is surely nice, but with the exception
> of Sense it does not offer me anything I do not already have with Graphite.
>
> New finds:
> * Setting index.codec.bloom.load=false on yesterday's/older indices frees
> up memory from the fielddata pool. This stays released even when searching.
> * Closing older indices speeds up indexing & refreshing.
>
> Regarding the closing benefit. The impact on refreshing is great! But from
> a functional point-of-view it's bad. I know about the 'overhead per index',
> but cannot find a solution to this.
>
> Does anyone know how to get an ELK stack with "unlimited" retention?
>
> Regards,
> Renzo
>
>
>
> Op woensdag 16 april 2014 11:15:32 UTC+2 schreef Mark Walkom:
>>
>> Well once you go over 31-32GB of heap you lose pointer compression which
>> can actually slow you down. You might be better off reducing that and
>> running multiple instances per physical.
>>
>> From roughly 0.90.4 onward compression is on by default, so no need to
>> specify that. You might also want to change shards to a multiple of your
>> node count, e.g. 3, 6, 9, for more even allocation.
>> Also try moving to java 1.7u25 as that is the generally agreed version to
>> run. We run u51 with no issues though so that might be worth trialling if
>> you can.
>>
>> Finally, what are you using to monitor the actual cluster? Something like
>> ElasticHQ or Marvel will probably provide greater insights into what is
>> happening and what you can do to improve performance.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 16 April 2014 19:06, R. Toma  wrote:
>>
>>> Hi all,
>>>
>>> At bol.com we use ELK for a logsearch platform, using 3 machines.
>>>
>>> We need fast indexing (to not lose events) and want fast & near
>>> realtime search. The search is currently not fast enough. Simple "give me
>>> the last 50 events from the last 15 minutes, from any type, from today's
>>> indices, without any terms" search queries may take 1.0 sec, sometimes
>>> even exceeding 30 seconds.
>>>
>>> It currently indexes 3k docs per second, but we expect 8k/sec by the end
>>> of this year.
>>>
>>> I have included lots of specs/config at bottom of this e-mail.
>>>
>>>
>>> We found 2 reliable knobs to turn:
>>>
>>>    1. index.refresh_interval. At 1 sec fast search seems impossible.
>>>    When upping the refresh to 5 sec, search gets faster. At 10 sec it's even
>>>    faster. But when you search during the refresh (wouldn't a splay be nice?)
>>>    it's slow again. And a refresh every 10 seconds is not near realtime
>>>    anymore. No obvious bottlenecks present: cpu, network, memory, disk i/o
>>>    all OK.
>>>    2. deleting old indices. No clue why this improves things. And we
>>>    really do not want to delete old data, since we want to keep at least 60
>>>    days of data online. But after deleting old data the search speed slowly
>>>    crawls back up again...
>>>
>>>
>>> We have zillions of metrics ("measure everything") of OS, ES and JVM
>>> using Diamond and Graphite. Too much to include here.
>>> We use a nagios check that simulates Kibana queries to monitor the search
>>> speed every 5 minutes.
>>>
>>>
>>> When comparing behaviour at refresh_interval 1s vs 5s we see:
>>>
>>>    - system% cpu load: depends per server: 150 vs 80, 100 vs 50, 40 vs
>>>    25 == lower
>>>    - ParNew GC run frequency: 1 vs 0.6 (per second) == less
>>>    - CMS GC run frequency: 1 vs 4 (per hour) == more
>>>    - avg index time: 8 vs 2.5 (ms) == lower
>>>    - refresh frequency: 22 vs 12 (per second) -- still high numbers at
>>>    5 sec because we have 17 active indices every day == less
>>>    - merge frequency: 12 vs 7 (per second) == less
>>>    - flush frequency: no difference
>>>    - search speed: at 1s way too slow, at 5s (in tests timed between
>>>    the refresh bursts) search calls take ~50ms.
>>>
>>>
>>> We already looked at the threadpools:
>>>
>>>- we increased the bulk pool
>>>- we currently do not have any rejects in any pools
>>>    - the only pool that has queueing (a spike every 1 or 2 hours) is the
>>>    'management' pool (but that's probably Diamond)
>>>
>>>
>>> We have a feeling something blocks/locks upon high index and high search
>>> frequency. But what? I have looked at nearly all metrics and _cat output.
>>>
>>>
>>> 

LDAP plugin not populating

2014-04-18 Thread Tom Wilson
I'm trying to set up search of LDAP objects using the ldap river plugin. I 
managed to install the plugin and set up my new river, but all searches are 
coming up empty. The elasticsearch stdout says:

[2014-04-18 15:00:16,904][INFO ][river.ldap   ] [Silver 
Scorpion] [ldap][hpd] now, ldap river null waiting for 1m ms

Why is my ldap river "null?" Maybe someone can look at this and tell me 
what I'm doing wrong.

I am trying to index one LDAP object (objectClass=HCProfessional), which 
resides in the container ou=HCProfessional,o=testhie,dc=hpdtest

I included a list of basic attributes, and am authenticating using the 
default admin account. Here is the REST payload I sent to the server:

PUT http://localhost:9200/_river/hpd/_meta
{
    "type" : "ldap",
    "ldap" : {
        "host" : "localhost",
        "port" : "10389",
        "ssl"  : false,
        "userDn" : "uid=admin,ou=users,ou=system",
        "credentials" : "secret",
        "baseDn" : "ou=HCProfessional,o=testhie,dc=hpdtest",
        "filter" : "(objectClass=HCProfessional)",
        "scope" : "subtree",
        "attributes" : [
            "uid",
            "sn",
            "cn",
            "description",
            "facsimileTelephoneNumber",
            "gender",
            "givenName",
            "hcSpecialization",
            "hpdMedicalRecordsDeliveryEmail",
            "hpdProviderLanguageSupported",
            "hpdProviderMailingAddress",
            "mail",
            "telephoneNumber"
        ],
        "fields" : [
            "_id",
            "sn",
            "cn",
            "description",
            "facsimileTelephoneNumber",
            "gender",
            "givenName",
            "hcSpecialization",
            "hpdMedicalRecordsDeliveryEmail",
            "hpdProviderLanguageSupported",
            "hpdProviderMailingAddress",
            "mail",
            "telephoneNumber"
        ],
        "poll" : 6
    },
    "index" : {
        "index" : "hpd",
        "type" : "HCProfessional"
    }
}


Now, when I send what I think is a simple search command:

GET http://localhost:9200/hpd/_search

I get back this:


{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
    }
}



Re: Splunk vs. Elastic search performance?

2014-04-18 Thread Mark Walkom
That's a lot of data! I don't know of any installations that big but
someone else might.

What sort of infrastructure are you running splunk on now, what's your
current and expected retention?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com





Testing for an Empty String With the Following

2014-04-18 Thread Paul
Hi,

Thanks for everyone's patience while I learn the elasticsearch query DSL. 
 I'm trying to get used to its verbosity.


How would I do a query like this, again in SQL parlance:  select col1 from 
mysource where col2 = "" and col3 in ["", "one", "two"] and col4 = "foo"
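
One possible shape for such a query, sketched as a Python dict that serializes to a 1.x-style filtered query. It assumes col2, col3 and col4 are mapped as not_analyzed strings so that exact values, including the empty string, are indexed as single terms; that mapping is an assumption, not something stated in the question. The helper names are illustrative only.

```python
import json


def term(field, value):
    """Exact-match filter on a single value (SQL '=')."""
    return {"term": {field: value}}


def terms(field, values):
    """Exact-match filter on any of several values (SQL 'IN')."""
    return {"terms": {field: list(values)}}


query = {
    "_source": ["col1"],                  # select col1
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": [             # must clauses are ANDed together
                        term("col2", ""),
                        terms("col3", ["", "one", "two"]),
                        term("col4", "foo"),
                    ]
                }
            },
        }
    },
}

print(json.dumps(query, indent=2))
```

The printed JSON would be sent as the body of a search request against the index holding the data (mysource in the SQL version).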



Re: Query and Filter

2014-04-18 Thread Matt Hughes
Thanks for the quick reply!

I updated the mappings and confirmed both types read not_analyzed.   I also 
updated the query to use bool/must:

{
   "from":0,
   "size":200,
   "query":{
      "filtered":{
         "query":{
            "query_string":{
               "fields":[
                  "_all"
               ],
               "query":"\"Test message from AT by user admin was generated\""
            }
         },
         "filter":{
            "bool":{
               "must":[
                  {
                     "term":{
                        "where.appId":"12229ac6-8e9a-43ff-ab67-e80f3c585a69"
                     }
                  },
                  {
                     "term":{
                        "where.processId":"bd13dbe5-0a4c-4469-a645-44cb3fde280a"
                     }
                  }
               ]
            }
         }
      }
   }
}

Still not getting any hits though.  Tried escaping the terms.  Is there 
anything special about having nested field names like that 
'where.processId'?

On Friday, April 18, 2014 4:07:31 PM UTC-4, Matt Weber wrote:
>
> Chances are your appId and processId fields are analyzed so it is breaking 
> up the id's.  Update your mapping of these fields so it is not analyzed 
> [1].  Also, you should not use an "and" filter to combine term filters. 
>  Use a boolean filter [2] with must clauses for better performance.  Read 
> why at 
> http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/.
>
>
> [1] 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
> [2] 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
>
> Thanks,
> Matt Weber
>
>
>
> On Fri, Apr 18, 2014 at 12:52 PM, Matt Hughes wrote:
>
>> Trying to compose a query and filter combination to no avail:
>>
>> {
>>    "from":0,
>>    "size":200,
>>    "query":{
>>       "filtered":{
>>          "query":{
>>             "query_string":{
>>                "fields":[
>>                   "_all"
>>                ],
>>                "query":"\"Test message\""
>>             }
>>          },
>>          "filter":{
>>             "and":[
>>                {
>>                   "term":{
>>                      "appId":"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
>>                   }
>>                },
>>                {
>>                   "term":{
>>                      "processId":"754311ef-d807-4bb4-8c5e-1b480fb7034f"
>>                   }
>>                }
>>             ]
>>          }
>>       }
>>    }
>> }
>>
>> ES parses that fine, but it never returns any results.  I know the two 
>> fields are correct and in my index.  If I take off the 'filter', I get the 
>> expected results, but I need the filter to narrow the results.  When I 
>> compose the same query using Kibana, it tries to use an 'fquery' filter 
>> which I don't see documented anywhere:
>>
>> "filter": {
>>   "bool": {
>>     "must": [
>>       {
>>         "terms": {
>>           "_type": [
>>             "event"
>>           ]
>>         }
>>       },
>>       {
>>         "fquery": {
>>           "query": {
>>             "query_string": {
>>               "query": "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
>>             }
>>           },
>>           "_cache": true
>>         }
>>       }
>>     ]
>>   }
>>
>>
>> Any pointers would be most appreciated.  Pulling my hair out here.
>>
>
>



Splunk vs. Elastic search performance?

2014-04-18 Thread Frank Flynn
We have a large Splunk instance.  We load about 1.25 TB of logs a day.  We 
have about 1,300 loaders (servers that collect and load logs - they may do 
other things too).

As I look at Elasticsearch / Logstash / Kibana does anyone know of a 
performance comparison guide?  Should I expect to run on very similar 
hardware?  More? or Less?

Sure it depends on exactly what we're doing, the exact queries and the 
frequency we'd run them but I'm trying to get any kind of idea before we 
start.

Are there any white papers or other documents about switching?  It seems an 
obvious choice but I can find only very few performance comparisons (I 
did see that Elasticsearch just hired "the former VP of Products at Splunk, 
Gaurav Gupta" - but there were few numbers in that article either).

Thanks,
Frank



Testing for an Empty String

2014-04-18 Thread Paul
Hi,

Thanks for everyone's patience while I learn the elasticsearch query DSL. 
I'm trying to get used to its verbosity.


How would I do a query like this, again in SQL parlance:  select col1 from 
mysource where col2 = ""?
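
One possible shape for such a request, offered as a sketch rather than a definitive answer: a term filter on the empty string, with source filtering standing in for `select col1`. It assumes `col2` is mapped as a not_analyzed string, because an analyzed field produces no tokens for "" and so has nothing to match. The helper name `empty_string_query` is made up for illustration.

```python
import json


def empty_string_query(field, returned_fields):
    """Build a search body matching documents whose `field` is exactly "".

    Assumes `field` is a not_analyzed string (hypothetical mapping).
    """
    return {
        "_source": returned_fields,              # roughly "select col1"
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {field: ""}},  # roughly 'where col2 = ""'
            }
        },
    }


body = empty_string_query("col2", ["col1"])
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of a `_search` request against the index that plays the role of `mysource`.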



Continuous async replication

2014-04-18 Thread Mohit Anchlia
As I understand it, there is currently no feature that does async replication
between two clusters or even within the same cluster, but we have a need to
write one. What would be the best way to do it in elasticsearch? I was
thinking of leveraging Scroll for this.



ANN Elastisch 2.0.0-beta4 is released

2014-04-18 Thread Michael Klishin
Elastisch [1] is a small, feature-complete Clojure client for ElasticSearch.

Release notes:
http://blog.clojurewerkz.org/blog/2014/04/11/elastisch-2-dot-0-0-beta4-is-released/

1. http://clojureelasticsearch.info
-- 
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin



Re: Query and Filter

2014-04-18 Thread Matt Weber
Chances are your appId and processId fields are analyzed, so the analyzer is
breaking the IDs up into separate tokens and the term filters never match the
full values.  Update the mapping of these fields so they are not analyzed
[1].  Also, you should not use an "and" filter to combine term filters.
Use a bool filter [2] with must clauses for better performance.  Read
why at
http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/.


[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html
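
As a rough sketch of both suggestions, the mapping fragment and request body could be built like this. The type name "event" is borrowed from the Kibana snippet in the question, and the helper name `build_search` is made up for illustration. Changing how an existing field is indexed generally means reindexing the data, so treat the mapping as something to apply to a new index.

```python
import json

# Mapping fragment: keep the ID fields as single, unanalyzed terms.
# "event" is the type name seen in the question's Kibana snippet.
mapping = {
    "event": {
        "properties": {
            "appId": {"type": "string", "index": "not_analyzed"},
            "processId": {"type": "string", "index": "not_analyzed"},
        }
    }
}


def build_search(text, term_filters, size=200):
    """Phrase query on _all, narrowed by a bool filter of term clauses."""
    must = [{"term": {field: value}} for field, value in term_filters]
    return {
        "from": 0,
        "size": size,
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "fields": ["_all"],
                        "query": '"%s"' % text,   # quoted => phrase search
                    }
                },
                "filter": {"bool": {"must": must}},
            }
        },
    }


search = build_search(
    "Test message",
    [
        ("appId", "a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"),
        ("processId", "754311ef-d807-4bb4-8c5e-1b480fb7034f"),
    ],
)
print(json.dumps(search, indent=2))
```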

Thanks,
Matt Weber



On Fri, Apr 18, 2014 at 12:52 PM, Matt Hughes  wrote:

> Trying to compose a query and filter combination to no avail:
>
> {
>    "from":0,
>    "size":200,
>    "query":{
>       "filtered":{
>          "query":{
>             "query_string":{
>                "fields":[
>                   "_all"
>                ],
>                "query":"\"Test message\""
>             }
>          },
>          "filter":{
>             "and":[
>                {
>                   "term":{
>                      "appId":"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
>                   }
>                },
>                {
>                   "term":{
>                      "processId":"754311ef-d807-4bb4-8c5e-1b480fb7034f"
>                   }
>                }
>             ]
>          }
>       }
>    }
> }
>
> That parses fine by ES, but never returns the results.  I know the two
> fields are correct and in my index.  If I take off the 'filter', I get the
> expected results, but I need the filter to narrow the results.  When I
> compose the same query using Kibana, it tries to use an 'ffilter' query
> which I don't see documented anywhere:
>
> "filter": {
>   "bool": {
>     "must": [
>       {
>         "terms": {
>           "_type": [
>             "event"
>           ]
>         }
>       },
>       {
>         "fquery": {
>           "query": {
>             "query_string": {
>               "query": "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
>             }
>           },
>           "_cache": true
>         }
>       }
>     ]
>   }
>
>
> Any pointers would be most appreciated.  Pulling my hair out here.
>
>



Re: Filter first then search

2014-04-18 Thread David Smith
I'm also curious to know if there is a way to do the opposite of 
FilteredQuery... basically QueriedFilter. Filter first and then run a query 
on the filtered results.



Query and Filter

2014-04-18 Thread Matt Hughes
Trying to compose a query and filter combination to no avail:

{
   "from":0,
   "size":200,
   "query":{
      "filtered":{
         "query":{
            "query_string":{
               "fields":[
                  "_all"
               ],
               "query":"\"Test message\""
            }
         },
         "filter":{
            "and":[
               {
                  "term":{
                     "appId":"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b"
                  }
               },
               {
                  "term":{
                     "processId":"754311ef-d807-4bb4-8c5e-1b480fb7034f"
                  }
               }
            ]
         }
      }
   }
}

ES parses that fine, but it never returns any results.  I know the two 
fields are correct and in my index.  If I take off the 'filter', I get the 
expected results, but I need the filter to narrow the results.  When I 
compose the same query using Kibana, it tries to use an 'fquery' filter 
which I don't see documented anywhere:

"filter": {
  "bool": {
    "must": [
      {
        "terms": {
          "_type": [
            "event"
          ]
        }
      },
      {
        "fquery": {
          "query": {
            "query_string": {
              "query": "appId:(\"a32b782c-3c51-4d76-9b01-c4c1ffe53d8b\")"
            }
          },
          "_cache": true
        }
      }
    ]
  }


Any pointers would be most appreciated.  Pulling my hair out here.



Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith
You can use a function score query with a native script in this manner.
 

{
  "function_score" : {
    "query" : {
      "match_all" : { }
    },
    "functions" : [ {
      "filter" : {
        "terms" : {
          "myfield" : [ "103", "104", "134", "180" ],
          "_cache" : true
        }
      },
      "script_score" : {
        "script" : "myscriptname",
        "lang" : "native",
        "params" : {
          "myparam1" : "something",
          "myparam2" : "somethingElse"
        }
      }
    } ],
    "score_mode" : "sum"
  }
}



Re: Function Score Query and Native scripts

2014-04-18 Thread David Smith
Yes, function score query works with native scripts. We use it with them. 

I'm not sure whether native scripts are automatically cached.

On Saturday, April 12, 2014 1:49:32 PM UTC-4, Eric T wrote:
>
> Hi,
>
> The function score documentation doesn't mention any support for native 
> scripts, does it still work for the Function Score Query, if so is it the 
> same syntax? 
> I'm using the custom_filters_score query with a native script but the 
> query is deprecated in the latest ES version. I'm still using 0.90.3 but I 
> plan to upgrade to the latest version. 
>
> It says that the script_score function for function_score is cached. Does 
> this provide the same performance as the Native script? I'm wondering if 
> it's necessary to still use a native script or convert it to the 
> script_score function
>
> thanks
> Eric
>



Switching back to ConcurrentMergeScheduler

2014-04-18 Thread David Smith
I see that ES switched back to ConcurrentMergeScheduler in 1.1.1 because the 
scheduler used in 1.1.0 was hurting indexing performance.
https://github.com/elasticsearch/elasticsearch/issues/5817

We're on 1.1.0 and cannot upgrade to 1.1.1 for the time being. Is there a 
way to switch it back using the API? I tried the following command, but it 
does not seem to take effect.

curl -i -XPUT localhost:9200/_cluster/settings -d '{ "persistent": { 
"index.merge.scheduler.type": 
"org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider" 
} }'
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 52

{"acknowledged":true,"persistent":{},"transient":{}}


It does not seem to be set when I try to re-GET it (and no errors in logs 
at DEBUG level or above).

curl -i -XGET localhost:9200/_cluster/settings
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 66

{"persistent":{"threadpool":{"bulk":{"size":"8"}}},"transient":{}}


Am I specifying the scheduler the wrong way? I also tried just 
specifying ConcurrentMergeSchedulerProvider instead of the full class name, 
but that didn't work.

Any ideas?
David



Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber
Well, the script runs against all matching documents of the query, so you
can do a match_all query [1] to have the logic applied to all your
documents.  This is going to be expensive though, so try to filter out as
many documents as possible before applying the custom scoring.  Maybe even
perform a rescore [2] on the top X docs.  It really all depends on your
requirements though.  Run some tests and tune based on those results.

When I said to be careful, I meant: don't do a lot of blocking IO or long
running calculations, as the script is run against each matching document.
Cache results and make the script return as quickly as possible.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-all-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-rescore.html
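
To make the rescore idea concrete, here is a sketch of a request body in which a cheap query selects candidates and the native script is only applied to the top `window_size` hits per shard. The script name `click_score`, its parameters, and the helper name `build_rescore_search` are all hypothetical. The weights are placeholders to tune, not recommendations.

```python
import json


def build_rescore_search(base_query, script_name, params, window_size=100):
    """Cheap first pass, then script-based rescoring of the top hits.

    `script_name` must be a native script registered on every node
    (hypothetical here); `window_size` is applied per shard.
    """
    return {
        "query": base_query,
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "function_score": {
                        "query": {"match_all": {}},
                        "script_score": {
                            "script": script_name,
                            "lang": "native",
                            "params": params,
                        },
                    }
                },
                "query_weight": 1.0,           # weight of the original score
                "rescore_query_weight": 1.0,   # weight of the script score
            },
        },
    }


body = build_rescore_search(
    {"match": {"title": "running shoes"}},   # illustrative first-pass query
    "click_score",                           # hypothetical native script
    {"keyword": "running shoes"},
    window_size=50,
)
print(json.dumps(body, indent=2))
```

The trade-off is that documents outside the window keep their original score, so this only fits cases where the first-pass query already puts the interesting candidates near the top.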

Thanks,
Matt Weber



On Fri, Apr 18, 2014 at 9:46 AM, Srinivasan Ramaswamy wrote:

> Thats great, thanks for your reply. This looks like a good solution for my
> requirement ! Is this script applied in each shard ? I want to apply this
> function to all the documents so that the Top N picked from each shard is
> picked by my custom score.
>
> Also, can you elaborate a little bit on "be careful you can significantly
> impact your query performance if you are not careful". I would like to
> understand the best practices there.
>
> On Friday, April 18, 2014 8:14:54 AM UTC-7, Matt Weber wrote:
>>
>> Yes, you can use the Function Score Query [1] in combination with a
>> native script written in java [2].  With the native script you can
>> basically do whatever you want, but be careful you can significantly impact
>> your query performance if you are not careful.
>>
>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
>> [2] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts
>>
>> Thanks,
>> Matt Weber
>>
>>
>> On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy wrote:
>>
>>> I would like to influence the ranking with few fields that are not
>>> stored in the index (eg click data for keyword-documents). I have used
>>> custom SearchComponent in Solr to implement similar functionality in the
>>> past. I am wondering how can i achieve the same in ElasticSearch.
>>>
>>> I know this thread is a very old thread, but i didnt find much
>>> information on how to do custom scoring (in elasticsearch) with data thats
>>> not stored in the index. This thread looked very relevant to my
>>> requirement, so trying to see whether you guys have solved similar
>>> requirements with elasticsearch.
>>>
>>> Thanks
>>> Srini
>>>
>>> On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:

>>>> Hi Otis,
>>>>
>>>> So if I understand it correctly (providing my knowledge is quite
>>>> limited here) you are asking if
>>>> 1) it is possible to hook into query processing flow and inject or
>>>> extend custom handlers for individual flow phases and
>>>> 2) if we can find in ES the same functionality which is currently
>>>> provided by components listed here: http://wiki.apache.org/solr/SearchComponent
>>>> (or here: http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).
>>>>
>>>> As for #1, frankly, I do not know. I have been playing with plugins a
>>>> bit but did not have a chance to explore full potential of it yet. I
>>>> remember that Shay mentioned that not every aspect of ES is pluggable now
>>>> but that is all I know about it (personally, I did not hit the limits by
>>>> myself yet, may be I would if I wanted to employ Carrot2 clustering or
>>>> something like that)
>>>>
>>>> As for #2, if you are after one-to-one comparison of Solr
>>>> SearchComponents and ES then I think we would find some matches and also
>>>> some misses. Still it could be an interesting exercise to do (although we
>>>> should be careful to include only those features that do work well in
>>>> distributed environment). We could probably end up identifying new feature
>>>> requests, so this can be useful.
>>>>
>>>> Regards,
>>>> Lukas
>>>>
>>>> On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic wrote:
>>>>
>>>>> Hi Lukas,
>>>>>
>>>>> Yes, SearchComponents are about extensibility, but specifically about
>>>>> extending how queries are handled within Solr once Solr gets them.  I
>>>>> know ES has other types of plugins, and you've listed several of them,
>>>>> but I'm wondering about which of them is SearchComponent-like.
>>>>> I've looked at http://www.elasticsearch.org/guide/reference/modules/plugins.html,
>>>>> but couldn't find the answer to my Q there.  Maybe I'm looking at
>>>>> the wrong place?
>>>>>
>>>>> Thanks,
>>>>> Otis
>>>>> --
>>>>> Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html
>>>>>
>>>>> On Sep 6, 2:57 pm, Lukáš Vlček wrote:
>>>>> > Hi,
>>>>> >
>>>>> > I am not Solr ex

Cache cleaner in hot threads

2014-04-18 Thread Nikolas Everett
I'm still doing performance work and I keep seeing the CacheCleaner pop up
[1].  I don't know how much of an effect it's actually having, but I imagine
it's something.

It looks like entries in the cache get queued for deletion both by cache
clear commands and by readers closing.  Would it make sense to forgo
removing entries when the readers close and let the LRU policy clean it up?

Nik

[1]:
   100.4% (502.1ms out of 500ms) cpu usage by thread 'elasticsearch[elastic1001][generic][T#73]'
     2/10 snapshots sharing following 5 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.remove(LocalCache.java:4353)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)
     4/10 snapshots sharing following 8 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextInTable(LocalCache.java:4306)
       org.elasticsearch.common.cache.LocalCache$HashIterator.advance(LocalCache.java:4271)
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextEntry(LocalCache.java:4346)
       org.elasticsearch.common.cache.LocalCache$KeyIterator.next(LocalCache.java:4362)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:183)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)
     4/10 snapshots sharing following 7 elements
       org.elasticsearch.common.cache.LocalCache$HashIterator.advance(LocalCache.java:4271)
       org.elasticsearch.common.cache.LocalCache$HashIterator.nextEntry(LocalCache.java:4346)
       org.elasticsearch.common.cache.LocalCache$KeyIterator.next(LocalCache.java:4362)
       org.elasticsearch.indices.cache.filter.IndicesFilterCache$ReaderCleaner$1.run(IndicesFilterCache.java:183)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:724)



Re: searching most recent objects

2014-04-18 Thread Phil Greenberg
Oh, awesome, thank you so much for the help, I'll give that a try!
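
For anyone finding this later, the suggestion quoted below (sort on _timestamp, use X as the page size) might look roughly like this as a request body. It is a sketch that assumes the _timestamp field is enabled in the mapping. The helper name `most_recent_query` and the sample query are made up for illustration.

```python
import json


def most_recent_query(query, count, timestamp_field="_timestamp"):
    """Return the `count` newest documents matching `query`.

    Sorting newest-first and capping the page size means only the most
    recent matches come back. Assumes `timestamp_field` is enabled and
    populated (an assumption, not confirmed in the thread).
    """
    return {
        "size": count,
        "sort": [{timestamp_field: {"order": "desc"}}],
        "query": query,
    }


body = most_recent_query({"match": {"message": "error"}}, 100)
print(json.dumps(body, indent=2))
```

Sorting on a field replaces relevance ordering, so the hits come back newest first, not best match first.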

On Thursday, April 17, 2014 2:51:23 PM UTC-7, Itamar Syn-Hershko wrote:
>
> For recent X just sort on the _timestamp field and specify X as the page 
> size 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko 
> Freelance Developer & Consultant
> Author of RavenDB in Action 
>
>
> On Fri, Apr 18, 2014 at 12:43 AM, Phil Greenberg wrote:
>
>> Thanks Itamar.
>>
>> So are you saying it's not possible to ask ES for the most recent X 
>> objects that match the given query?  Only to say give me the last 30 days 
>> of objects?
>>
>>
>> On Thursday, April 17, 2014 2:39:43 PM UTC-7, Itamar Syn-Hershko wrote:
>>
>>> Filter (range filter on the date/time field) is exactly the way to do 
>>> this.
>>>
>>> Another possibility is using rolling indexes (e.g. an index per day, 
>>> like the logstash indexes are defined) but that obviously depends on a lot 
>>> of other business concerns and isn't really viable for most applications
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko 
>>> Freelance Developer & Consultant
>>> Author of RavenDB in Action 
>>>
>>>
>>> On Fri, Apr 18, 2014 at 12:36 AM, Phil Greenberg wrote:
>>>
>>>> I am also facing the same issue.
>>>>
>>>> Right now, I am just doing a filter myself, but I would assume this is 
>>>> a common use case, an ES must have a way to deal with it?
>>>>
>>>>
>>>> On Tuesday, April 15, 2014 6:24:52 PM UTC-7, Joris Bolsens wrote:
>>>>>
>>>>> I am using the javascript API and want to do a search and have it 
>>>>> search through the most recent objects, IE I call a search with size 100, 
>>>>> I want to have the most recent 100 objects returned to me, how would I go 
>>>>> about doing that?
>>>>>
>>>>> I tried using sort, but it seems that it just sorts the results after 
>>>>> the search completed
>>>>>

>>>
>>
>
>



Error installing ldap river plugin

2014-04-18 Thread Tom Wilson
I'm completely new to elasticsearch and am trying to put together a 
proof-of-concept using LDAP as a data store.

However, I came across a problem right out of the starting gate, attempting 
to install the ldap river plugin, according to the instructions here:

https://github.com/tlrx/elasticsearch-river-ldap

I got this output. What went wrong, and how do I fix it?

-tom

C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\bin>plugin -install tlrx/elasticsearch-river-ldap/0.0.2
-> Installing tlrx/elasticsearch-river-ldap/0.0.2...
Trying http://download.elasticsearch.org/tlrx/elasticsearch-river-ldap/elasticsearch-river-ldap-0.0.2.zip...
Trying http://search.maven.org/remotecontent?filepath=tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/tlrx/elasticsearch-river-ldap/0.0.2/elasticsearch-river-ldap-0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/v0.0.2.zip...
Trying https://github.com/tlrx/elasticsearch-river-ldap/archive/master.zip...
Downloading DONE
Installed tlrx/elasticsearch-river-ldap/0.0.2 into C:\Users\twilson\Downloads\elasticsearch-1.1.1\elasticsearch-1.1.1\plugins\river-ldap
Usage:
    -u, --url     [plugin location]   : Set exact URL to download the plugin from
    -i, --install [plugin name]       : Downloads and installs listed plugins [*]
    -t, --timeout [duration]          : Timeout setting: 30s, 1m, 1h... (infinite by default)
    -r, --remove  [plugin name]       : Removes listed plugins
    -l, --list                        : List installed plugins
    -v, --verbose                     : Prints verbose messages
    -s, --silent                      : Run in silent mode
    -h, --help                        : Prints this help message

 [*] Plugin name could be:
     elasticsearch/plugin/version for official elasticsearch plugins (download from download.elasticsearch.org)
     groupId/artifactId/version   for community plugins (download from maven central or oss sonatype)
     username/repository          for site plugins (download from github master)

Message:
   Error while installing plugin, reason: IllegalArgumentException: Plugin installation assumed to be site plugin, but contains source code, aborting installation.





Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Lukáš Vlček
Excellent, thanks Michael.
On 18.4.2014 18:18, "Michael McCandless" wrote:

> 1.7u55 should be safe for ElasticSearch; we just put out a blog post about
> this:
>
>
> http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/
>
> And I'll fix the nightly Lucene benchmarks to use u55 too!  I should NOT
> have been using u40: it's not safe.
>
> Mike
>
> http://blog.mikemccandless.com
>
>
> On Fri, Apr 18, 2014 at 9:52 AM, Jason Wee  wrote:
>
>> will these two links help?
>>
>> https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
>> http://people.apache.org/~mikemccand/lucenebench/indexing.html
>>
>> lucene performance test is using java 1.70 u40. that's the same version
>> i'm using for lucene 4.6.0.
>>
>> jason
>>
>>
>> On Fri, Apr 18, 2014 at 8:54 PM, Lukáš Vlček wrote:
>>
>>> Hi,
>>>
>>> is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is
>>> it safe and recommended?
>>>
>>> I found Robert and Uwe discussed this Java version here:
>>> http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
>>> I found couple of failed builds in
>>> http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16
>>> that might be related to this version of Java but all seemed to be rather
>>> Solr related.
>>>
>>> Regards,
>>> Lukas
>>>
>>>
>>
>>
>
>



Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Srinivasan Ramaswamy
That's great, thanks for your reply. This looks like a good solution for my 
requirement! Is this script applied in each shard? I want to apply this 
function to all the documents so that the Top N picked from each shard is 
picked by my custom score. 

Also, can you elaborate a little bit on "be careful you can significantly 
impact your query performance if you are not careful"? I would like to 
understand the best practices there.

On Friday, April 18, 2014 8:14:54 AM UTC-7, Matt Weber wrote:
>
> Yes, you can use the Function Score Query [1] in combination with a native 
> script written in java [2].  With the native script you can basically do 
> whatever you want, but be careful you can significantly impact your query 
> performance if you are not careful.
>
> [1] 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
> [2] 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts
>
> Thanks,
> Matt Weber
>
>
> On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy wrote:
>
>> I would like to influence the ranking with few fields that are not stored 
>> in the index (eg click data for keyword-documents). I have used custom 
>> SearchComponent in Solr to implement similar functionality in the past. I 
>> am wondering how can i achieve the same in ElasticSearch.
>>
>> I know this thread is a very old thread, but i didnt find much 
>> information on how to do custom scoring (in elasticsearch) with data thats 
>> not stored in the index. This thread looked very relevant to my 
>> requirement, so trying to see whether you guys have solved similar 
>> requirements with elasticsearch.
>>
>> Thanks
>> Srini
>>
>> On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:
>>>
>>> Hi Otis,
>>>
>>> So if I understand it correctly (providing my knowledge is quite limited 
>>> here) you are asking if
>>> 1) it is possible to hook into query processing flow and inject or 
>>> extend custom handlers for individual flow phases and
>>> 2) if we can find in ES the same functionality which is currently 
>>> provided by components listed here: http://wiki.apache.org/solr/SearchComponent
>>> (or here: http://lucene.apache.org/solr/api/org/apache/solr/handler/component/SearchComponent.html).
>>>
>>> As for #1, frankly, I do not know. I have been playing with plugins a 
>>> bit but did not have a chance to explore full potential of it yet. I 
>>> remember that Shay mentioned that not every aspect of ES is pluggable now 
>>> but that is all I know about it (personally, I did not hit the limits by 
>>> myself yet, may be I would if I wanted to employ Carrot2 clustering or 
>>> something like that)
>>>
>>> As for #2, if you are after one-to-one comparison of Solr 
>>> SearchComponents and ES then I think we would find some matches and also 
>>> some misses. Still it could be an interesting exercise to do (although we 
>>> should be careful to include only those features that do work well in 
>>> distributed environment). We could probably end up identifying new feature 
>>> requests, so this can be useful.
>>>
>>> Regards,
>>> Lukas 
>>>
>>> On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic wrote:
>>>
>>>> Hi Lukas,
>>>>
>>>> Yes, SearchComponents are about extensibility, but specifically about
>>>> extending how queries are handled within Solr once Solr gets them.  I
>>>> know ES has other types of plugins, and you've listed several of them,
>>>> but I'm wondering about which of them is SearchComponent-like.
>>>> I've looked at http://www.elasticsearch.org/guide/reference/modules/plugins.html,
>>>> but couldn't find the answer to my Q there.  Maybe I'm looking at
>>>> the wrong place?
>>>>
>>>> Thanks,
>>>> Otis
>>>> --
>>>> Sematext is hiring Search Engineers -- http://sematext.com/about/jobs.html
>>>>
>>>> On Sep 6, 2:57 pm, Lukáš Vlček wrote:
>>>> > Hi,
>>>> >
>>>> > I am not Solr expert but to me it seems that SearchComponents in Solr are
>>>> > about extensibility of out of the box functionality. If that is the case
>>>> > then I would say that we can talk about plugins in ES world. Although there
>>>> > is no official doc about how to implement custom plugins yet it is really
>>>> > not difficult. Apart from that there are several plugins that are part of
>>>> > distribution (river plugins, attachments mapper, ICU analysis, scripting
>>>> > languages ... to name a few) and they can be used as an inspiration if a new
>>>> > plugin implementation is needed.
>>>> >
>>>> > My 2 cents.
>>>> >
>>>> > Lukas
>>>> >
>>>> > On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
>>>> > > Hello,
>>>> > >
>>>> > > A long time Solr user posted a good question about ES over on Sematext
>>>> > > Blog, about an equivalent of Solr's SearchComponen

Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Michael McCandless
1.7u55 should be safe for ElasticSearch; we just put out a blog post about
this:


http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/

And I'll fix the nightly Lucene benchmarks to use u55 too!  I should NOT
have been using u40: it's not safe.

Mike

http://blog.mikemccandless.com


On Fri, Apr 18, 2014 at 9:52 AM, Jason Wee  wrote:

> will these two links help?
>
> https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
> http://people.apache.org/~mikemccand/lucenebench/indexing.html
>
> The Lucene performance test is using Java 1.7.0 u40. That's the same version
> I'm using for Lucene 4.6.0.
>
> jason
>
>
> On Fri, Apr 18, 2014 at 8:54 PM, Lukáš Vlček wrote:
>
>> Hi,
>>
>> is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is it
>> safe and recommended?
>>
>> I found Robert and Uwe discussed this Java version here:
>> http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
>> I found couple of failed builds in
>> http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16
>> that might be related to this version of Java but all seemed to be rather
>> Solr related.
>>
>> Regards,
>> Lukas
>>
>>
>
>



Getting phrase count for each document separately.

2014-04-18 Thread Amit


I would like to get a phrase count for each document separately.
I do not want to run a query for every document; I would rather run a 
single query.

For example, if I have the following documents:
{
  "name" : "John",
  "message" : "The lion is very fast"
}

{
  "name" : "Ben",
  "message" : "The lion is very fast and the bardelas is very fast"
}

I would like to query my documents for the phrase "very fast" and get 
back something like this:
{
  "name" : "John",
  "message" : "The lion is very fast",
  "count" : 1
}

{
  "name" : "Ben",
  "message" : "The lion is very fast and the bardelas is very fast",
  "count" : 2
}

I have not found a way to do this so far. I only found queries that give 
the total number of documents that contain the phrase (2 documents in my 
example).
How can I do this with an Elasticsearch query?
Thanks in advance.



Getting phrase count for each document separately.

2014-04-18 Thread Amit
I would like to get a phrase count for every document.
I do not want to run a query for every document; I would rather run a 
single query.

For example, if I have the following documents:
{
  "name" : "John",
  "Message" : "The lion is very fast"
}

{
  "name" : "Ben",
  "Message" : "The lion is very very fast"
}

I would like to query my documents for the word "very" and get back 
something like this:
{
  "name" : "John",
  "Message" : "The lion is very fast",
  "score" : 1
}

{
  "name" : "Ben",
  "Message" : "The lion is very very fast",
  "score" : 2
}

I have not found a way to do this so far. I only found queries that give 
the sum of the phrase counts of all documents together (3 in my example).
How can I do this with an Elasticsearch query?
Thanks in advance.



Re: Solr SearchComponent-like functionality?

2014-04-18 Thread Matt Weber
Yes, you can use the Function Score Query [1] in combination with a native
script written in Java [2].  With a native script you can do basically
whatever you want, but be careful, because it can significantly hurt your
query performance.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_native_java_scripts
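
A rough sketch of what such a request body could look like, written in Python
only to build the JSON. The script name "click_boost", its params and the
"title" field are hypothetical. The native script itself would have to be
implemented in Java and registered on every node, which is not shown here.

```python
import json


def build_function_score_query(user_query, script_name, params):
    """Wrap a match query in a function_score query scored by a native script.

    `script_name` must match a native script registered on the cluster;
    the name used below is a made-up placeholder.
    """
    return {
        "query": {
            "function_score": {
                # The "title" field is an assumption for illustration.
                "query": {"match": {"title": user_query}},
                "script_score": {
                    "script": script_name,
                    "lang": "native",
                    "params": params,
                },
                # Multiply the text relevance score by the script's value.
                "boost_mode": "multiply",
            }
        }
    }


body = build_function_score_query("lucene", "click_boost", {"keyword": "lucene"})
print(json.dumps(body, indent=2))
```

The params are passed to the script on every request, so data that lives
outside the index would still need to be reachable from the script on each node.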

Thanks,
Matt Weber


On Thu, Apr 17, 2014 at 11:54 PM, Srinivasan Ramaswamy
wrote:

> I would like to influence the ranking with a few fields that are not stored
> in the index (e.g. click data for keyword-documents). I have used a custom
> SearchComponent in Solr to implement similar functionality in the past. I
> am wondering how I can achieve the same in ElasticSearch.
>
> I know this is a very old thread, but I didn't find much information
> on how to do custom scoring (in Elasticsearch) with data that's not stored
> in the index. This thread looked very relevant to my requirement, so I'm trying
> to see whether you have solved similar requirements with Elasticsearch.
>
> Thanks
> Srini
>
> On Wednesday, September 7, 2011 12:18:09 PM UTC-7, Lukáš Vlček wrote:
>>
>> Hi Otis,
>>
>> So if I understand it correctly (providing my knowledge is quite limited
>> here) you are asking if
>> 1) it is possible to hook into query processing flow and inject or extend
>> custom handlers for individual flow phases and
>> 2) if we can find in ES the same functionality which is currently
>> provided by components listed here: http://wiki.apache.org/
>> solr/SearchComponent (or here: http://lucene.apache.org/solr/
>> api/org/apache/solr/handler/component/SearchComponent.html).
>>
>> As for #1, frankly, I do not know. I have been playing with plugins a bit
>> but did not have a chance to explore full potential of it yet. I remember
>> that Shay mentioned that not every aspect of ES is pluggable now but that
>> is all I know about it (personally, I did not hit the limits by myself yet,
>> may be I would if I wanted to employ Carrot2 clustering or something like
>> that)
>>
>> As for #2, if you are after one-to-one comparison of Solr
>> SearchComponents and ES then I think we would find some matches and also
>> some misses. Still it could be an interesting exercise to do (although we
>> should be careful to include only those features that do work well in
>> distributed environment). We could probably end up identifying new feature
>> requests, so this can be useful.
>>
>> Regards,
>> Lukas
>>
>> On Wed, Sep 7, 2011 at 6:17 PM, Otis Gospodnetic 
>> wrote:
>>
>>> Hi Lukas,
>>>
>>> Yes, SearchComponents are about extensibility, but specifically about
>>> extending how queries are handled within Solr once Solr gets them.  I
>>> know ES has other types of plugins, and you've listed several of them,
>>> but I'm wondering about which of them is SearchComponent-like.
>>> I've looked at http://www.elasticsearch.org/guide/reference/modules/
>>> plugins.html
>>> , but couldn't find the answer to my Q there.  Maybe I'm looking at
>>> the wrong place?
>>>
>>> Thanks,
>>> Otis
>>> --
>>> Sematext is hiring Search Engineers -- http://sematext.com/about/
>>> jobs.html
>>>
>>> On Sep 6, 2:57 pm, Lukáš Vlček  wrote:
>>> > Hi,
>>> >
>>> > I am not Solr expert but to me it seems that SearchComponents in Solr
>>> are
>>> > about extensibility of out of the box functionality. If that is the
>>> case
>>> > then I would say that we can talk about plugins in ES world. Although
>>> there
>>> > is no official doc about how to implement custom plugins yet it is
>>> really
>>> > not difficult. Apart from that there are several plugins that are part
>>> of
>>> > distribution (river plugins, attachments mapper, ICU analysis,
>>> scripting
>>> > languages ... to name a few) and they can be used as an inspiration if
>>> a new
>>> > plugin implementation is needed.
>>> >
>>> > My 2 cents.
>>> >
>>> > Lukas
>>> >
>>> > On Tue, Sep 6, 2011 at 5:35 PM, Otis Gospodnetic <
>>> otis.gospodne...@gmail.com
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > > wrote:
>>> > > Hello,
>>> >
>>> > > A long time Solr user posted a good question about ES over on
>>> Sematext
>>> > > Blog, about an equivalent of Solr's SearchComponents in ES:
>>> >
>>> > >http://blog.sematext.com/2010/05/03/elastic-search-
>>> distributed-lucene...
>>> >
>>> > > I'm curious, too.  Thanks.
>>> >
>>> > > Otis
>>> > > --
>>> > > Sematext is hiring Search Engineers --http://sematext.com/about/
>>> jobs.html
>>>
>>

Re: logstash 1.4.0 debian package init script not working

2014-04-18 Thread Goofy03
Have you checked the permissions on /opt/logstash, /var/log/logstash and 
/etc/logstash? Are they owned by the same user as the one in the init script?

That solved it for me on Debian, but I still don't get any events when the 
Apache log is updated, whereas if I run it as root (from the console) 
everything works.
Oh, and I have added the logstash user to the adm group.
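
On the JAVA question in the quoted message below: as far as I can read the
init script, it looks up the pid in the pid file and pulls JAVACMD out of
that process's environment, so on a first start (no pid file, no process)
the result is an empty string. A small Python sketch of the same lookup,
just to illustrate; the function names are mine, not part of logstash:

```python
def parse_environ(blob: bytes, key: str) -> str:
    """Return the value of `key` from a NUL-separated environ blob, or ''."""
    prefix = key.encode() + b"="
    for entry in blob.split(b"\0"):
        if entry.startswith(prefix):
            # Unlike `cut -d= -f2`, this keeps any '=' inside the value.
            return entry[len(prefix):].decode()
    return ""


def javacmd_for_pid(pid: str) -> str:
    """Read JAVACMD from /proc/<pid>/environ; errors are ignored, as in the script."""
    try:
        with open(f"/proc/{pid}/environ", "rb") as fh:
            return parse_environ(fh.read(), "JAVACMD")
    except OSError:
        return ""


sample = b"PATH=/bin:/usr/bin\0JAVACMD=/usr/bin/java\0HOME=/var/lib/logstash\0"
print(repr(parse_environ(sample, "JAVACMD")))
print(repr(javacmd_for_pid("")))  # empty pid, as when the pid file is missing
```

With an empty result, --exec "" is what start-stop-daemon receives, which
matches the "unable to stat" error in the quoted message.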

On Friday, April 18, 2014 06:36:51 UTC+2, OJ LaBoeuf wrote:
>
> The upstart job also doesn't seem to work, it just keeps dying over and 
> over again never logging anything to the logfile.  
>
> If i manually start logstash everything works normally.
>
> On Thursday, April 17, 2014 6:12:38 PM UTC-7, OJ LaBoeuf wrote:
>>
>> Running Ubuntu 12.04 64bit, the logstash init script does not work.
>>
>> here's the script that came with logstash deb
>>
>> In particular I don't understand how the script is trying to parse 
>> something from the logstash pid, before it even starts the program..?
>>
>>   log_daemon_msg "Starting $DESC"
>>
>>   # Parse the actual JAVACMD from the process' environment, we don't care about errors.
>>   JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
>>   if start-stop-daemon --test --start --pidfile "$PID_FILE" \
>>  --user "$LS_USER" --exec "$JAVA" \
>>   >/dev/null; then
>>  # Prepare environment
>>
>> I checked and JAVA is empty at this location, so what the heck is this 
>> trying to do?
>>
>>
>> running this bit:
>> sudo start-stop-daemon --test --start --pidfile /var/run/logstash.pid --user "logstash" --exec ""
>>
>> results in the same message i get at the commandline when trying to 
>> /etc/init.d/logstash start
>> start-stop-daemon: unable to stat  (No such file or directory)
>>
>>
>> Please advise.
>>
>>
>>
>> Full init script pasted below 
>>
>>
>> #!/bin/bash
>> #
>> # /etc/init.d/logstash -- startup script for LogStash.
>> #
>> ### BEGIN INIT INFO
>> # Provides:  logstash
>> # Required-Start:$all
>> # Required-Stop: $all
>> # Default-Start: 2 3 4 5
>> # Default-Stop:  0 1 6
>> # Short-Description: Starts logstash
>> # Description:   Starts logstash using start-stop-daemon
>> ### END INIT INFO
>>
>> set -e
>>
>> NAME=logstash
>> DESC="Logstash Daemon"
>> DEFAULT=/etc/default/$NAME
>>
>> if [ `id -u` -ne 0 ]; then
>>echo "You need root privileges to run this script"
>>exit 1
>> fi
>>
>> . /lib/lsb/init-functions
>>
>> if [ -r /etc/default/rcS ]; then
>>. /etc/default/rcS
>> fi
>>
>> # The following variables can be overwritten in $DEFAULT
>> PATH=/bin:/usr/bin:/sbin:/usr/sbin
>>
>> # See contents of file named in $DEFAULT for comments
>> LS_USER=logstash
>> LS_GROUP=logstash
>> LS_HOME=/var/lib/logstash
>> LS_HEAP_SIZE="500m"
>> LS_JAVA_OPTS="-Djava.io.tmpdir=${LS_HOME}"
>> LS_LOG_FILE=/var/log/logstash/$NAME.log
>> LS_CONF_DIR=/etc/logstash/conf.d
>> LS_OPEN_FILES=16384
>> LS_NICE=19
>> LS_OPTS=""
>> LS_PIDFILE=/var/run/$NAME.pid
>>
>> # End of variables that can be overwritten in $DEFAULT
>>
>> # overwrite settings from default file
>> if [ -f "$DEFAULT" ]; then
>>. "$DEFAULT"
>> fi
>>
>> # Define other required variables
>> PID_FILE=${LS_PIDFILE}
>> DAEMON=/opt/logstash/bin/logstash
>> DAEMON_OPTS="agent -f ${LS_CONF_DIR} -l ${LS_LOG_FILE} ${LS_OPTS}"
>>
>> # Check DAEMON exists
>> if ! test -e $DAEMON; then
>>log_failure_msg "Script $DAEMON doesn't exist"
>>exit 1
>> fi
>>
>> case "$1" in
>>start)
>>   if [ -z "$DAEMON" ]; then
>>  log_failure_msg "no logstash script found - $DAEMON"
>>  exit 1
>>   fi
>>
>>   # Check if a config file exists
>>   if [ ! "$(ls -A $LS_CONF_DIR/*.conf 2> /dev/null)" ]; then
>>  log_failure_msg "There aren't any configuration files in $LS_CONF_DIR"
>>  exit 1
>>   fi
>>
>>   log_daemon_msg "Starting $DESC"
>>
>>   # Parse the actual JAVACMD from the process' environment, we don't care about errors.
>>   JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
>>   if start-stop-daemon --test --start --pidfile "$PID_FILE" \
>>  --user "$LS_USER" --exec "$JAVA" \
>>   >/dev/null; then
>>  # Prepare environment
>>  HOME="${HOME:-$LS_HOME}"
>>  JAVA_OPTS="${LS_JAVA_OPTS}"
>>  ulimit -n ${LS_OPEN_FILES}
>>  cd "${LS_HOME}"
>>  export PATH HOME JAVACMD JAVA_OPTS LS_HEAP_SIZE LS_JAVA_OPTS LS_USE_GC_LOGGING
>>
>>  # Start Daemon
>>  start-stop-daemon --start -b --user "$LS_USER" -c "$LS_USER":"$LS_GROUP" \
>>-d "$LS_HOME" --nicelevel "$LS_NICE" --pidfile "$PID_FILE" --make-pidfile \
>>--exec $DAEMON -- $DAEMON_OPTS
>>
>>  sleep 1
>>
>>  # Parse the actual JAVACMD from the process' environment, we don't care about errors.
>>  JAVA=$(cat /proc/$(cat "${PID_FILE}" 2>/dev/null)/environ 2>/dev/null | grep -z ^JAVACMD= | cut -d= -f2)
>>   

Re: Elasticsearch on java7u55 ?

2014-04-18 Thread Jason Wee
will these two links help?
https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/SYSTEM_REQUIREMENTS.txt
http://people.apache.org/~mikemccand/lucenebench/indexing.html

The Lucene performance test is using Java 1.7.0 u40. That's the same version I'm
using for Lucene 4.6.0.

jason


On Fri, Apr 18, 2014 at 8:54 PM, Lukáš Vlček  wrote:

> Hi,
>
> is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is it
> safe and recommended?
>
> I found Robert and Uwe discussed this Java version here:
> http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
> I found couple of failed builds in
> http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16 that
> might be related to this version of Java but all seemed to be rather Solr
> related.
>
> Regards,
> Lukas
>
>



Re: Need some help for creating my model

2014-04-18 Thread Stefan Kruse
OK, new try. Is it generally possible to do this with the PHP API? I can't find 
anything in the docs. Maybe I'm just not seeing it. Regards, Stefan



Elasticsearch on java7u55 ?

2014-04-18 Thread Lukáš Vlček
Hi,

is anybody using Oracle Java 1.7.0_55 with Elasticsearch (v0.90.5)? Is it
safe and recommended?

I found Robert and Uwe discussed this Java version here:
http://lucene.472066.n3.nabble.com/Update-lucene-apache-org-java-recommendations-with-java7u55-td4131353.html
I found couple of failed builds in
http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/ after Apr 16 that
might be related to this version of Java but all seemed to be rather Solr
related.

Regards,
Lukas



Re: Word count per document

2014-04-18 Thread Itamar Syn-Hershko
You should be able to do this using the aggregations framework:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html

The idea is that you bucket on document ID, then on terms, and then do a
count.

But I'm not sure it was designed to handle this scenario, where you have
tens of thousands of buckets and then many unique terms in each bucket.
Maybe someone from ES core can chime in on that.

--

Itamar Syn-Hershko
http://code972.com | @synhershko 
Freelance Developer & Consultant
Author of RavenDB in Action 


On Fri, Apr 18, 2014 at 3:40 PM, Aharon Twizer wrote:

> Thanks Itamar.
>
> But with the Term Vector I'll have to make a separate call for each
> document (I can have up to 20K documents).
>
> I want to be able to make a single call with the word I'm looking for and
> to get the statistics for each document.
>
>
> On Friday, April 18, 2014 2:52:53 PM UTC+3, Aharon Twizer wrote:
>>
>> Hi,
>>
>> I'm new to ElasticSearch.
>>
>> What I want to do is to upload a few hundred documents and then look for
>> words in those documents.
>>
>> The most important part is to get the count of each word per
>> document. For example, if I look for the word "boy", the answer I'll get is
>> that it appears 3 times in document A and 5 times in document B.
>>
>> Can I do that with ElasticSearch?
>>
>> Thanks in advance!
>>
>> Cheers,
>> Aharon.
>>
>



Re: Word count per document

2014-04-18 Thread Aharon Twizer
Thanks Itamar.

But with the Term Vector I'll have to make a separate call for each 
document (I can have up to 20K documents).

I want to be able to make a single call with the word I'm looking for and 
to get the statistics for each document.


On Friday, April 18, 2014 2:52:53 PM UTC+3, Aharon Twizer wrote:
>
> Hi,
>
> I'm new to ElasticSearch.
>
> What I want to do is to upload a few hundred documents and then look for 
> words in those documents.
>
> The most important part is to get the count of each word per document. 
> For example, if I look for the word "boy", the answer I'll get is that it 
> appears 3 times in document A and 5 times in document B.
>
> Can I do that with ElasticSearch?
>
> Thanks in advance!
>
> Cheers,
> Aharon.
>



Re: Word count per document

2014-04-18 Thread Itamar Syn-Hershko
Yes, take a  look here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-termvectors.html
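
To show what to look for in the result: a term vectors response lists each
term of a field with a term_freq, which is the per-document count asked
about. A minimal Python sketch that reads it out; the response below is a
hand-written sample (not real output), and the field name "text" is an
assumption:

```python
def term_freq(response: dict, field: str, term: str) -> int:
    """Return how often `term` occurs in `field` for one term vectors response."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    return terms.get(term, {}).get("term_freq", 0)


# Hand-written sample in the shape of a term vectors response for document A.
sample_response = {
    "_id": "A",
    "term_vectors": {
        "text": {
            "terms": {
                "boy": {"term_freq": 3},
                "girl": {"term_freq": 1},
            }
        }
    },
}

print(term_freq(sample_response, "text", "boy"))
print(term_freq(sample_response, "text", "dog"))
```

Terms are stored as the analyzer produced them, so the lookup term usually
has to be lowercased in the same way.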

--

Itamar Syn-Hershko
http://code972.com | @synhershko 
Freelance Developer & Consultant
Author of RavenDB in Action 


On Fri, Apr 18, 2014 at 2:52 PM, Aharon Twizer wrote:

> Hi,
>
> I'm new to ElasticSearch.
>
> What I want to do is to upload a few hundred documents and then look for
> words in those documents.
>
> The most important part is to get the count of each word per document.
> For example, if I look for the word "boy", the answer I'll get is that it
> appears 3 times in document A and 5 times in document B.
>
> Can I do that with ElasticSearch?
>
> Thanks in advance!
>
> Cheers,
> Aharon.
>
>



Word count per document

2014-04-18 Thread Aharon Twizer
Hi,

I'm new to ElasticSearch.

What I want to do is to upload a few hundred documents and then look for 
words in those documents.

The most important part is to get the count of each word per document. 
For example, if I look for the word "boy", the answer I'll get is that it 
appears 3 times in document A and 5 times in document B.

Can I do that with ElasticSearch?

Thanks in advance!

Cheers,
Aharon.



Setting Node ID

2014-04-18 Thread Michael Salmon
I'm planning on trying out multiple nodes on one host and I'd like to be able 
to control the node id, but as far as I can see this is set in NodeEnvironment 
to the first unused value. I want to set the id so that I can include it in 
the node name, which I currently set to the hostname.

How do others handle this?



Re: Is ElasticSearch the Right Tool for This

2014-04-18 Thread Clinton Gormley
Hiya

It's a bit more verbose, but yes you can do queries like that easily.  I've
assumed that all of your fields are "exact value" not_analyzed string
fields, rather than full text fields:

GET /_search
{
  "_source": [ "col1", "col2" ],
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  { "terms": { "col3": [ "some", "value" ]}},
                  { "missing": { "field": "col3" }}
                ]
              }
            },
            {
              "bool": {
                "should": [
                  { "terms": { "col4": [ "another", "set", "values" ]}},
                  { "missing": { "field": "col4" }}
                ]
              }
            },
            { "term": { "col5": "hello" }}
          ],
          "must_not": [
            { "term": { "col6": "world" }}
          ]
        }
      }
    }
  },
  "sort": "col7"
}

All of those lookups use filters, so would be cached, making all future
executions very fast indeed.
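
If you will be generating thousands of these, it may be easier to build the
body in code. A minimal Python sketch that produces the same request body as
above; the helper names are made up, and it assumes (as in the query above)
that an empty string in the SQL IN list means "field is missing":

```python
import json


def in_or_missing(field, values):
    """Translate SQL `field IN (values)`; an empty string means the field is missing."""
    present = [v for v in values if v != ""]
    clauses = [{"terms": {field: present}}]
    if len(present) != len(values):
        clauses.append({"missing": {"field": field}})
    return {"bool": {"should": clauses}}


def build_search(columns, must, must_not, sort):
    """Assemble a filtered query from prepared filter clauses."""
    return {
        "_source": columns,
        "query": {
            "filtered": {
                "filter": {"bool": {"must": must, "must_not": must_not}}
            }
        },
        "sort": sort,
    }


body = build_search(
    columns=["col1", "col2"],
    must=[
        in_or_missing("col3", ["", "some", "value"]),
        in_or_missing("col4", ["another", "set", "", "values"]),
        {"term": {"col5": "hello"}},
    ],
    must_not=[{"term": {"col6": "world"}}],
    sort="col7",
)
print(json.dumps(body, indent=2))
```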


On 18 April 2014 08:37, Paul  wrote:

> Hi,
>
> We're looking to move our infrastructure to ElasticSearch and I have some
> concerns.  We plan on using this more as a database and less as a search
> engine.  I know there are some companies out there that are doing this, but
> I have some queries where, with one SQL command, I can get the results I
> need, whereas with ElasticSearch I would need to do filters of queries, etc.
>
>
> An example, using SQL parlance, how would I do the following statement:
>
> select col1, col2 from mytable where col3 in ["", "some", "value"] and col4
> in ["another", "set", "", "values"] and col5 = "hello" and col6 not in
> "world" order by col7.
>
> This is an example of some data I would be querying, and I would be
> performing 1000's of queries at a time.
>
>
>
> So my question:  Can ElasticSearch do this and if so, how can I do the
> above query.
>
>



Re: Wildcard query is not working.

2014-04-18 Thread Dan Tuffery
You're setting the size parameter to 0 in your queries, so they won't return 
any documents. Also, you need to have a copy of the URL value in your index 
that is not analyzed, which you can use for your wildcard query. In your 
mapping you need to specify that you want to index the URL value verbatim:

"URL": {
  "type": "string",
  "fields": {
    "untouched": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}

With the mapping above, the URL value will be indexed using the default 
standard analyzer, and a verbatim copy of the value will also be indexed in 
the 'untouched' sub-field, which you would use in the wildcard 
query:

curl -XGET 'http://localhost:9200/message_index/message_indext/_search' -d 
'{"query":{"wildcard":{"URL.untouched":"http://www.mohit-kumar-yadav.com*"}}}'
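
If the shell quoting gets in the way, the same mapping fragment and query can
be built in code and serialized. A small Python sketch, using the field names
from this thread; the helper name is made up:

```python
import json

# Multi-field mapping: analyzed "URL" plus a verbatim "URL.untouched" copy.
url_mapping = {
    "URL": {
        "type": "string",
        "fields": {
            "untouched": {"type": "string", "index": "not_analyzed"}
        },
    }
}


def wildcard_on_untouched(pattern):
    """Build a wildcard query against the not_analyzed copy of the URL."""
    return {"query": {"wildcard": {"URL.untouched": pattern}}}


body = wildcard_on_untouched("http://www.mohit-kumar-yadav.com*")
print(json.dumps(url_mapping, indent=2))
print(json.dumps(body))
```

The pattern is matched against the whole stored value, so it needs a leading
or trailing * wherever the stored URL continues.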

Dan

On Thursday, April 17, 2014 8:55:20 PM UTC+1, Mohit Kumar Yadav wrote:
>
> Hi folks,
> In my documents there is a field which contains only a URL as its value, 
> for example {"URL" : 
> "http://www.mohit-kumar-yadav.com\123124343\login_user.html" 
> }
> {"URL" : "http://www.mohit-kumar-yadav.com\home_user.html"}
> How can I search these documents?
> I am using the following queries: 
>
> 1. Curl -XGET '
> http://localhost:9200/message_index/message_indext/_search?size=0' 
> -d'{"query":{"wildcard":{"URL":"*mohit-kumar-yadav*"}}}'
>
> no result.. query return zero hits
>
> 2. Curl -XGET '
> http://localhost:9200/message_index/message_indext/_search?size=0' 
> -d'{"query":{"field":{"URL":"http://www.mohit-kumar-yadav.com"}}}'
>
> no result.. query return zero hits
>
> 3.  Curl -XGET '
> http://localhost:9200/message_index/message_indext/_search?size=0' 
> -d'{"fuzzy_like_this_field" : {"URL" : {"like_text" : "
> www.mohit-kumar-yadav.com","max_query_terms" : 25}}}'
>
> no result.. query return zero hits
>
>
> Please suggest where I am going wrong.
>
> Thanks in advance!
>
> Regards
> Mohit
>  



[ANN] Elasticsearch AWS cloud plugin 2.1.1 released

2014-04-18 Thread David Pilato
Heya,


We are pleased to announce the release of the Elasticsearch AWS cloud plugin, 
version 2.1.1.

The Amazon Web Services (AWS) Cloud plugin allows you to use the AWS API for the 
unicast discovery mechanism and to add S3 repositories.

https://github.com/elasticsearch/elasticsearch-cloud-aws/

Release Notes - elasticsearch-cloud-aws - Version 2.1.1

Update:
 * [74] - cloud-aws 2.1.0 doesn't support elasticsearch 1.1.1 
(https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/74)

Issues, pull requests, and feature requests are warmly welcome on the 
elasticsearch-cloud-aws project repository: 
https://github.com/elasticsearch/elasticsearch-cloud-aws/
For questions or comments about this plugin, feel free to use the elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/etPan.5350ef75.643c9869.13c60%40MacBook-Air-de-David.local.


Re: Is ElasticSearch the Right Tool for This

2014-04-18 Thread Pulkit Agrawal


Sent from my iPhone

On 18-Apr-2014, at 12:07 PM, Paul wrote:

> Hi,
> 
> We're looking to move our infrastructure to ElasticSearch and I have some 
> concerns.  We plan on using it more as a database and less as a search 
> engine.  I know there are some companies out there that are doing this, but I 
> have some queries where one SQL command gets me the results I need, whereas 
> with ElasticSearch I would need to combine filters, queries, etc.
> 
> 
> As an example, in SQL parlance, how would I express the following statement?
> 
> select col1, col2 from mytable where col3 in ["", "some", "value"] and col4 in 
> ["another", "set", "", "values"] and col5 = "hello" and col6 not in "world" 
> order by col7.
> 
> This is an example of the kind of data I would be querying, and I would be 
> performing thousands of queries at a time.
> 
> So my question: can ElasticSearch do this, and if so, how would I write the 
> above query?
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/a59fcffe-5671-4ee0-a6bf-d49aedd3189b%40googlegroups.com.
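
For what it's worth, the SQL statement quoted above can be expressed as a single search request. The following is a rough sketch, not a tested answer: it assumes the 1.x query DSL (a "filtered" query wrapping a "bool" filter), it assumes col3 through col6 are mapped as not_analyzed strings so that term filters compare exact values, and the function name sql_like_search is made up for illustration. The empty strings in the value lists are copied from the question as written.

```python
import json


def sql_like_search():
    """Build a search body roughly equivalent to the SQL statement above.

    Hypothetical helper; the column names and values come from the question.
    """
    return {
        # select col1, col2
        "_source": ["col1", "col2"],
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        # the where clauses joined by "and"
                        "must": [
                            {"terms": {"col3": ["", "some", "value"]}},
                            {"terms": {"col4": ["another", "set", "", "values"]}},
                            {"term": {"col5": "hello"}},
                        ],
                        # col6 not in "world"
                        "must_not": [
                            {"term": {"col6": "world"}},
                        ],
                    }
                }
            }
        },
        # order by col7
        "sort": [{"col7": {"order": "asc"}}],
    }


search_body = sql_like_search()
print(json.dumps(search_body, indent=2))
```

The printed JSON would be sent to the _search endpoint of the index that plays the role of mytable. Filters do not compute relevance scores, which suits this database-style use.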

To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5519FAE2-009C-40E7-BED8-0F344F6A22BC%40gmail.com.


Re: ELK stack needs tuning

2014-04-18 Thread R. Toma
Hi Jörg,

Thank you for pointing me to this article. I needed to read it twice, but I 
think I understand it now.

I believe shard overallocating works for use-cases where you want to store 
& search 'users' or 'products'. Such data allows you to divide all 
documents into groups to be stored in different shards using routing. All 
shards get indexed & searched.

But how does this work for logstash indices? I could create 1 index with 
365 shards (if I want 1 year of retention) and use alias routing (alias per 
date with routing to a shard) to index into a different shard every day, 
but after 1 year I need to purge a shard. And purging a shard is not easy. 
It would require a delete of every document in the shard.
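
To make the alias idea concrete, here is a sketch of the request body I have in mind for the _aliases endpoint, with one routed alias per day. The index name "logstash", the alias naming scheme, and the helper daily_alias_actions are my own assumptions for illustration.

```python
import json


def daily_alias_actions(index, days):
    """Build an _aliases request body with one routed alias per day.

    Hypothetical helper; `index` and the alias names are illustrative.
    """
    actions = []
    for day in days:
        actions.append({
            "add": {
                "index": index,
                "alias": "%s-%s" % (index, day),
                # The routing value is used for both indexing and searching.
                "routing": day,
            }
        })
    return {"actions": actions}


alias_body = daily_alias_actions("logstash", ["2014.04.17", "2014.04.18"])
print(json.dumps(alias_body, indent=2))
```

One caveat with this scheme: the routing value is hashed to pick a shard, so two different days can end up in the same shard, and even 365 shards would not guarantee one shard per day.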

Or am I missing something?

Regards,
Renzp


On Thursday, April 17, 2014 16:15:43 UTC+2, Jörg Prante wrote:
>
> "17 new indices every day" - whew. Why don't you use shard overallocating?
>
>
> https://groups.google.com/forum/#!msg/elasticsearch/49q-_AgQCp8/MRol0t9asEcJ
>
> Jörg
>
>

To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/88a6f992-400b-4fb5-80e5-7b024b17ffd6%40googlegroups.com.


Re: Can we perform the text search present in the images or pdf files through elasticsearch

2014-04-18 Thread Rafał Kuć
Hello!

The attachment plugin will use Tika to extract the text from the binary
file content that you send base64-encoded. Tika does a good job with
text extraction; however, you have to test for yourself whether your
files are parsed well enough for your use case.
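
As a rough illustration of the base64 step, the sketch below encodes raw file bytes and wraps them in a document body. The field name "file" and the helper attachment_doc are assumptions made for this example; the field would need to be mapped with type "attachment" for the plugin to process it.

```python
import base64
import json


def attachment_doc(raw_bytes, field="file"):
    """Return a document body carrying `raw_bytes` as base64 text.

    Hypothetical helper; `field` must match an attachment-typed field
    in your own mapping.
    """
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field: encoded}


# Stand-in bytes; in practice you would read them with open(path, "rb").
sample = b"%PDF-1.4 example bytes"
doc = attachment_doc(sample)
print(json.dumps(doc))
```

The resulting JSON is what you would index; whether the extracted text is good enough still has to be checked against your own files.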

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


> So can I say that the mapper-attachment plugin is made to work like below:
> Whether I am sending text file or pdf file or image file to ES , the plugin
> will extract the *text content* in all three scenarios and will store it
> into the ES and then it will be available for search as well?




To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/241416263.20140418094630%40alud.com.pl.


Re: Kibana-auth install under RHEL6 server ?

2014-04-18 Thread Andrea Martines

>
> No one ?
>

:( I keep trying but there's always a tool that does not work :/ 

To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9b9dfff2-7b8c-441e-8fd6-fee0402fcdc5%40googlegroups.com.