Re: "now throttling indexing"

2015-03-13 Thread Eric Jain
On Fri, Mar 13, 2015 at 6:09 AM, Michael McCandless  wrote:
> That is the right setting to disable store throttling, but even without 
> throttling merge writes to a fixed MB/sec, the merges can still fall behind, 
> leading to index throttling.  ES does this to protect the health of the index, 
> because too many segments will cause all sorts of trouble.
>
> What IO system is your index on?  If you're on spinning disks you could try 
> setting index.merge.scheduler.max_thread_count to 1 since spinning disks 
> struggle with concurrent merges.  See 
> http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html
>
> Also, do you leave enough (at least 50%) free RAM to the OS for buffering 
> pages?  This can make a difference with spinning disks since the OS has more 
> freedom to do read-ahead on the files being merged...

The index is on an SSD (EC2 m3.large). I'll look into what else is
going on; I had just noticed a few of those log entries and thought
setting `indices.store.throttle.type: none` might be a quick fix :-)
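For reference, both the store-throttle setting and the merge-scheduler setting mentioned above can be changed through the settings APIs rather than elasticsearch.yml. A minimal sketch of the request bodies for ES 1.x (the endpoints noted in the comments are where they would be sent on a live cluster):

```python
import json

# Transient cluster-wide setting to disable store throttling
# (would be PUT to /_cluster/settings on a live cluster).
cluster_body = {"transient": {"indices.store.throttle.type": "none"}}

# Per-index setting suggested for spinning disks
# (would be PUT to /<index>/_settings).
index_body = {"index.merge.scheduler.max_thread_count": 1}

print(json.dumps(cluster_body))
print(json.dumps(index_body))
```

Note that, as the reply above explains, disabling store throttling does not disable index throttling: if merges fall too far behind, indexing is still paused.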

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHte5%2B%2BqrQzoQAicwz9Q%2BMprcNVhVqXWKGp_CkKYefKNeWnO_A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


"now throttling indexing"

2015-03-12 Thread Eric Jain
I set `indices.store.throttle.type: none` in the elasticsearch.yml, and yet 
this shows up in the logs:

  now throttling indexing: numMergesInFlight=5, maxNumMerges=4
  stop throttling indexing: numMergesInFlight=3, maxNumMerges=4

Did I misunderstand the purpose of this setting?



Offline storage to keep data for several months, how?

2015-03-06 Thread Eric Fontana
I'm still pretty new to this stack and looking for pointers. We have a 
production 8-node cluster which keeps 14 days' worth of indices open.
Keeping more indices open seems to require more memory than we have 
available, so we use the elasticsearch-curator script to close indices
older than 14 days.

People want to be able to search (via Kibana) data from several 
months back. I've read about snapshots, and was thinking I'd like to
start moving snapshots to Amazon S3 storage and then spin up a 
Kibana/Elasticsearch instance pointing to the data living there.
Is this a good methodology? What exactly is the procedure for doing this? 
Can Elasticsearch read snapshots directly?
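The snapshot flow described here maps to two API calls in ES 1.x: register an S3 repository (which requires the cloud-aws plugin on every node), then take a snapshot into it. A rough sketch of the request bodies; the bucket, region, and index names are made-up examples:

```python
import json

# Register an S3 repository (PUT /_snapshot/s3_archive on a live cluster;
# needs the cloud-aws plugin installed on every node).
repo_body = {
    "type": "s3",
    "settings": {"bucket": "my-es-archive", "region": "us-east-1"},
}

# Snapshot the indices about to be closed
# (PUT /_snapshot/s3_archive/logstash-2015.03.01).
snapshot_body = {"indices": "logstash-2015.03.01", "include_global_state": False}

print(json.dumps(repo_body))
print(json.dumps(snapshot_body))
```

To the last question: Elasticsearch cannot search a snapshot in place; a snapshot has to be restored into a running cluster (which can be a small, temporary one spun up just for the old data) before Kibana can query it.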

Thanks.
Eric



elasticsearch-http-basic with ES 1.4.2

2015-02-19 Thread Eric
Has anyone had any luck using http-basic with ES 1.4.2? I just want to 
put some basic security on my ES instance from outside of the cluster, and 
this appears to be the easiest way, since I can just whitelist my other nodes. 
When I install and configure it, requests show up as going to the http-basic 
plugin, but it always accepts the username/password from localhost, even if I 
put the wrong info in there. It also never prompts for a username/password 
from other IPs connecting to it.

Locally it shows this:

[root@elasticsearch1 http-basic]# curl -v --user bob:wrongpassword localhost:9200
* About to connect() to localhost port 9200 (#0)
*   Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 9200 (#0)
* Server auth using Basic with user 'bob'
> GET / HTTP/1.1
> Authorization: Basic Ym9iOnBhc3N3b3JkMTIzNTU1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 
NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: localhost:9200
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain; charset=UTF-8
< Content-Length: 9
<
* Connection #0 to host localhost left intact
* Closing connection #0

From external sources it shows this in the logs:

[2015-02-19 14:56:29,816][INFO ][com.asquera.elasticsearch.plugins.http.HttpBasicServer] 
[elasticsearch1] Authorization:null, Host:192.168.1.4:9200, Path:/, :null, 
Request-IP:192.168.1.4, Client-IP:null, X-Client-IP:null
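One way to sanity-check what curl actually sent: the Basic Authorization header is just the base64 encoding of `user:password`, so the token in a `-v` trace can be decoded to see exactly which credentials reached the plugin. A small illustration (the credentials here are placeholders):

```python
import base64

# What curl puts in the Authorization header for --user bob:wrongpassword
token = base64.b64encode(b"bob:wrongpassword").decode("ascii")

# Round-trip: decoding the token recovers the original credentials,
# so the Authorization line in a -v trace reveals what was sent.
assert base64.b64decode(token) == b"bob:wrongpassword"
print(token)
```

Note also that the log line above shows Authorization:null for the external request, i.e. no credentials arrived at all; the external client has to be configured to send the header before the plugin can reject a wrong password.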



Re: Master Node vs. Data Node Architecture

2015-02-13 Thread Eric
According to Elastic HQ I currently am keeping 12 days of logs
 - 8 nodes (3 data nodes that can all be master and 5 Logstash)
 - 399 shards
 - 59 indices
 - 1,540,343,998 documents
 - 380 GB

On Thursday, February 12, 2015 at 4:41:01 PM UTC-5, Mark Walkom wrote:
>
> Except that is overkill when you only have 3 nodes.
>
> How much data do you have in the cluster?
>
> On 13 February 2015 at 01:15, Itamar Syn-Hershko wrote:
>
>> See this: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html
>>
>> Basically, the recommended pattern talks about isolating 
>> responsibilities: a node should be either a data node, a master-eligible 
>> node, or an external gateway to the cluster (client node).
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Thu, Feb 12, 2015 at 4:08 PM, Eric wrote:
>>
>>> Hello,
>>>
>>> Currently I have a 3 node ElasticSearch cluster. Each node is a RHEL VM 
>>> with 16 gig RAM. The basic config is:
>>>
>>> - All nodes can be master and are data nodes.
>>> - 3 shards and 1 replica
>>> - 6 different indexes
>>>
>>> I'm starting to run into issues of ElasticSearch bogging down on 
>>> searches and sometimes completely freezing at night. I've dedicated 9 
>>> gig to heap size and it says I'm using ~60% of the heap RAM and about 70% 
>>> of the overall heap. So even though I'm using quite a bit of the heap, I'm 
>>> not maxed out. I've attached a screenshot of the exact stats from Elastic 
>>> HQ. I'm averaging around 10,000 events/sec coming into the cluster from 6 
>>> different Logstash instances on another server.
>>>
>>> My question is: what can I do to improve the stability and speed of my 
>>> cluster? Currently I'm having issues where one node going down takes 
>>> everything else down; the HA portion isn't working very well. I'm debating 
>>> between adding one more node with the exact same specs or adding two more 
>>> smaller VMs that would act as master-only nodes. I didn't know which was 
>>> recommended or where I would get the biggest bang for the buck.
>>>
>>> Any information would be greatly appreciated.
>>>
>>> Thanks,
>>> Eric
>>>



Master Node vs. Data Node Architecture

2015-02-12 Thread Eric
Hello,

Currently I have a 3 node ElasticSearch cluster. Each node is a RHEL VM 
with 16 gig RAM. The basic config is:

- All nodes can be master and are data nodes.
- 3 shards and 1 replica
- 6 different indexes

I'm starting to run into issues of ElasticSearch bogging down on searches 
and sometimes completely freezing at night. I've dedicated 9 gig to heap 
size and it says I'm using ~60% of the heap RAM and about 70% of the 
overall heap. So even though I'm using quite a bit of the heap, I'm not 
maxed out. I've attached a screenshot of the exact stats from Elastic HQ. 
I'm averaging around 10,000 events/sec coming into the cluster from 6 
different Logstash instances on another server.

My question is: what can I do to improve the stability and speed of my 
cluster? Currently I'm having issues where one node going down takes 
everything else down; the HA portion isn't working very well. I'm debating 
between adding one more node with the exact same specs or adding two more 
smaller VMs that would act as master-only nodes. I didn't know which was 
recommended or where I would get the biggest bang for the buck.
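If the dedicated-master route is chosen, the two small VMs would carry settings along these lines in elasticsearch.yml (a sketch for ES 1.x; the quorum value depends on how many master-eligible nodes the final topology has, following the (n/2)+1 rule):

```yaml
# Master-only node: eligible to be elected master, holds no data
node.master: true
node.data: false

# Quorum of master-eligible nodes, to help avoid split-brain
discovery.zen.minimum_master_nodes: 2
```

The existing data nodes would then typically set `node.master: false` so that only the dedicated nodes are master-eligible.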

Any information would be greatly appreciated.

Thanks,
Eric



Re: ClassCast when sorting on a long field (AtomicFieldData$WithOrdinals$1 cannot be cast to AtomicNumericFieldData)

2015-02-11 Thread Eric Wittmann
Also note: I'm using the Java TransportClient to index the documents. I 
get the ClassCast problem when querying via the TransportClient, but I 
*also* get the error when querying 
via http://localhost:9200/_plugin/marvel/sense/index.html.

fwiw



ClassCast when sorting on a long field (AtomicFieldData$WithOrdinals$1 cannot be cast to AtomicNumericFieldData)

2015-02-11 Thread Eric Wittmann
This problem is really strange because my data has two long fields, and my 
search/filter works for one of them but not the other.

Even weirder, I tried to create a curl reproducer for this with the exact 
same data and that works fine!

Details here:

https://gist.github.com/EricWittmann/86864fd897a6f7496fd4

That shows the mapping of the auditEntry document type.  It then shows two 
queries, the first using "id" as the sort.  This query fails with:

"QueryPhaseExecutionException[[apiman_manager][0]: 
query[ConstantScore(cache(_type:auditEntry))],from[0],size[20],sort[!]:
 
Query Failed [Failed to execute main query]]; nested: 
ClassCastException[org.elasticsearch.index.fielddata.AtomicFieldData$WithOrdinals$1
 
cannot be cast to org.elasticsearch.index.fielddata.AtomicNumericFieldData];
"

The second is the same query but sorting on "createdOn".  This query seems 
to work fine.
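One pattern that can produce exactly this exception in ES 1.x is the sort field being mapped as a string somewhere (which yields ordinals-based field data) while the sort comparator expects numeric field data, e.g. the same field name mapped differently in another index or type. A small, self-contained sketch of checking for that kind of conflict; the mappings below are hypothetical, not the actual ones from the gist:

```python
# Hypothetical mapping fragments for the same field name in two places.
mapping_a = {"properties": {"id": {"type": "string"}}}
mapping_b = {"properties": {"id": {"type": "long"}}}

def field_type(mapping, field):
    """Return the declared type of `field` in a mapping fragment."""
    return mapping["properties"][field]["type"]

# A string-vs-long mismatch on the sort field is the kind of thing that
# makes Lucene hand back ordinal (string) field data where a numeric
# sort expects AtomicNumericFieldData.
conflict = field_type(mapping_a, "id") != field_type(mapping_b, "id")
print(conflict)
```

Comparing the live mapping (GET /{index}/_mapping) against the intended one for both long fields would confirm or rule this out.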

Any help appreciated!

-Eric

PS: Originally found this using ES 1.3.2, but also tested using 1.3.8.



Sorting and range filtering semantic versions

2015-01-25 Thread Eric Smith
I am trying to figure out some sort of indexing scheme where I can do range 
filters on semantic versions. Values look like these:

"1.0.2.5", "1.10.2.5", "2.3.434.1"

I know that I can add a separate field with the numbers padded out, but I 
was hoping to have a single field where I could do things like this:

"version:>1.0" "version:1.0.2.5" "version:1.0" "version:[1.0 TO 2.0]"

I have created some pattern capture filters to allow querying partial 
version numbers. I even created some pattern replacement filters to pad the 
values out so that they could be lexicographically sorted, but those 
filters only control the tokens that are indexed and not the value that is 
used for sorting and range filters.

Is there a way to customize the value that is used for sorting and range 
filters? It seems like it just uses the original value, and I don't have 
any control over it.
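Absent index-level support, the usual workaround is the one alluded to above: store a sibling field with each numeric component zero-padded, so that lexicographic order matches numeric order, and point sorts and range filters at that field. A sketch of the padding (the component count and width are arbitrary choices):

```python
def sortable_version(version, parts=4, width=6):
    """Zero-pad each dotted component so lexicographic order matches numeric order."""
    nums = (version.split(".") + ["0"] * parts)[:parts]
    return ".".join(n.zfill(width) for n in nums)

# Plain string comparison gets multi-digit components wrong...
assert sorted(["1.9", "1.10"]) == ["1.10", "1.9"]
# ...while the padded form sorts numerically.
assert sorted(["1.9", "1.10"], key=sortable_version) == ["1.9", "1.10"]

print(sortable_version("1.10.2.5"))  # 000001.000010.000002.000005
```

The padded value would go into a not_analyzed field populated at index time (the original field stays as-is for display and full matching), since analysis filters only affect indexed tokens, not sort/range values.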

Any help would be greatly appreciated!



Re: Index Templates

2015-01-14 Thread Eric Howard
I added the template file to each node and restarted Elasticsearch on each node, 
but I do not see the template when I issue `curl -XGET 
localhost:9200/_template/?pretty`.

If I use the API, are the changes persistent over reboots?
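On the second question: templates created via the template API are stored in the cluster state, so they do persist across restarts and don't need to be copied to each node. A sketch of such a request body (the template name, pattern, and settings are examples):

```python
import json

# Body for PUT /_template/template_1 on a live cluster.
# API-created templates live in the cluster state and survive
# restarts; no per-node files are required.
template_body = {
    "template": "logstash-*",             # index name pattern to match
    "settings": {"number_of_shards": 3},  # applied to matching new indices
}
print(json.dumps(template_body, indent=2))
```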



elasticsearch went nuts because we had a client trying to send to a closed index??

2015-01-14 Thread Eric Fontana
Someone's Redis queue was really backed up and was trying to send messages 
(using the Logstash elasticsearch_http plugin) to a closed index,
which resulted in thousands of these:

{:timestamp=>"2015-01-14T10:24:19.883000-0500", :message=>"Failed to flush 
outgoing items", :outgoing_count=>1000, :exception=>#, 
:backtrace=>["/opt/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:127:in `bulk_ftw'", 
"/opt/logstash/lib/logstash/outputs/elasticsearch/protocol.rb:80:in `bulk'", 
"/opt/logstash/lib/logstash/outputs/elasticsearch.rb:321:in `flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", 
"org/jruby/RubyHash.java:1339:in `each'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:159:in `buffer_receive'", 
"/opt/logstash/lib/logstash/outputs/elasticsearch.rb:317:in `receive'", 
"/opt/logstash/lib/logstash/outputs/base.rb:86:in `handle'", 
"/opt/logstash/lib/logstash/outputs/base.rb:78:in `worker_setup'"], 
:level=>:warn}

{:timestamp=>"2015-01-14T10:36:03.399000-0500", :message=>"Failed to flush 
outgoing items", :outgoing_count=>400, :exception=>RuntimeError, 
:backtrace=>["/opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:240:in `post'", 
"/opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:213:in `flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'", 
"org/jruby/RubyHash.java:1339:in `each'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'", 
"/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:159:in `buffer_receive'", 
"/opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:191:in `receive'", 
"/opt/logstash/lib/logstash/outputs/base.rb:86:in `handle'", 
"/opt/logstash/lib/logstash/outputs/base.rb:78:in `worker_setup'"], 
:level=>:warn}
{:timestamp=>"2015-01-14T10:36:03.577000-0500", :message=>"Error writing 
(bulk) to elasticsearch", :response=>#"application/json; charset=UTF-8", 
"content-length"=>"77"}>, @body=, @status=404, @reason="Not Found", 
@logger=#, @data={}, @metrics=#, @metrics={}, @metrics_lock=#>, 
@subscribers={}, @level=:info>, @version=1.1>, 
:response_body=>"{\"error\":\"IndexMissingException[[logstash-2014.12.27] 
missing]\",\"status\":404}", :request_body=>"", :level=>:error}


I happened to notice the index name 'logstash-2014.12.17'.

This caused everything to back up. Is there a setting somewhere where I can 
tell Elasticsearch to drop that on the floor?

Thanks.

 



Index Templates

2015-01-14 Thread Eric Howard
At 
www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html
 it states that "Index templates can also be placed within the config location 
(path.conf) under the templates directory (note, make sure to place them on all 
master eligible nodes). For example, a file called template_1.json can be 
placed under config/templates..."

I do not see a config/templates directory. Do I need to create it somewhere? 
I'm using Elasticsearch 1.0.1-1.

Thanks



Re: kibana empty dashboard

2015-01-05 Thread Eric
I solved my problem. The documentation at elasticsearch.org didn't work 
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-http.html), 
but it wasn't entirely their fault.

The options to use in your /etc/elasticsearch/elasticsearch.yml file if 
you're using elasticsearch 1.4.x with kibana 3.x are:

http.cors.allow-origin: "/.*/" 
http.cors.enabled: true

Source: http://stackoverflow.com/a/26884367/2015890

The documentation on elasticsearch.org says to just use an asterisk (*), 
but that didn't work.

This didn't work:

http.cors.allow-origin: * 

But this did:

http.cors.allow-origin: "*" 


Figures.

Furthermore, I would like to get SSL to work, but I think this will have to do 
for now.



On Monday, January 5, 2015 5:39:43 AM UTC-5, Eric wrote:
>
> Here are the versions that I'm running:
>
> # Kibana version
> Kibana 3.1.2-07bbd7e
> eeded13255f154eaeceb4cf83105e4b4  kibana-3.1.2.tar.gz
>
> # Logstash version
> [root@elk ~]# /opt/logstash/bin/logstash version
> logstash 1.4.2-modified
> 1db9f0864ff4b89380b39c39bc419031  logstash-1.4.2-1_2c0f5a1.noarch.rpm
>
> # Elasticsearch version
> [root@elk ~]# /usr/share/elasticsearch/bin/elasticsearch -v
> Version: 1.4.2, Build: 927caff/2014-12-16T14:11:12Z, JVM: 1.7.0_51
> 6e2061f0734f9dbab263c1616701c1fe  elasticsearch-1.4.2.noarch.rpm
>
> # OS
> CentOS (CentOS-7.0-1406-x86_64-Everything.iso)
> Installed packages: Basic Web Server + Development tools
>
> Logstash runs fine. Elasticsearch runs fine. Kibana runs, but only shows 
> the screenshot shown below at, https://logstasht/#/dashboard
>
>
>
> <https://lh3.googleusercontent.com/-8mIiX5lKJ_U/VKpmMkRSftI/AAACYWM/v4LxHMzEAGI/s1600/kibana.png>
>
>
>
>
> On Wednesday, May 14, 2014 6:56:03 PM UTC-4, Mark Walkom wrote:
>>
>> I think you have extra quotes causing a problem, try - elasticsearch: "
>> http://192.168.10.25:9200";,
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 15 May 2014 05:58,  wrote:
>>
>>> I have the following is showing up when I pull up my kibana dashboard: 
>>>
>>> http://192.168.10.25/#/dashboard
>>>
>>>  {{dashboard.current.title}} 
>>>
>>> When I tail my logs I see the following 
>>> 2014/05/14 13:31:45 [error] 17152#0: *7 open() 
>>> "/var/www/kibana/app/diashboards/dashboard" failed (2: No such file or 
>>> directory), client: 192.168.11.53, server: 192.168.10.25, request: "GET 
>>> /app/diashboards/dashboard HTTP/1.1", host: "192.168.10.25" 
>>>
>>> I have been pulling my hair out over this, all help would be appreciated 
>>>
>>> This is my config.js 
>>>
>>>  /** @scratch /configuration/config.js/2 
>>>* === Parameters 
>>>*/ 
>>>   return new Settings({ 
>>>
>>> /** @scratch /configuration/config.js/5 
>>>  *  elasticsearch 
>>>  * 
>>>  * The URL to your elasticsearch server. You almost certainly don't 
>>>  * want +>> href="http://localhost:9200+";>http://localhost:9200+ here. Even if Kibana 
>>> and Elasticsearch are on 
>>>  * the same host. By default this will attempt to reach ES at the 
>>> same host you have 
>>>  * kibana installed on. You probably want to set it to the FQDN of 
>>> your 
>>>  * elasticsearch host 
>>>  */ 
>>> elasticsearch: "http://"192.168.10.25":9200";, 
>>> /*elasticsearch: "http://"+window.location.hostname+":9200";, 
>>>
>>> /** @scratch /configuration/config.js/5 
>>>  *  default_route 
>>>  * 
>>>  * This is the default landing page when you don't specify a 
>>> dashboard to load. You can specify 
>>>  * files, scripts or saved dashboards here. For example, if you had 
>>> saved a dashboard called 
>>>  * `WebLogs' to elasticsearch you might use: 
>>>  * 
>>>  * +default_route: '/dashboard/elasticsearch/WebLogs',+ 
>>>  */ 
>>> default_route : '/dashboard/file/default.json', 
>>>
>>> /** @scratch /configuration/config.js/5 
>>>  *  kibana-int 
>>>  * 
>>>  * The default ES index to use for storing Kibana specific object 
>>>  * su

Re: kibana empty dashboard

2015-01-05 Thread Eric


Here are the versions that I'm running:

# Kibana version
Kibana 3.1.2-07bbd7e
eeded13255f154eaeceb4cf83105e4b4  kibana-3.1.2.tar.gz

# Logstash version
[root@elk ~]# /opt/logstash/bin/logstash version
logstash 1.4.2-modified
1db9f0864ff4b89380b39c39bc419031  logstash-1.4.2-1_2c0f5a1.noarch.rpm

# Elasticsearch version
[root@elk ~]# /usr/share/elasticsearch/bin/elasticsearch -v
Version: 1.4.2, Build: 927caff/2014-12-16T14:11:12Z, JVM: 1.7.0_51
6e2061f0734f9dbab263c1616701c1fe  elasticsearch-1.4.2.noarch.rpm

# OS
CentOS (CentOS-7.0-1406-x86_64-Everything.iso)
Installed packages: Basic Web Server + Development tools

Logstash runs fine. Elasticsearch runs fine. Kibana runs, but only shows 
the screenshot below at https://logstasht/#/dashboard







On Wednesday, May 14, 2014 6:56:03 PM UTC-4, Mark Walkom wrote:
>
> I think you have extra quotes causing a problem, try - elasticsearch: "
> http://192.168.10.25:9200";,
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 15 May 2014 05:58, > wrote:
>
>> I have the following is showing up when I pull up my kibana dashboard: 
>>
>> http://192.168.10.25/#/dashboard
>>
>>  {{dashboard.current.title}} 
>>
>> When I tail my logs I see the following 
>> 2014/05/14 13:31:45 [error] 17152#0: *7 open() 
>> "/var/www/kibana/app/diashboards/dashboard" failed (2: No such file or 
>> directory), client: 192.168.11.53, server: 192.168.10.25, request: "GET 
>> /app/diashboards/dashboard HTTP/1.1", host: "192.168.10.25" 
>>
>> I have been pulling my hair out over this, all help would be appreciated 
>>
>> This is my config.js 
>>
>>  /** @scratch /configuration/config.js/2 
>>* === Parameters 
>>*/ 
>>   return new Settings({ 
>>
>> /** @scratch /configuration/config.js/5 
>>  *  elasticsearch 
>>  * 
>>  * The URL to your elasticsearch server. You almost certainly don't 
>>  * want +> href="http://localhost:9200+";>http://localhost:9200+ here. Even if Kibana 
>> and Elasticsearch are on 
>>  * the same host. By default this will attempt to reach ES at the 
>> same host you have 
>>  * kibana installed on. You probably want to set it to the FQDN of 
>> your 
>>  * elasticsearch host 
>>  */ 
>> elasticsearch: "http://"192.168.10.25":9200";, 
>> /*elasticsearch: "http://"+window.location.hostname+":9200";, 
>>
>> /** @scratch /configuration/config.js/5 
>>  *  default_route 
>>  * 
>>  * This is the default landing page when you don't specify a 
>> dashboard to load. You can specify 
>>  * files, scripts or saved dashboards here. For example, if you had 
>> saved a dashboard called 
>>  * `WebLogs' to elasticsearch you might use: 
>>  * 
>>  * +default_route: '/dashboard/elasticsearch/WebLogs',+ 
>>  */ 
>> default_route : '/dashboard/file/default.json', 
>>
>> /** @scratch /configuration/config.js/5 
>>  *  kibana-int 
>>  * 
>>  * The default ES index to use for storing Kibana specific object 
>>  * such as stored dashboards 
>>  */ 
>> kibana_index: "kibana-int", 
>>
>> /** @scratch /configuration/config.js/5 
>>  *  panel_name 
>>  * 
>>  * An array of panel modules available. Panels will only be loaded 
>> when they are defined in the 
>>  * dashboard, but this list is used in the "add panel" interface. 
>>  */ 
>> panel_names: [ 
>>   'histogram', 
>>   'map', 
>>   'pie', 
>>   'table', 
>>   'filtering', 
>>   'timepicker', 
>>   'text', 
>>   'hits', 
>>   'column', 
>>   'trends', 
>>   'bettermap', 
>>   'query', 
>>   'terms', 
>>   'stats', 
>>   'sparklines' 
>> ] 
>>   }); 
>> }); 
>>
>> ngix (default)
>>
>> /** @scratch /configuration/config.js/1
>>  * == Configuration
>>  * config.js is where you will find the core Kibana configuration. This 
>> file contains parameter that
>>  * must be set before kibana is run for the first time.
>>  */
>> define(['settings'],
>> function (Settings) {
>>
>>
>>   /** @scratch /configuration/config.js/2
>>* === Parameters
>>*/
>>   return new Settings({
>>
>> /** @scratch /configuration/config.js/5
>>  *  elasticsearch
>>  *
>>  * The URL to your elasticsearch server. You almost certainly don't
>>  * want +http://localhost:9200+ here. Even if Kibana and 
>> Elasticsearch are on
>>  * the same host. By default this will attempt to reach ES at the 
>> same host you have
>>  * kibana installed on. You probably want to set it to the FQDN of 
>> your
>>  * elasticsearch host
>>  */
>> elasticsearch: "http://"192.168.10.25":9200";,
>>
>>

Re: Geohash grid aggregation broken in 1.4.0!

2014-11-17 Thread Eric Jain
On Mon, Nov 17, 2014 at 10:48 AM, Eric Jain  wrote:
> If you are using the "geohash grid" aggregation, and your index can contain
> documents with more than one value, you may want to hold off migrating to
> 1.4.0 [...]

The correct issue link is
https://github.com/elasticsearch/elasticsearch/issues/8507



Geohash grid aggregation broken in 1.4.0!

2014-11-17 Thread Eric Jain
If you are using the "geohash grid" aggregation, and your index can contain 
documents with more than one value, you may want to hold off migrating to 
1.4.0 (see the comment at 
https://github.com/elasticsearch/elasticsearch/issues/8512)...



Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-14 Thread Eric Jain
On Fri, Nov 14, 2014 at 3:41 AM,   wrote:
> I'm also seing this problem when a 1.4.0 node tries joining a 1.3.4 cluster
> with cloud-aws plugin version 2.4.0. Is there a workaround to use during
> upgrade, since I assume it's not a problem when they're all upgraded to
> 1.4.0.

I ended up starting a new cluster (ignoring all the warnings logged on
startup), and restoring from a snapshot. Once all the 1.3.4 nodes were
gone, no issues.
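For anyone following the same path, the restore side of that workaround is a single API call once the snapshot repository has been registered on the new cluster. A sketch of the request body; the index pattern is an invented placeholder:

```python
import json

# Body for POST /_snapshot/my_repo/snapshot_1/_restore on the new cluster.
restore_body = {
    "indices": "my-index-*",        # hypothetical pattern; restores matching indices
    "include_global_state": False,  # keep the new cluster's own cluster-level state
}
print(json.dumps(restore_body))
```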

-- 
Eric Jain
Got data? Get answers at zenobase.com.



Re: 1.4.0 data node can't join existing 1.3.4 cluster

2014-11-13 Thread Eric Jain
On Thu, Nov 13, 2014 at 10:05 AM, joergpra...@gmail.com
 wrote:
> Do not mix 1.3 with 1.4 nodes, it does not work.

If that is so, that seems like something the release notes should mention?



Strategies for having changes in graph data being reflected in ElasticSearch index

2014-11-13 Thread Eric van der Staaij
 

Hi all,

I'm investigating possible strategies for the following situation:

I have data in a graph where the nodes and edges represent knowledge about 
certain topics. These topics may occur in unstructured text. The knowledge 
about these topics is used in an analysis process to make sense of 
unstructured text. The analysis results are indexed in ElasticSearch. The 
graph is stored simply in MySQL for now. It's not really large (about 4000 
nodes and 4000 edges/relationships), but the expectation is that this will 
grow substantially.

The most important part of the analysis process involves identifying how 
well topics are represented in the unstructured text. This is done based on 
a number of rules which are represented in the knowledge graph. The 
analysis results of a single piece of unstructured text consists of a list 
of identified topics as well as a number of characteristics per topic. A 
topic is considered well-represented when it is found by more rules from the 
knowledge graph. That is, one piece of text may have a topic represented by 
meeting a single rule, but if a second piece of text has the same topic 
represented by meeting 10 rules, the second document should score better in 
search results.

Searching the analysis results through ElasticSearch is performed using a 
combination of filters and queries. Score is calculated using a function 
score query. The script score part of this uses document fields (the 
characteristics for each topic) as well as a number of parameters in the 
formula.
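
For illustration, a function score query of that shape might look like the 
following (a sketch only; the field name `rule_hits`, the parameter 
`weight_factor`, and the topic value are invented, not the actual ones used):

```json
{
  "query": {
    "function_score": {
      "query": { "match": { "topics": "topic_42" } },
      "script_score": {
        "script": "_score * doc['rule_hits'].value * weight_factor",
        "params": { "weight_factor": 1.5 }
      }
    }
  }
}
```

Here a precalculated per-topic characteristic (`rule_hits`) is read from the 
document at query time and combined with a tunable parameter.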

When I search for the data, the query contains a number of topics I wish to 
search for (let's say 40 topics) and finds documents that match best. I am 
getting the right results when I search the data, which is great.

The only issue I have is the following: The knowledge in the graph is 
updated regularly. Updates to the graph are required to be reflected in the 
scoring of documents in the ElasticSearch index, leading to better search 
results.

There are different strategies to have the changes to the graph reflected 
in the scoring by ElasticSearch:

- *Periodically re-analyse all pieces of unstructured text and index the 
results in ElasticSearch again* - A lot of precalculations are performed 
and stored in the ElasticSearch index. An index alias could be used to 
switch between a "live" and "rebuilding" index. The benefit here is that it 
is easy to implement and queries are fast (under 50 ms), since much is 
precalculated. The drawback is that changes in the graph are only reflected 
in the ElasticSearch scoring after a delay (in my case about 8 hours), 
because the analysis process takes that long to run.

-  *Move parts of the analysis process to query-execution time* by 
dynamically building a filter+query using the knowledge graph to identify 
the topics and calculate the characteristics where possible on the fly 
using function score queries with script scores. The benefit is that the 
changes in the graph do not always require periodic updates to the entire 
index. The drawback here is that if a graph section used to build the query 
has lots of related nodes, the resulting query DSL becomes huge and has 
lots of bool clauses. This adds overhead to programmatically construct the 
query and send it to ElasticSearch, and ElasticSearch also takes longer to 
execute it (about 800 milliseconds). Going this route I have queries that 
are about 2 megabytes and contain 4000+ boolean clauses.
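
As an aside, the atomic switch between the "live" and "rebuilding" index in 
the first strategy can be done with the aliases API, which applies both 
actions in a single atomic step (index and alias names below are 
placeholders):

```json
POST /_aliases
{
  "actions": [
    { "remove": { "index": "analysis_v1", "alias": "analysis_live" } },
    { "add":    { "index": "analysis_v2", "alias": "analysis_live" } }
  ]
}
```

Searches always go through `analysis_live`, so clients never see the rebuild 
in progress.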

My wish is that I have changes updated asap in ElasticSearch. Within a 
couple of seconds is fine.

I am wondering if there are other strategies possible. I hope the above 
clarifies my challenges enough for you to answer, but ask away if you have 
questions. I just can't detail too much because of non-disclosure :)

I'm open to using other technologies besides ElasticSearch, as well as 
ElasticSearch plugins.

Kind regards,

Eric



1.4.0 data node can't join existing 1.3.4 cluster

2014-11-13 Thread Eric Jain
(using elasticsearch-cloud-aws 2.4)

This should work, right? Or do I need to upgrade the cluster to 1.3.5 first?

The connection fails after a few errors like:

2014-11-13 07:18:22,498 [WARN] org.elasticsearch.discovery.zen.ping.unicast 
- [Porcupine] failed to send ping to 
[[#cloud-i-b743e456-0][530-1d][inet[/10.186.145.210:9300]]]
org.elasticsearch.transport.RemoteTransportException: 
[Nomad][inet[/10.186.145.210:9300]][internal:discovery/zen/unicast]
Caused by: org.elasticsearch.transport.ActionNotFoundTransportException: No 
handler for action [internal:discovery/zen/unicast]
at 
org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:210)
 
~[org.elasticsearch.elasticsearch-1.4.0.jar:na]
at 
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:111)
 
~[org.elasticsearch.elasticsearch-1.4.0.jar:na]
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 
~[org.elasticsearch.elasticsearch-1.4.0.jar:na]



Re: Geo Spatial Search returning 0 results

2014-10-23 Thread Eric Uldall
So, I decided to switch to a simpler mapping and it resolved the problem. 
Instead of using "pin.location", I'm just using "location" now. Works for me!

Good day!
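
For anyone else who lands here: the most common cause of zero hits with 
geo_distance is a field that was never mapped as `geo_point` (dynamic mapping 
indexes lat/lon values as plain numbers). A minimal sketch, with made-up 
index/type names and coordinates:

```json
PUT /places
{
  "mappings": {
    "place": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}

POST /places/place/_search
{
  "query": {
    "filtered": {
      "filter": {
        "geo_distance": {
          "distance": "10km",
          "location": { "lat": 34.05, "lon": -118.24 }
        }
      }
    }
  }
}
```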

On Thursday, October 23, 2014 4:47:31 PM UTC-7, Eric Uldall wrote:
>
> Hi friends,
>
> I've done a bit of research and have yet to achieve the AHA! moment in my 
> geo spatial searching troubles.
>
> I've created a gist of the way i'm structuring my indexes for your review: 
> https://gist.github.com/ericuldall/a2e08503b1321b9fcc67
>
> I am able to see my documents when I do a search against all docs in my 
> index, but when I add the geo_distance filter it returns 0, even if I use the 
> exact lat, lon of a record in my index.
>
> Any help is much appreciated.
>
> I'll be eagerly awaiting any feedback.
>
>
> Thank you,
>
> Eric
>



Geo Spatial Search returning 0 results

2014-10-23 Thread Eric Uldall
Hi friends,

I've done a bit of research and have yet to achieve the AHA! moment in my 
geo spatial searching troubles.

I've created a gist of the way i'm structuring my indexes for your 
review: https://gist.github.com/ericuldall/a2e08503b1321b9fcc67

I am able to see my documents when I do a search against all docs in my 
index, but when I add the geo_distance filter it returns 0 results, even if 
I use the exact lat, lon of a record in my index.

Any help is much appreciated.

I'll be eagerly awaiting any feedback.


Thank you,

Eric



Wildcard in an exact phrase query_string search with escaped quotes

2014-10-22 Thread Eric Sloan
Updating a post from 2012.

I have a requirement to allow a wildcard within an exact phrase 
query_string.  

POST _search
{
  "query": {
    "query_string": {
      "query": "\"coors brew*\"",
      "analyze_wildcard": true
    }
  }
}


I get the following zero-result response:

{
   "took": 94,
   "timed_out": false,
   "_shards": {
  "total": 5,
  "successful": 5,
  "failed": 0
   },
   "hits": {
  "total": 0,
  "max_score": null,
  "hits": []
   }
}


My expectation is to get variations of the exact match (below) looking 
through all fields in our document.

   - Coors Brewing
   - Coors Brewery
   - Coors Brews
   - etc 
   - etc
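
For what it's worth, `query_string` does not expand a wildcard inside a 
quoted phrase; the `*` is simply analyzed away, so the query above searches 
for the plain phrase "coors brew" and finds nothing. When the wildcard is 
only needed on the final term, `match_phrase_prefix` is one alternative (a 
sketch; the original query searches all fields, so `_all` is used here):

```json
POST /_search
{
  "query": {
    "match_phrase_prefix": {
      "_all": "coors brew"
    }
  }
}
```

This matches documents where "coors" is followed by any term starting with 
"brew" (Brewing, Brewery, Brews, ...).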



Re: Wildcards in exact phrase in query_string search

2014-10-22 Thread Eric
Dara,

I realize this is an old post, but I am having the same issue.

Was there a suggested solution that got you through?

Eric





--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Wildcards-in-exact-phrase-in-query-string-search-tp4020826p4065258.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



how to catch/report elasticsearch mapping errors?

2014-10-16 Thread Eric Fontana
Hello there. I noticed that when Elasticsearch has a mapping error, the only 
place it gets reported is in the elasticsearch.log file as an exception 
(multi-line log entry). Is there an API to get these? I'd like to log these 
errors so I can report them back to the creator. What's the best way to 
handle this?



problem adding rivers.

2014-09-28 Thread Eric Stillwagon
Hello, 
I'm working in a 12 server cluster, and I'm trying to restart Rivers and 
running into a problem.  I have 3 in particular that always seem to want to 
go to the same host, and I can get two of them to work, but when I go to 
add the 3rd, it doesn't get assigned to a host and never starts.  If I 
restart Elasticsearch on the host that it seems to prefer, then the 3rd one 
starts, but the first two die.  There are no errors in the Elasticsearch 
log, it just stops receiving updates.  
Is there a way to force these 3 Rivers to go to other hosts?  Or can I look 
into why they don't want to cooperate on the host they're being assigned to?  
These are a small subset of the total number of Rivers that we're running, 
and it seems to be either a problem with this host or with these 3 rivers, 
but I don't have a clue where to start looking.  

Any information you can provide would be appreciated.  



Types - Array vs Nested vs Object

2014-09-13 Thread Eric Rodriguez
Hi,

I'm trying to understand the differences and limitations of these "multi 
field" core types:
- Array: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-array-type.html
- Nested: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html
- Object: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-object-type.html

So far I found this article: 
http://obtao.com/blog/2014/04/elasticsearch-advanced-search-and-nested-objects/

Do you have other advices, presentations or resources to know when you'd 
better use each type?
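
In case a concrete example helps frame the comparison: arrays need no special 
mapping (any field can hold one or more values), so the practical choice is 
between `object`, which flattens inner fields and loses the association 
between fields of the same array element, and `nested`, which indexes each 
element as a hidden sub-document that must be queried with nested queries. A 
sketch with invented field names:

```json
{
  "mappings": {
    "book": {
      "properties": {
        "authors_flat": {
          "type": "object",
          "properties": {
            "first": { "type": "string" },
            "last":  { "type": "string" }
          }
        },
        "authors_nested": {
          "type": "nested",
          "properties": {
            "first": { "type": "string" },
            "last":  { "type": "string" }
          }
        }
      }
    }
  }
}
```

With `authors_flat`, a query for first "john" AND last "smith" can match 
across two different authors in the same array; with `authors_nested`, it 
only matches when both values come from the same element.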

Thanks,
Eric



Determine if search term is a noun?

2014-08-27 Thread Eric Greene
Wondering if there is a way to determine if search terms are nouns.

Could some sort of dictionary list be put together and stored, then used to 
give some weight to items in that list?

Has anyone ever done anything like this?



Re: How do I start elasticsearch as a service?

2014-08-26 Thread Eric Greene
Thanks Mark, I found that if I comment out the line in elasticsearch.yml 
that sets the data path, it works.

I will upgrade as you have suggested, thanks for that.


On Tuesday, August 26, 2014 4:04:05 PM UTC-7, Mark Walkom wrote:
>
> Check the logs under /var/log/elasticsearch, they should have something.
>
> Also please be aware that 1.2.0 has a critical bug and you should be using 
> 1.2.1 instead.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 27 August 2014 08:42, Eric Greene > 
> wrote:
>
>> Forgive me I'm a little lost.
>>
>> I am working on deploying elasticsearch on a AWS server.  Previously in 
>> development I have started elasticsearch using ./bin/elasticsearch 
>> -Des.config=/etc/elasticsearch/elasticsearch.yml
>>
>> But in live deployment, I want to keep elasticsearch running as a 
>> service...
>>
>> I have 1.2.0 installed on Ubuntu 12.04 on my AWS instance.
>>
>> I run sudo /etc/init.d/elasticsearch start and I get:
>> * Starting Elasticsearch server
>>
>> I check sudo /etc/init.d/elasticsearch status and I get:
>> * elasticsearch is not running
>>
>> I'm not sure how to troubleshoot.  Any advice or suggestions?  Thanks
>>
>>
>
>



How do I start elasticsearch as a service?

2014-08-26 Thread Eric Greene
Forgive me I'm a little lost.

I am working on deploying elasticsearch on an AWS server.  Previously in 
development I have started elasticsearch using ./bin/elasticsearch 
-Des.config=/etc/elasticsearch/elasticsearch.yml

But in live deployment, I want to keep elasticsearch running as a service...

I have 1.2.0 installed on Ubuntu 12.04 on my AWS instance.

I run sudo /etc/init.d/elasticsearch start and I get:
* Starting Elasticsearch server

I check sudo /etc/init.d/elasticsearch status and I get:
* elasticsearch is not running

I'm not sure how to troubleshoot.  Any advice or suggestions?  Thanks



Re: Search terms matching order of precedence?

2014-08-22 Thread Eric Greene
Hi Vineeth thanks so much this looks like it will help me.

I have another question, if you don't mind... (or should I post a new 
question?)

I would like to specify my top results based on:

1) A description field and tags both are hits.
2) Description field only is a hit.
3) Tags only have a hit. 

Is there something I can learn about to understand this?  Thanks Eric
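
One way to express that ordering is a bool query with boosted should clauses, 
so documents matching both fields outscore documents matching either alone (a 
sketch only; field names and boost values are just examples):

```json
{
  "query": {
    "bool": {
      "should": [
        { "match": { "description": { "query": "word A word B", "boost": 2 } } },
        { "match": { "tags": { "query": "word A word B", "boost": 1 } } }
      ],
      "minimum_should_match": 1
    }
  }
}
```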




On Friday, August 22, 2014 10:50:48 AM UTC-7, vineeth mohan wrote:
>
> Hello Eric , 
>
> Please explore phrase query - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-match-query.html#_phrase
>
> Also there is query_string type which has support for AND , OR etc and 
> even phrase query - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-dsl-query-string-query
>
> Hope that helps.
>
> Thanks
>Vineeth
>
>
> On Fri, Aug 22, 2014 at 11:17 PM, Eric Greene  > wrote:
>
>> Hi everyone, I would like to take a search query with multiple terms and 
>> possibly define an order of precedence in the following way.
>>
>> (bare with me as I become familiar with elasticsearch lingo!)
>>
>> Can I specify that the exact match is first (The search words "word A 
>> word B" matches "word A + word B"), 
>> then it matches "word B + word A",
>> then just "word B" is found,
>> then either "word A or word B"
>>
>> I'd like to understand how to shuffle the above variations as well?
>>
>> Thanks much. 
>>
>>
>
>



Search terms matching order of precedence?

2014-08-22 Thread Eric Greene
Hi everyone, I would like to take a search query with multiple terms and 
possibly define an order of precedence in the following way.

(bear with me as I become familiar with elasticsearch lingo!)

Can I specify that the exact match is first (The search words "word A word 
B" matches "word A + word B"), 
then it matches "word B + word A",
then just "word B" is found,
then either "word A or word B"

I'd like to understand how to shuffle the above variations as well?

Thanks much. 



Re: Replicating TermsFacet functionality with aggregations

2014-08-08 Thread Eric Jain
https://github.com/elasticsearch/elasticsearch/issues/7213

On Thursday, February 13, 2014 3:31:34 PM UTC-8, Adrien Grand wrote:
>
> Indeed, I don't think this is possible with aggregations right now. Can 
> you open an issue on Github?
>
>
> On Wed, Feb 12, 2014 at 6:38 PM, Brian Hudson  > wrote:
>
>> Loving aggregations so far, great work guys.
>>
>> I'm trying to convert some code I have which uses TermsFacet to use 
>> aggregations instead, however I am not sure how to  replicate the 
>> TermFacet.getOtherCount() functionality.To replicate 
>> TermsFacet.getMissingCount() I am simply adding a "missing" aggregation, 
>> but I don't see any aggregation equivalent for other count.
>>
>> Does anyone have any suggestions on how to replicate this functionality 
>> using aggregations?
>>
>> Brian
>>
>>
>
>
>
> -- 
> Adrien Grand
>  



Correct place to report bug with Sense (included in Marvel plugin)?

2014-07-28 Thread Eric Brunson
I wasn't able to find a repo on github, how are bugs reported?

Thanks,
e.



Re: Search by cardinality of a field

2014-07-28 Thread Eric Brunson
In case anyone else needs the answer, I was able to make it work with:

{
  "filtered": {
    "filter": {
      "script": {
        "script": "doc['currentPatchSet.parents'].values.size() == 1"
      }
    }
  }
}

Hope that helps.

e.

On Monday, July 28, 2014 10:36:05 AM UTC-6, Eric Brunson wrote:
>
> I found what I think should work in a script filter, but I get an access 
> exception trying to use it.
>
> Adding the following filter:
>
> {
>   "filtered": {
> "filter": {
>   "script": {
> "lang": "mvel",
> "script": "doc['currentPatchSet.parents'].values.length < 
> param1",
> "params": {
>   "param1": 2
> }
>   }
> }
>   }
> }
>
>
> Ends up with the following error(s) at the bottom of the traceback:
>
> {
>"error": "SearchPhaseExecutionException[Failed to execute phase 
> [query], all shards failed; shardFailures 
> {[yCmFfug8TdK15SxsAUSrww][gerrit_v2][0]: 
> QueryPhaseExecutionException[[gerrit_v2][0]: query[filtered(+status:merged 
> +ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
> +ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
> param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
> execute main query]]; nested: 
> IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
>  
> }{[in46F5VZQLCoUF0NyBv_Kg][gerrit_v2][1]: RemoteTransportException[[En 
> Sabah Nur][inet[/10.226.73.179:9300]][search/phase/query]]; nested: 
> QueryPhaseExecutionException[[gerrit_v2][1]: query[filtered(+status:merged 
> +ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
> +ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
> param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
> execute main query]]; nested: 
> IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
>  
> }{[gEShLR_-SnK2d7RiQaaMjA][gerrit_v2][3]: 
> RemoteTransportException[[Bounty][inet[/10.226.73.178:9300]][search/phase/query]];
>  
> nested: QueryPhaseExecutionException[[gerrit_v2][3]: 
> query[filtered(+status:merged 
> +ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
> +ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
> param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
> execute main query]]; nested: 
> IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
>  
> }{[o6oUK9rhRSinAyFDaAni5g][gerrit_v2][2]: 
> RemoteTransportException[[Jolt][inet[/10.226.73.177:9300]][search/phase/query]];
>  
> nested: QueryPhaseExecutionException[[gerrit_v2][2]: 
> query[filtered(+status:merged 
> +ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
> +ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
> param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
> execute main query]]; nested: 
> IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
>  
> }{[in46F5VZQLCoUF0NyBv_Kg][gerrit_v2][4]: RemoteTransportException[[En 
> Sabah Nur][inet[/10.226.73.179:9300]][search/phase/query]]; nested: 
> QueryPhaseExecutionException[[gerrit_v2][4]: query[filtered(+status:merged 
> +ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
> +ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
> param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
> execute main query]]; nested: 
> *IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1]*;
>  
> }]",
>"status": 500
> }
>
>
> Is that some sort of typing I can get around?
>
> Thanks for any help.
>
> Sincerely,
> e.
>
>
> On Friday, July 25, 2014 4:03:45 PM UTC-6, Eric Brunson wrote:
>>
>> I have a doc type which includes a field that is a list of strings.  I'd 
>> like to query/filter based on the number of items in the list, either 
>> exactly equal to n or greater than/less than.  Is that possible?  I haven't 
>> found anything in the Query DSL that seems to lend itself to that.
>>
>> Thanks!
>>
>>



Re: Search by cardinality of a field

2014-07-28 Thread Eric Brunson
I found what I think should work in a script filter, but I get an access 
exception trying to use it.

Adding the following filter:

{
  "filtered": {
    "filter": {
      "script": {
        "lang": "mvel",
        "script": "doc['currentPatchSet.parents'].values.length < param1",
        "params": {
          "param1": 2
        }
      }
    }
  }
}


Ends up with the following error(s) at the bottom of the traceback:

{
   "error": "SearchPhaseExecutionException[Failed to execute phase [query], 
all shards failed; shardFailures {[yCmFfug8TdK15SxsAUSrww][gerrit_v2][0]: 
QueryPhaseExecutionException[[gerrit_v2][0]: query[filtered(+status:merged 
+ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
+ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
execute main query]]; nested: 
IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
 
}{[in46F5VZQLCoUF0NyBv_Kg][gerrit_v2][1]: RemoteTransportException[[En 
Sabah Nur][inet[/10.226.73.179:9300]][search/phase/query]]; nested: 
QueryPhaseExecutionException[[gerrit_v2][1]: query[filtered(+status:merged 
+ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
+ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
execute main query]]; nested: 
IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
 
}{[gEShLR_-SnK2d7RiQaaMjA][gerrit_v2][3]: 
RemoteTransportException[[Bounty][inet[/10.226.73.178:9300]][search/phase/query]];
 
nested: QueryPhaseExecutionException[[gerrit_v2][3]: 
query[filtered(+status:merged 
+ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
+ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
execute main query]]; nested: 
IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
 
}{[o6oUK9rhRSinAyFDaAni5g][gerrit_v2][2]: 
RemoteTransportException[[Jolt][inet[/10.226.73.177:9300]][search/phase/query]];
 
nested: QueryPhaseExecutionException[[gerrit_v2][2]: 
query[filtered(+status:merged 
+ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
+ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
execute main query]]; nested: 
IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
 
}{[in46F5VZQLCoUF0NyBv_Kg][gerrit_v2][4]: RemoteTransportException[[En 
Sabah Nur][inet[/10.226.73.179:9300]][search/phase/query]]; nested: 
QueryPhaseExecutionException[[gerrit_v2][4]: query[filtered(+status:merged 
+ConstantScore(cache(BooleanFilter(currentPatchSet.parents:[* TO *]))) 
+ConstantScore(ScriptFilter(doc['currentPatchSet.parents'].values.length < 
param1)))->cache(_type:changes)],from[0],size[10]: Query Failed [Failed to 
execute main query]]; nested: 
IllegalAccessError[org/elasticsearch/index/fielddata/ScriptDocValues$Strings$1];
 
}]",
   "status": 500
}


Is that some sort of typing I can get around?

Thanks for any help.

Sincerely,
e.


On Friday, July 25, 2014 4:03:45 PM UTC-6, Eric Brunson wrote:
>
> I have a doc type which includes a field that is a list of strings.  I'd 
> like to query/filter based on the number of items in the list, either 
> exactly equal to n or greater than/less than.  Is that possible?  I haven't 
> found anything in the Query DSL that seems to lend itself to that.
>
> Thanks!
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/de428aca-b17f-4eff-b9c4-0653ee261301%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Is this link still applicable on EC2? http://www.elasticsearch.org/tutorials/elasticsearch-on-ec2/

2014-07-25 Thread Eric Jain
The "S3 Gateway" has been dropped, so you'll either need to use EBS, or set 
up some mechanism to do snapshots to S3. Other than that, no major changes.
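
If you go the snapshot route, registering an S3 repository and taking a snapshot looks roughly like the following (a sketch, assuming the AWS cloud plugin is installed; the repository and bucket names here are made up):

```
PUT /_snapshot/my_s3_repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-es-snapshots",
    "region": "us-west-2"
  }
}

PUT /_snapshot/my_s3_repo/snapshot_1?wait_for_completion=true
```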

On Thursday, July 24, 2014 10:12:25 PM UTC-7, vjbangis wrote:
>
> Hi,
>
> Is this link still applicable on EC2? It dates from August 2011, when the 
> ES release was 0.19, but I've been using it as a guideline. 
> (http://www.elasticsearch.org/tutorials/elasticsearch-on-ec2/)
>
>



Register snapshot repositories via config file?

2014-07-25 Thread Eric Jain
Would it make sense to allow snapshot repositories to be registered via the 
config file?

The docs have an example, but it's for running the tests only.

repositories:
    s3:
        bucket: "bucket_name"
        region: "us-west-2"
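
As far as I can tell, the config-file form above only applies to the test fixtures; the supported way to register a repository is the snapshot API. The equivalent request (assuming the AWS cloud plugin is installed; the repository name is made up) would be:

```
PUT /_snapshot/my_backup
{
  "type": "s3",
  "settings": {
    "bucket": "bucket_name",
    "region": "us-west-2"
  }
}
```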




Search by cardinality of a field

2014-07-25 Thread Eric Brunson
I have a doc type which includes a field that is a list of strings.  I'd 
like to query/filter based on the number of items in the list, either 
exactly equal to n or greater than/less than.  Is that possible?  I haven't 
found anything in the Query DSL that seems to lend itself to that.

Thanks!
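
For what it's worth, one way to filter on list length (a sketch; it requires dynamic scripting to be enabled, and the index and field names here are hypothetical) is a script filter over the field's doc values:

```
POST /myindex/_search
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "script": {
          "script": "doc['tags'].values.length >= min_count",
          "params": { "min_count": 3 }
        }
      }
    }
  }
}
```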



Re: ES v1.1 continuous young gc pauses old gc, stops the world when old gc happens and splits cluster

2014-06-18 Thread Eric Brandes
I'd just like to chime in with a "me too".  Is the answer just more nodes?  
In my case this is happening every week or so.

On Monday, April 21, 2014 9:04:33 PM UTC-5, Brian Flad wrote:
>
> My dataset currently is 100GB across a few "daily" indices (~5-6GB and 15 
> shards each). Data nodes are 12 CPU, 12GB RAM (6GB heap).
>
>
> On Mon, Apr 21, 2014 at 6:33 PM, Mark Walkom  > wrote:
>
> How big are your data sets? How big are your nodes?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 22 April 2014 00:32, Brian Flad > 
> wrote:
>
> We're seeing the same behavior with 1.1.1, JDK 7u55, 3 master nodes (2 min 
> master), and 5 data nodes. Interestingly, we see the repeated young GCs 
> only on a node or two at a time. Cluster operations (such as recovering 
> unassigned shards) grinds to a halt. After restarting a GCing node, 
> everything returns to normal operation in the cluster.
>
> Brian F
>
>
> On Wed, Apr 16, 2014 at 8:00 PM, Mark Walkom  > wrote:
>
> In both your instances, if you can, have 3 master eligible nodes as it 
> will reduce the likelihood of a split cluster as you will always have a 
> majority quorum. Also look at discovery.zen.minimum_master_nodes to go with 
> that.
> However you may just be reaching the limit of your nodes, which means the 
> best option is to add another node (which also neatly solves your split 
> brain!).
>
> Ankush it would help if you can update java, most people recommend u25 but 
> we run u51 with no problems.
>
>
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 17 April 2014 07:31, Dominiek ter Heide  > wrote:
>
> We are seeing the same issue here. 
>
> Our environment:
>
> - 2 nodes
> - 30GB Heap allocated to ES
> - ~140GB of data
> - 639 indices, 10 shards per index
> - ~48M documents
>
> After starting ES everything is good, but after a couple of hours we see 
> the Heap build up towards 96% on one node and 80% on the other. We then see 
> the GC take very long on the 96% node:
>
> TOuKgmlzaVaFVA][elasticsearch1.trend1.bottlenose.com][inet[/192.99.45.125:
> 9300]]])
>
> [2014-04-16 12:04:27,845][INFO ][discovery] 
> [elasticsearch2.trend1] trend1/I3EHG_XjSayz2OsHyZpeZA
>
> [2014-04-16 12:04:27,850][INFO ][http ] [
> elasticsearch2.trend1] bound_address {inet[/0.0.0.0:9200]}, 
> publish_address {inet[/192.99.45.126:9200]}
>
> [2014-04-16 12:04:27,851][INFO ][node ] 
> [elasticsearch2.trend1] started
>
> [2014-04-16 12:04:32,669][INFO ][indices.store] 
> [elasticsearch2.trend1] updating indices.store.throttle.max_bytes_per_sec 
> from [20mb] to [1gb], note, type is [MERGE]
>
> [2014-04-16 12:04:32,669][INFO ][cluster.routing.allocation.decider] 
> [elasticsearch2.trend1] updating 
> [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] 
> to [50]
>
> [2014-04-16 12:04:32,670][INFO ][indices.recovery ] 
> [elasticsearch2.trend1] updating [indices.recovery.max_bytes_per_sec] from 
> [200mb] to [2gb]
>
> [2014-04-16 12:04:32,670][INFO ][cluster.routing.allocation.decider] 
> [elasticsearch2.trend1] updating 
> [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] 
> to [50]
>
> [2014-04-16 12:04:32,670][INFO ][cluster.routing.allocation.decider] 
> [elasticsearch2.trend1] updating 
> [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] 
> to [50]
>
> [2014-04-16 15:25:21,409][WARN ][monitor.jvm  ] 
> [elasticsearch2.trend1] [gc][old][11876][106] duration [1.1m], 
> collections [1]/[1.1m], total [1.1m]/[1.4m], memory [28.7gb]->[22gb]/[
> 29.9gb], all_pools {[young] [67.9mb]->[268.9mb]/[665.6mb]}{[survivor] [
> 60.5mb]->[0b]/[83.1mb]}{[old] [28.6gb]->[21.8gb]/[29.1gb]}
>
> [2014-04-16 16:02:32,523][WARN ][monitor.jvm  ] [
> elasticsearch2.trend1] [gc][old][13996][144] duration [1.4m], collections 
> [1]/[1.4m], total [1.4m]/[3m], memory [28.8gb]->[23.5gb]/[29.9gb], 
> all_pools {[young] [21.8mb]->[238.2mb]/[665.6mb]}{[survivor] [82.4mb]->[0b
> ]/[83.1mb]}{[old] [28.7gb]->[23.3gb]/[29.1gb]}
>
> [2014-04-16 16:14:12,386][WARN ][monitor.jvm  ] [
> elasticsearch2.trend1] [gc][old][14603][155] duration [1.3m], collections 
> [2]/[1.3m], total [1.3m]/[4.4m], memory [29.2gb]->[23.9gb]/[29.9gb], 
> all_pools {[young] [289mb]->[161.3mb]/[665.6mb]}{[survivor] [58.3mb]->[0b
> ]/[83.1mb]}{[old] [28.8gb]->[23.8gb]/[29.1gb]}
>
> [2014-04-16 16:17:55,480][WARN ][monitor.jvm  ] [
> elasticsearch2.trend1] [gc][old][14745][158] duration [1.3m], collections 
> [1]/[1.3m], total [1.3m]/[5.7m], memory [29.7gb]->[24.1gb]/[29.9gb], 
> all_pools {[young] [633.8mb]->[149.7mb]/[665.6mb]}{[survivor] [68.6mb]->[
> 0b]/[83.1mb]}{[old] [29gb]->[24gb]/[29.1gb]}
>
> [2014-04-16 16:21:17,950][WARN ][monitor.

Re: Replica node

2014-06-12 Thread Eric Cornelius
We're having this same problem, and it looks like the 
index-modules-allocation doesn't actually provide any mechanism to 
accomplish this request.

The primary use-case on our end is that we have a two-tiered cluster with a 
set of indexers and a set of search nodes.  We'd like to zone primaries to 
the indexing zone and use asynchronous replication to the search zone. 
 Sadly, the shard allocation filtering doesn't seem to provide any 
mechanism to directly enforce this constraint.  

The best we've been able to come up with so far, is to use a forced 
awareness attribute and initially zone a new time bin to the indexing zone. 
 This causes ES to allocate primaries in the indexing zone with no initial 
replicas.  After creation, if we then go back and include the search zone 
for the index (after the primaries are allocated) - replicas are created 
there and everything works as expected.

This strikes me as both a very useful feature for fine-grained hardware 
control over different operations, and an extremely hacky workaround for the 
present limitations.
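
For the record, the forced-awareness workaround described above can be sketched roughly as follows (the attribute value names and index name are made up; this describes our sequencing, not a documented pattern):

```
# elasticsearch.yml: tag each node with its tier
node.zone: indexing          # "search" on the search-tier nodes

# force allocation awareness across the two zones
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: indexing,search

# at index creation, restrict allocation to the indexing zone
PUT /myindex/_settings
{ "index.routing.allocation.include.zone": "indexing" }

# once primaries are allocated, widen so replicas go to the search zone
PUT /myindex/_settings
{ "index.routing.allocation.include.zone": "indexing,search" }
```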

On Thursday, June 12, 2014 12:58:07 AM UTC-4, Mark Walkom wrote:
>
> You can force it using this sort of process - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html
>
> Though unless you have a good reason, it's best to just let ES do its own 
> thing.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>  
>
> On 12 June 2014 14:50, Tommi Lätti > 
> wrote:
>
>> Hi,
>>
>> Is it possible to configure the ES so that a single node will always get 
>> the replica shards assigned? When I was in a single-node configuration I 
>> just upped the number of replicas for every index to 1 and brought a 
>> data-only node to the cluster and of course the replicas all got created on 
>> that single node.
>>
>> But since the indexes rotate every night, today I discovered that the next 
>> index has its primary shards on this second server, which is not exactly 
>> what I'd like to see...
>>
>>
>
>



Re: Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-30 Thread Eric Brandes
The three nodes are connected by an Azure virtual network. They are all 
part of a single cloud service, operating in a load balanced set.  I am not 
currently using any kind of FQDN, so the unicast host names are 
"es-machine-1", "es-machine-2" etc. No domain suffix whatsoever.  As far as 
I know that bypasses the public load balancer (since none of those 
hostnames are publicly accessible to machines outside the virtual 
network).  But I've been wrong before :)  I actually can't find any kind of 
fully qualified domain name for those machines, other than the public 
facing cloudapp.net one, so I assume this is OK?  I've also tried using the 
internal virtual network IP addresses on a similarly specced development 
cluster, and I see the same timeouts there.
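
For reference, the unicast setup described corresponds to roughly this in elasticsearch.yml (hostnames as in the post):

```
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-machine-1", "es-machine-2", "es-machine-3"]
```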

On Friday, May 30, 2014 1:40:47 AM UTC-5, Michael Delaney wrote:
>
> Are you using internal fully qualified domain names, e.g. 
> es01.myelasticsearcservice.f3.internal.net 
> If you use public load balancer end points you'll get timeouts. 



Re: Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread Eric Brandes
I'm using the unicast list of nodes at the moment. I have multicast turned 
off as well.  I have not changed the default ping timeout or anything.
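
If the disconnects do turn out to be ping timeouts, the zen fault-detection settings are the relevant knobs; a sketch of loosening them in elasticsearch.yml (the values are illustrative, not recommendations):

```
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 6
```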

On Thursday, May 29, 2014 7:37:38 PM UTC-5, David Pilato wrote:
>
> Just checking: are you using azure cloud plugin or unicast list of nodes?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
>
> Le 30 mai 2014 à 02:12, Eric Brandes > 
> a écrit :
>
> I have a 3 node cluster running ES 1.0.1 in Azure.  They're Windows VMs 
> with 7GB of RAM.  The JVM heap size is allocated at 4GB per node.  There is 
> a single index in the cluster with 50 shards and 1 replica.  The total 
> number of documents on primary shards is 29 million with a store size of 
> 60gb (including replicas).
>
> Almost every day now I get a random node disconnecting from the cluster.  
> The usual suspect is a ping timeout.  The longest GC in the logs is about 1 
> sec, and the boxes don't look resource constrained really at all. CPU never 
> goes above 20%. The used JVM heap size never goes above 6gb (the total on 
> the cluster is 12gb) and the field data cache never gets over 1gb.  The 
> node that drops out is different every day.  I have 
> minimum_number_master_nodes set so there's not any kind of split brain 
> scenario, but there are times where the disconnected node NEVER rejoins 
> until I bounce the process.
>
> Has anyone seen this before?  Is it an Azure networking issue?  How can I 
> tell?  If it's resource problems, what's the best way for me to turn on 
> logging to diagnose them?  What else can I tell you or what other steps can 
> I take to figure this out?  It's really quite maddening :(
>
>



Random node disconnects in Azure, no resource issues as near as I can tell

2014-05-29 Thread Eric Brandes
I have a 3 node cluster running ES 1.0.1 in Azure.  They're Windows VMs 
with 7GB of RAM.  The JVM heap size is allocated at 4GB per node.  There is 
a single index in the cluster with 50 shards and 1 replica.  The total 
number of documents on primary shards is 29 million with a store size of 
60gb (including replicas).

Almost every day now I get a random node disconnecting from the cluster.  
The usual suspect is a ping timeout.  The longest GC in the logs is about 1 
sec, and the boxes don't look resource constrained really at all. CPU never 
goes above 20%. The used JVM heap size never goes above 6gb (the total on 
the cluster is 12gb) and the field data cache never gets over 1gb.  The 
node that drops out is different every day.  I have 
minimum_number_master_nodes set so there's not any kind of split brain 
scenario, but there are times where the disconnected node NEVER rejoins 
until I bounce the process.

Has anyone seen this before?  Is it an Azure networking issue?  How can I 
tell?  If it's resource problems, what's the best way for me to turn on 
logging to diagnose them?  What else can I tell you or what other steps can 
I take to figure this out?  It's really quite maddening :(
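
To get more diagnostic detail (a sketch, assuming the stock logging.yml), the JVM monitor logger can be turned up so even short GC pauses get logged:

```
# logging.yml
logger:
  monitor.jvm: DEBUG
```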



jdbc river nested json object - split on delimiters!

2014-05-02 Thread Eric Sims
so, i've got a table structure in sql for my 'movies' db that has some of 
the columns with comma-delimited data.

for example:

  *sql server table pseudo-structure*: [column : row data]
   movieid: 1, actors: 'Mel Gibson, Danny Glover', genre: 'action', etc.

  *simplified json mapping*:
  {
   "movies": {
  "_id" : {
"path" : "movieid"
  },
  "properties": {
"movieid": {"type": "string"},
"actors": {
   "properties": {
  "name": {"type": "string"},
   }
}, 
   "genre": {"type": "string"},   
}
 }
  }

my question is, how would i split those actors into separate arrays of 
actors in the json object using jdbc-rivers?

i could write a .net program that generates a bulk file for api (or use a 
.net client for elasticsearch that does the puts), but i want to do it 
using rivers so that the table can be monitored for changes.

please help.
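
One approach, if the data can be normalized on the SQL side so each actor arrives as its own row (the `movie_actors` join table here is hypothetical), is to rely on the jdbc river folding rows that share the same `_id` into arrays:

```
"sql": "select m.movieid as _id, a.actor as \"actors.name\", m.genre from movies m join movie_actors a on a.movieid = m.movieid"
```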




Re: faceted results from delimited string

2014-05-01 Thread Eric Sims
i like where you are going with that. however, at least initially, i am 
using jdbc-river to get my data from sql into elasticsearch. i'm unclear on 
how i would create that array from a column in my sql db.
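
A sketch of the tokenizer approach (splitting on " | " at index time, which also keeps the stray spaces out of the facet terms; the index, type, and analyzer names are made up):

```
PUT /music
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "pipe": { "type": "pattern", "pattern": "\\s*\\|\\s*" }
      },
      "analyzer": {
        "pipe_split": { "type": "custom", "tokenizer": "pipe" }
      }
    }
  },
  "mappings": {
    "album": {
      "properties": {
        "genres": { "type": "string", "analyzer": "pipe_split" }
      }
    }
  }
}
```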

On Thursday, May 1, 2014 4:25:10 PM UTC-4, Adrien Grand wrote:
>
> Hi Eric,
>
> Wouldn't it be easier to do this on client-side and to provide 
> Elasticsearch with the tags in an array?
>
> Otherwise, you should be able to achieve this behavior thanks to the 
> pattern tokenizer[1].
>
> [1] 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
>
>
> On Thu, May 1, 2014 at 9:56 PM, Eric Sims 
> > wrote:
>
>> so i've got some data in a field (originally from my sql db) that is a 
>> delimited string.
>>
>> in the example of music. i've got a field of genres associated to an 
>> album.
>>
>> a document may look something like : 'genres': 'Rock | Pop | Alternative'.
>>
>> i want to get a faceted result with those values being split up.
>>
>> something like:
>>
>> "facets": {
>>   "tags": {
>>  "_type": "terms",
>>  "missing": 841,
>>  "total": 159,
>>  "other": 3,
>>  "terms": [
>> {
>>"term": "Rock",
>>"count": 89
>> },
>> {
>>    "term": " Pop",
>>"count": 42
>> },
>> {
>>"term": " Alternative",
>>"count": 16
>> }
>>  ]
>>   }
>>}
>>  
>>
>
>
>
> -- 
> Adrien Grand
>  



faceted results from delimited string

2014-05-01 Thread Eric Sims
so i've got some data in a field (originally from my sql db) that is a 
delimited string.

in the example of music. i've got a field of genres associated to an album.

a document may look something like : 'genres': 'Rock | Pop | Alternative'.

i want to get a faceted result with those values being split up.

something like:

"facets": {
  "tags": {
 "_type": "terms",
 "missing": 841,
 "total": 159,
 "other": 3,
 "terms": [
{
   "term": "Rock",
   "count": 89
},
{
   "term": " Pop",
   "count": 42
},
{
   "term": " Alternative",
   "count": 16
}
 ]
  }
   }



Re: help with jdbc rivers and type mapping

2014-05-01 Thread Eric Sims
that worked! awesome!!!

yes, the current documentation still leaves a LOT to be desired. small 
snippets of code are there in the documentation, but i would never have 
known to put that _id outside of the properties declaration! full examples 
would be much more helpful. it also needs to be more step-based (step 1, do 
this; step 2, do that). too much is assumed.

one more thing i noticed was that the naming of the properties is 
case-sensitive to how they appear in the db. for example, if my db column says 
AlbumID and i create a mapping called albumid, then run a 'select * from 
myalbumtable' in the jdbc put statement, it would create two json mappings: 
AlbumID and albumid. only one of which would be populated with the data.

learning!

thanks again for your help and maybe this will help others.


On Thursday, May 1, 2014 3:57:45 AM UTC-4, Jörg Prante wrote:
>
> The _id redirecting is a special feature (I was not aware of this!)
>
> Please use something like this
>
> PUT /myindex/album/_mapping
> {
>"album": {
>  "_id" : {
>  "path" : "AlbumID"
>  },
>  "properties": {
>"AlbumDescription": {"type": "string"},
>"AlbumID": {"type": "string"},
>"Artist": {"type": "string"},
>"Genre": {"type": "string","index" : "not_analyzed"},
>"Label": {"type": "string"},
>"Title": {"type": "string"}
> }
>    }
> }
>
> Jörg
>
>
>
> On Wed, Apr 30, 2014 at 11:57 PM, Eric Sims 
> 
> > wrote:
>
>> i'm able to get the mappings working as you suggested. However, the 
>> custom _id mapping is not working.
>>
>> it's still generating a dynamic _id.
>>
>> any ideas?
>>
>>
>> On Wednesday, April 30, 2014 5:07:57 PM UTC-4, Eric Sims wrote:
>>>
>>> should i keep the mapping in the original PUT /_river/mytest_river/_meta 
>>> statement or removing in lieu of the other separate mapping statement?
>>>
>>> because i tried what you just suggested and it didn't seem to make a 
>>> difference with having removed the mapping statement within the river.
>>>
>>> On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:
>>>>
>>>> i can't seem to understand how to fully set up my type mappings while 
>>>> using jdbc rivers and sql server.
>>>>
>>>> here's an example.
>>>>
>>>> PUT /_river/mytest_river/_meta
>>>> {
>>>> "type": "jdbc",
>>>> "jdbc": {
>>>>   "url":"jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
>>>>   "user":"myuser",
>>>>   "password":"xxx",
>>>>   "sql":"select * from dbo.musicalbum (nolock)",
>>>>   "strategy" : "oneshot",
>>>>   "index" : "myindex",
>>>>   "type" : "album",
>>>>   "bulk_size" : 100,
>>>>   "max_retries": 5,
>>>>   "max_retries_wait":"30s",
>>>>   "max_bulk_requests" : 5,
>>>>   "bulk_flush_interval" : "5s",
>>>>   "type_mapping": {
>>>>   "album": {"properties": {
>>>>"AlbumDescription": {"type": "string"},
>>>>"AlbumID": {"type": "string"},
>>>>"Artist": {"type": "string"},
>>>>"Genre": {"type": "string","index" : "not_analyzed"},
>>>>"Label": {"type": "string"},
>>>>"Title": {"type": "string"},
>>>>"_id" : {"path" : "AlbumID"}
>>>> }
>>>>   }
>>>>}
>>>> }
>>>> }
>>>>
>>>> so you can see i've specified both a select statement (which normally 
>>>> would dynamical

Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
i'm able to get the mappings working as you suggested. However, the custom 
_id mapping is not working.

it's still generating a dynamic _id.

any ideas?


On Wednesday, April 30, 2014 5:07:57 PM UTC-4, Eric Sims wrote:
>
> should i keep the mapping in the original PUT /_river/mytest_river/_meta 
> statement or removing in lieu of the other separate mapping statement?
>
> because i tried what you just suggested and it didn't seem to make a 
> difference with having removed the mapping statement within the river.
>
> On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:
>>
>> i can't seem to understand how to fully set up my type mappings while 
>> using jdbc rivers and sql server.
>>
>> here's an example.
>>
>> PUT /_river/mytest_river/_meta
>> {
>> "type": "jdbc",
>> "jdbc": {
>>   "url":"jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
>>   "user":"myuser",
>>   "password":"xxx",
>>   "sql":"select * from dbo.musicalbum (nolock)",
>>   "strategy" : "oneshot",
>>   "index" : "myindex",
>>   "type" : "album",
>>   "bulk_size" : 100,
>>   "max_retries": 5,
>>   "max_retries_wait":"30s",
>>   "max_bulk_requests" : 5,
>>   "bulk_flush_interval" : "5s",
>>   "type_mapping": {
>>   "album": {"properties": {
>>"AlbumDescription": {"type": "string"},
>>"AlbumID": {"type": "string"},
>>"Artist": {"type": "string"},
>>"Genre": {"type": "string","index" : "not_analyzed"},
>>"Label": {"type": "string"},
>>"Title": {"type": "string"},
>>"_id" : {"path" : "AlbumID"}
>> }
>>   }
>>}
>> }
>> }
>>
>> so you can see i've specified both a select statement (which normally 
>> would dynamically produce the mapping for me) and also a type mapping. in 
>> the type mapping i've tried to specify that i want the _id to be the same 
>> as AlbumID, and also that i want the Genre to be not_analyzed. it ends up 
>> throwing multiple errors, only indexing one document, and not creating my 
>> full mapping.
>>
>> here's what the mapping ends up looking like: (skipping some of the 
>> columns altogether!)
>>
>> {
>>"myindex": {
>>   "mappings": {
>>  "album": {
>> "properties": {
>>"AlbumDescription": {
>>   "type": "string"
>>},
>>"AlbumID": {
>>   "type": "string"
>>},
>>"Artist": {
>>   "type": "string"
>>},
>>"Genre": {
>>   "type": "string"
>>},
>>"Title": {
>>   "type": "string"
>>}
>> }
>>  }
>>   }
>>}
>> }
>>
>> any assistance would be helpful. it's driving me nuts.
>>
>



Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
forget that part - i didn't need the 
"myindex" : { "mappings" : { 
wrapper.

the other issue still stands though.

On Wednesday, April 30, 2014 5:17:00 PM UTC-4, Eric Sims wrote:
>
> here's another weird bit. it doesn't seem to show the mappings right after 
> i set them:
>
> PUT /myindex/album/_mapping
> {
>   "myindex": {
> "mappings": {
>"album": {
>   "properties": {
>"albumdescription": {"type": "string"},
>"albumid": {"type": "string"},
>"artist": {"type": "string"},
>"genre": {"type": "string", "index" : "not_analyzed"},
>"label": {"type": "string", "analyzer": "whitespace"},
>"title": {"type": "string"},
>"time": {"type" : "string"},
>"_id" : {
> "index_name" : "album.AlbumID", 
> "path" : "full", 
> "type" : "string"
>}
> }
>}
> }
>   }
> }
>
>
> GET /myindex/album/_mapping
>
> returns this:
>
> {
>"myindex": {
>   "mappings": {
>  "album": {
> "properties": {}
>  }
>   }
>}
> }
>



Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
here's another weird bit. it doesn't seem to show the mappings right after 
i set them:

PUT /myindex/album/_mapping
{
  "myindex": {
"mappings": {
   "album": {
  "properties": {
   "albumdescription": {"type": "string"},
   "albumid": {"type": "string"},
   "artist": {"type": "string"},
   "genre": {"type": "string", "index" : "not_analyzed"},
   "label": {"type": "string", "analyzer": "whitespace"},
   "title": {"type": "string"},
   "time": {"type" : "string"},
   "_id" : {
"index_name" : "album.AlbumID", 
"path" : "full", 
"type" : "string"
   }
}
   }
}
  }
}


GET /myindex/album/_mapping

returns this:

{
   "myindex": {
  "mappings": {
 "album": {
"properties": {}
 }
  }
   }
}



Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
should i keep the mapping in the original PUT /_river/mytest_river/_meta 
statement or removing in lieu of the other separate mapping statement?

because i tried what you just suggested and it didn't seem to make a 
difference with having removed the mapping statement within the river.

On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:
>
> i can't seem to understand how to fully set up my type mappings while 
> using jdbc rivers and sql server.
>
> here's an example.
>
> PUT /_river/mytest_river/_meta
> {
> "type": "jdbc",
> "jdbc": {
>   "url":"jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
>   "user":"myuser",
>   "password":"xxx",
>   "sql":"select * from dbo.musicalbum (nolock)",
>   "strategy" : "oneshot",
>   "index" : "myindex",
>   "type" : "album",
>   "bulk_size" : 100,
>   "max_retries": 5,
>   "max_retries_wait":"30s",
>   "max_bulk_requests" : 5,
>   "bulk_flush_interval" : "5s",
>   "type_mapping": {
>   "album": {"properties": {
>"AlbumDescription": {"type": "string"},
>"AlbumID": {"type": "string"},
>"Artist": {"type": "string"},
>"Genre": {"type": "string","index" : "not_analyzed"},
>"Label": {"type": "string"},
>"Title": {"type": "string"},
>"_id" : {"path" : "AlbumID"}
> }
>   }
>}
> }
> }
>
> so you can see i've specified both a select statement (which normally 
> would dynamically produce the mapping for me) and also a type mapping. in 
> the type mapping i've tried to specify that i want the _id to be the same 
> as AlbumID, and also that i want the Genre to be not_analyzed. it ends up 
> throwing multiple errors, only indexing one document, and not creating my 
> full mapping.
>
> here's what the mapping ends up looking like: (skipping some of the 
> columns altogether!)
>
> {
>"myindex": {
>   "mappings": {
>  "album": {
> "properties": {
>"AlbumDescription": {
>   "type": "string"
>},
>"AlbumID": {
>   "type": "string"
>},
>"Artist": {
>   "type": "string"
>},
>"Genre": {
>   "type": "string"
>},
>"Title": {
>   "type": "string"
>}
> }
>  }
>   }
>}
> }
>
> any assistance would be helpful. it's driving me nuts.
>



Re: help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
No. I just tried deleting all indexes, then I did:

PUT /myindex

then 

PUT /myindex/album/_mapping
{
  "myindex": {
"mappings": {
   "album": {
  "properties": {
   "AlbumDescription": {"type": "string"},
   "AlbumID": {"type": "string"},
   "Artist": {"type": "string"},
   "Genre": {"type": "string","index" : "not_analyzed"},
   "Label": {"type": "string"},
   "Title": {"type": "string"},
   "_id" : {"path" : "AlbumID"}
}
   }
}
  }
}

Then I ran the PUT statement from my previous post.

It still treats the mappings as dynamic.

On Wednesday, April 30, 2014 3:56:22 PM UTC-4, Eric Sims wrote:
>
> i can't seem to understand how to fully set up my type mappings while 
> using jdbc rivers and sql server.
>
> here's an example.
>
> PUT /_river/mytest_river/_meta
> {
> "type": "jdbc",
> "jdbc": {
>   "url":"jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
>   "user":"myuser",
>   "password":"xxx",
>   "sql":"select * from dbo.musicalbum (nolock)",
>   "strategy" : "oneshot",
>   "index" : "myindex",
>   "type" : "album",
>   "bulk_size" : 100,
>   "max_retries": 5,
>   "max_retries_wait":"30s",
>   "max_bulk_requests" : 5,
>   "bulk_flush_interval" : "5s",
>   "type_mapping": {
>   "album": {"properties": {
>"AlbumDescription": {"type": "string"},
>"AlbumID": {"type": "string"},
>"Artist": {"type": "string"},
>"Genre": {"type": "string","index" : "not_analyzed"},
>"Label": {"type": "string"},
>"Title": {"type": "string"},
>"_id" : {"path" : "AlbumID"}
> }
>   }
>}
> }
> }
>
> so you can see i've specified both a select statement (which normally 
> would dynamically produce the mapping for me) and also a type mapping. in 
> the type mapping i've tried to specify that i want the _id to be the same 
> as AlbumID, and also that i want the Genre to be not_analyzed. it ends up 
> throwing multiple errors, only indexing one document, and not creating my 
> full mapping.
>
> here's what the mapping ends up looking like: (skipping some of the 
> columns altogether!)
>
> {
>"myindex": {
>   "mappings": {
>  "album": {
> "properties": {
>"AlbumDescription": {
>   "type": "string"
>},
>"AlbumID": {
>   "type": "string"
>},
>"Artist": {
>   "type": "string"
>},
>"Genre": {
>   "type": "string"
>},
>"Title": {
>   "type": "string"
>}
> }
>  }
>   }
>}
> }
>
> any assistance would be helpful. it's driving me nuts.
>



help with jdbc rivers and type mapping

2014-04-30 Thread Eric Sims
I can't seem to understand how to fully set up my type mappings while using 
JDBC rivers and SQL Server.

Here's an example.

PUT /_river/mytest_river/_meta
{
"type": "jdbc",
"jdbc": {
  "url":"jdbc:sqlserver://mydbserver:1433;databaseName=mydatabase",
  "user":"myuser",
  "password":"xxx",
  "sql":"select * from dbo.musicalbum (nolock)",
  "strategy" : "oneshot",
  "index" : "myindex",
  "type" : "album",
  "bulk_size" : 100,
  "max_retries": 5,
  "max_retries_wait":"30s",
  "max_bulk_requests" : 5,
  "bulk_flush_interval" : "5s",
  "type_mapping": {
  "album": {"properties": {
   "AlbumDescription": {"type": "string"},
   "AlbumID": {"type": "string"},
   "Artist": {"type": "string"},
   "Genre": {"type": "string","index" : "not_analyzed"},
   "Label": {"type": "string"},
   "Title": {"type": "string"},
   "_id" : {"path" : "AlbumID"}
}
  }
   }
}
}

So you can see I've specified both a select statement (which would normally 
produce the mapping for me dynamically) and also a type mapping. In the 
type mapping I've tried to specify that I want the _id to be the same as 
AlbumID, and also that I want Genre to be not_analyzed. It ends up 
throwing multiple errors, indexing only one document, and not creating my 
full mapping.

Here's what the mapping ends up looking like (it skips some of the columns 
altogether!):

{
   "myindex": {
  "mappings": {
 "album": {
"properties": {
   "AlbumDescription": {
  "type": "string"
   },
   "AlbumID": {
  "type": "string"
   },
   "Artist": {
  "type": "string"
   },
   "Genre": {
  "type": "string"
   },
   "Title": {
  "type": "string"
   }
}
 }
  }
   }
}

Any assistance would be helpful. It's driving me nuts.



trouble with plugins - phonetic, etc

2014-04-28 Thread Eric Sims
Sorry for a noob question. I'm trying to understand phonetic searches: how 
to install and use them.

Perhaps phonetics isn't the right approach for my case.

I'm trying to return music artist results for 'lil wayne', but also account 
for a user typing 'little wayne'.

I've created and populated an index called /music/artist.

I've installed the phonetic plugin, and the config is like so (I created 
another index, since it won't let me add the settings to the existing 
/music/artist index):

PUT /music_admin
{
"settings" : {
"analysis" : {
"analyzer" : {
"my_analyzer" : {
"tokenizer" : "standard",
"filter" : ["standard", "lowercase", "my_metaphone"]
}
},
"filter" : {
"my_metaphone" : {
"type" : "phonetic",
"encoder" : "metaphone",
"replace" : false
}
}
}
}
}

This feels wrong, I know. I'm confused at this point as to how to use the 
search; I have a field called 'artist' that I would be searching in.

Please help!
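
One observation, offered tentatively: a phonetic encoder matches tokens that sound alike ('wayne' vs. 'wane'), but 'lil' and 'little' are different words, so metaphone alone won't bridge them. A synonym token filter is the usual tool for that. A sketch, where the index, analyzer, and filter names are made up:

```json
PUT /music_admin
{
  "settings": {
    "analysis": {
      "analyzer": {
        "artist_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "artist_synonyms"]
        }
      },
      "filter": {
        "artist_synonyms": {
          "type": "synonym",
          "synonyms": ["lil, little"]
        }
      }
    }
  }
}
```

The analyzer would then be attached to the artist field in the mapping, e.g. "artist": {"type": "string", "analyzer": "artist_analyzer"}, so that both spellings index to the same terms.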



Re: elasticsearch 1.1.1 initialization failed

2014-04-18 Thread Eric Jain
This issue has been resolved with cloud-aws 2.1.1:

  https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/74


On Thursday, April 17, 2014 6:32:05 PM UTC-7, Eric Jain wrote:
>
> Just tried to upgrade elasticsearch 1.1.0 to 1.1.1 (with the cloud-aws 
> plugin 2.1.0), and am no longer able to start any nodes:
>
> 2014-04-18 01:19:42,754 [INFO] node - [Skywalker] version[1.1.1], 
> pid[22901], build[f1585f0/2014-04-16T14:27:12Z]
> 2014-04-18 01:19:42,767 [INFO] node - [Skywalker] initializing ...
> 2014-04-18 01:19:42,802 [INFO] plugins - [Skywalker] loaded [cloud-aws], 
> sites []
> 2014-04-18 01:19:50,019 [ERROR] bootstrap - {1.1.1}: Initialization Failed 
> ...
> 1) 
> NoSuchMethodError[org.elasticsearch.gateway.blobstore.BlobStoreGateway.(Lorg/elasticsearch/common/settings/Settings;Lorg/elasticsearch/threadpool/ThreadPool;Lorg/elasticsearch/cluster/ClusterService;)V]
>
> Anyone else see this issue?
>
>



Re: S3 gateway issues

2014-04-17 Thread Eric Jain
The S3 gateway from the cloud-aws 2.1.0 plugin works fine up to 
elasticsearch 1.1.0, but appears to be broken with 1.1.1, see my other post.


On Friday, April 11, 2014 1:01:34 AM UTC-7, David Pilato wrote:
>
> What is the cloud-aws plugin version please?
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 11 avril 2014 à 07:43:49, Ankur Goel (ankr...@gmail.com ) 
> a écrit:
>
> Hi David ,
>
> thanks for replying ,
>
> I am using version
>
> "number" : "1.0.0",
> we have AWS plugin, we have removed S3 gateway for now ,
> will switch to local but just wanted to make sure why we are getting this 
> error, 
> It will be really helpful to avoid any surprises in future. 
>
>
> On Thursday, 10 April 2014 18:11:48 UTC+5:30, Ankur Goel wrote: 
>>
>> hi,
>>
>> I am using s3 gateway in a application , elastic search version 1.x  , I 
>> had a strange exception while starting my nodes , please take a look
>>
>>
>>
>> Error injecting constructor, java.lang.UnsupportedOperationException
>>   at org.elasticsearch.gateway.s3.S3Gateway.(Unknown Source)
>>   while locating org.elasticsearch.gateway.s3.S3Gateway
>>   while locating org.elasticsearch.gateway.Gateway
>> Caused by: java.lang.UnsupportedOperationException
>> at 
>> org.elasticsearch.cluster.metadata.RestoreMetaData$Factory.fromXContent(RestoreMetaData.java:462)
>> at 
>> org.elasticsearch.cluster.metadata.RestoreMetaData$Factory.fromXContent(RestoreMetaData.java:400)
>> at 
>> org.elasticsearch.cluster.metadata.MetaData$Builder.fromXContent(MetaData.java:1323)
>> at 
>> org.elasticsearch.gateway.blobstore.BlobStoreGateway.readMetaData(BlobStoreGateway.java:213)
>> at 
>> org.elasticsearch.gateway.blobstore.BlobStoreGateway.findLatestIndex(BlobStoreGateway.java:198)
>> at 
>> org.elasticsearch.gateway.blobstore.BlobStoreGateway.initialize(BlobStoreGateway.java:73)
>> at 
>> org.elasticsearch.gateway.s3.S3Gateway.(S3Gateway.java:97)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
>> Method)
>> at 
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>> at 
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>> at 
>> org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:54)
>> at 
>> org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
>> at 
>> org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:98)
>> at 
>> org.elasticsearch.common.inject.FactoryProxy.get(FactoryProxy.java:52)
>> at 
>> org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:45)
>> at 
>> org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:837)
>> at 
>> org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:42)
>> at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:57)
>> at 
>> org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:45)
>> at 
>> org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:200)
>> at 
>> org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:193)
>> at 
>> org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:830)
>> at 
>> org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
>> at 
>> org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
>> at 
>> org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
>> at 
>> org.elasticsearch.common.inject.Guice.createInjector(Guice.java:93)
>> at 
>> org.elasticsearch.common.inject.Guice.createInjector(Guice.java:70)
>> at 
>> org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:59)
>> at 
>> org.elasticsearch.node.internal.InternalNode.(InternalNode.java:187)
>> at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:159)
>>
>>
>> I am trying to understand what is happening here ,  the exception looks 
>> like it has happened while trying to recover index data but beyond that but 
>> I cannot get a clue , please help
>>

elasticsearch 1.1.1 initialization failed

2014-04-17 Thread Eric Jain
Just tried to upgrade elasticsearch 1.1.0 to 1.1.1 (with the cloud-aws 
plugin 2.1.0), and am no longer able to start any nodes:

2014-04-18 01:19:42,754 [INFO] node - [Skywalker] version[1.1.1], 
pid[22901], build[f1585f0/2014-04-16T14:27:12Z]
2014-04-18 01:19:42,767 [INFO] node - [Skywalker] initializing ...
2014-04-18 01:19:42,802 [INFO] plugins - [Skywalker] loaded [cloud-aws], 
sites []
2014-04-18 01:19:50,019 [ERROR] bootstrap - {1.1.1}: Initialization Failed 
...
1) 
NoSuchMethodError[org.elasticsearch.gateway.blobstore.BlobStoreGateway.(Lorg/elasticsearch/common/settings/Settings;Lorg/elasticsearch/threadpool/ThreadPool;Lorg/elasticsearch/cluster/ClusterService;)V]

Anyone else see this issue?



Function Score Query and Native scripts

2014-04-12 Thread Eric T
Hi,

The function score documentation doesn't mention any support for native 
scripts. Does it still work with the function score query, and if so, is 
the syntax the same?
I'm using the custom_filters_score query with a native script, but that 
query is deprecated in the latest ES version. I'm still using 0.90.3, but 
I plan to upgrade to the latest version.

The documentation says that the script_score function for function_score is 
cached. Does this provide the same performance as a native script? I'm 
wondering whether it's still necessary to use a native script, or whether I 
should convert it to the script_score function.

thanks
Eric
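
For reference, in 0.90+/1.x the script_score function should accept the same lang parameter used for scripts elsewhere, so pointing it at a native script would look roughly like the sketch below (untested; the script name and parameter are made up):

```json
GET /myindex/_search
{
  "query": {
    "function_score": {
      "query": {"match_all": {}},
      "script_score": {
        "script": "my_native_score",
        "lang": "native",
        "params": {"factor": 1.2}
      }
    }
  }
}
```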



Re: Question about scoring behaviour

2014-03-30 Thread Eric T
I created a new index that includes both the old "autocomplete" multi-field 
and a new multi-field called "autocompletenew" that sets omit_norms: true.

I ran the same query against the two fields; the results are here:
https://gist.github.com/ewltang/33ab829c404130c935ac

The scoring is consistent for both, but the query on the original field 
seems to return results that make more sense to me. For example, 
"PaulJones" is the first result, followed by PaulJones plus one numerical 
digit. The second set of results is more random, with "PaulJones" only 
second; the rest of those results contain longer variations of PaulJones.

I was expecting the query on autocompletenew to return the results that the 
query on the original field returns. I also didn't expect the first query 
to return the results I want, since that multi-field doesn't have 
omit_norms: true. Is this the expected behaviour?



On Friday, March 28, 2014 12:18:02 AM UTC-4, Eric T wrote:
>
> Hi Ivan,
>
> No I don't apply any boost at index time. 
>
> I did not disable norms on the uname.autocomplete field, I will have to 
> get back to you on the result. I'm using 0.90.2.
>
> thanks
> Eric
>
>
> On Thu, Mar 27, 2014 at 8:55 PM, Ivan Brusic  wrote:
>
>> The difference is the fieldNorm. This field holds any boosts (both 
>> document and field level) and any length normalization. It is only 1 byte, 
>> so it is incredibly lossy. Did you apply an index time boost to either the 
>> field or document?
>>
>> Have you tried disabling norms on ngram fields? Which version of 
>> elasticsearch are you using? I noticed you used the old format 
>> "omit_norms":true
>> instead of  
>> "norms": { "enabled": false }
>>
>> -- 
>> Ivan
>>
>>
>> On Thu, Mar 27, 2014 at 1:28 PM, Eric T  wrote:
>>
>>> Hello,
>>>
>>> I'm running a test of my query and mapping shown here:
>>> https://gist.github.com/ewltang/9c00155525784b620ca9
>>>
>>> I'm searching for "pauljones" in the uname field. In the results the 
>>> fifth document containing "pauljones10297" has a score of 16.027834, while 
>>> the 6th document containing "PaulJones" has a score of 5.008698.
>>> Why is the score for the 5th document so much higher than the 6th? 
>>>
>>> Regards,
>>> Eric
>>>
>>>
>>
>>
>
>



Is it possible to get the forward index table?

2014-03-28 Thread Eric Lu
hi,

Our system needs to calculate the TFs (term frequencies) of the search 
results; i.e., we get a set of documents from a query_string search, and we 
want the TF of every term in each document.
As we know, Elasticsearch provides an inverted index (term → documents), 
but it also builds a forward index (document → terms) when indexing a 
document. Is it possible to get it?

If it is not possible, we have to process the search results manually to 
compute each document's TFs, which duplicates work Elasticsearch has 
already done.

Any suggestions? Thank you.

eric
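
One API that may be relevant here: the _termvector endpoint returns per-document term frequencies for fields that store term vectors. An untested sketch, with made-up index, type, and field names:

```json
PUT /myindex
{
  "mappings": {
    "doc": {
      "properties": {
        "body": {"type": "string", "term_vector": "yes"}
      }
    }
  }
}

GET /myindex/doc/1/_termvector?fields=body&term_statistics=true
```

The response lists each term in the document's body field with its term_freq, which avoids re-tokenizing the search results by hand.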



Re: Question about scoring behaviour

2014-03-27 Thread Eric T
Hi Ivan,

No I don't apply any boost at index time.

I did not disable norms on the uname.autocomplete field, I will have to get
back to you on the result. I'm using 0.90.2.

thanks
Eric


On Thu, Mar 27, 2014 at 8:55 PM, Ivan Brusic  wrote:

> The difference is the fieldNorm. This field holds any boosts (both
> document and field level) and any length normalization. It is only 1 byte,
> so it is incredibly lossy. Did you apply an index time boost to either the
> field or document?
>
> Have you tried disabling norms on ngram fields? Which version of
> elasticsearch are you using? I noticed you used the old format
> "omit_norms":true
> instead of
> "norms": { "enabled": false }
>
> --
> Ivan
>
>
> On Thu, Mar 27, 2014 at 1:28 PM, Eric T  wrote:
>
>> Hello,
>>
>> I'm running a test of my query and mapping shown here:
>> https://gist.github.com/ewltang/9c00155525784b620ca9
>>
>> I'm searching for "pauljones" in the uname field. In the results the
>> fifth document containing "pauljones10297" has a score of 16.027834, while
>> the 6th document containing "PaulJones" has a score of 5.008698.
>> Why is the score for the 5th document so much higher than the 6th?
>>
>> Regards,
>> Eric
>>
>>
>>
>
>



Question about scoring behaviour

2014-03-27 Thread Eric T
Hello,

I'm running a test of my query and mapping shown here:
https://gist.github.com/ewltang/9c00155525784b620ca9

I'm searching for "pauljones" in the uname field. In the results the fifth 
document containing "pauljones10297" has a score of 16.027834, while the 
6th document containing "PaulJones" has a score of 5.008698.
Why is the score for the 5th document so much higher than the 6th? 

Regards,
Eric




Re: Windows Elasticsearch cluster performance tuning

2014-03-23 Thread Eric Brandes
Interesting - so in general, would you recommend consolidating all 400 
indexes into a single index and using aliases/filters to address them?  
(They're currently broken out by user, and all operations are scoped to a 
specific user.)

If I were to consolidate to a single index, how many shards would be 
recommended?
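
As a sketch of the alias approach (untested; the index, alias, and field names are made up): a filtered alias per user, optionally with a routing value so each user's requests hit a single shard, could be set up like this:

```json
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "all_users",
        "alias": "user_42",
        "filter": {"term": {"user_id": 42}},
        "routing": "42"
      }
    }
  ]
}
```

Queries against user_42 would then behave like the current per-user index, while the cluster only manages one set of shards.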

On Sunday, March 23, 2014 2:00:18 PM UTC-5, David Pilato wrote:
>
> Forgot to say that you should use extra-large instances, not large.
> With large instances, you are less likely to suffer from noisy neighbors.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 23 mars 2014 à 19:54, David Pilato > a 
> écrit :
>
> IMHO 800 shards per node is far too much. And with only 4gb of memory...
>
> I guess you have lot of GC or you forget to disable SWAP.
>
> My 2 cents.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 23 mars 2014 à 18:08, Eric Brandes > 
> a écrit :
>
> Hey all, I have a 3 node Elasticsearch 1.0.1 cluster running on Windows 
> Server 2012 (in Azure).  There's about 20 million documents that take up a 
> total of 40GB (including replicas).  There's about 400 indexes in total, 
> with some having millions of documents and some having just a few.  Each 
> index is set to have 3 shards and 1 replica.   The main cluster is running 
> on three  4 core machines with 7GB of ram.  The min/max JVM heap size is 
> set to 4GB.  
>
> The primary use case for this cluster is faceting/aggregations over the 
> documents.  There's almost no full text searching, so everything is pretty 
> much based on exact values (which are stored but not analyzed at index time)
>
> When doing some term facets on a few of these indexes (the biggest one 
> contains about 8 million documents) I'm seeing really long response times 
> (> 5 sec).  There are potentially thousands of distinct values for the term 
> I'm faceting on, but I would have still expected faster performance.
>
> So my goal is to speed up these queries to get the responses sub second if 
> possible.  To that end I had some questions:
> 1) Would switching to Linux give me better performance in general?
> 2) I could collapse almost all of these 400 indexes in to a single big 
> index and use aliases + filters instead.  Would this be advisable?
> 3) Would mucking with the field data cache yield any better results?
>
>
> If I can add any more data to this discussion please let me know!
> Thanks!
> Eric
>
>
>
>



Windows Elasticsearch cluster performance tuning

2014-03-23 Thread Eric Brandes
Hey all, I have a 3 node Elasticsearch 1.0.1 cluster running on Windows 
Server 2012 (in Azure).  There's about 20 million documents that take up a 
total of 40GB (including replicas).  There's about 400 indexes in total, 
with some having millions of documents and some having just a few.  Each 
index is set to have 3 shards and 1 replica.   The main cluster is running 
on three  4 core machines with 7GB of ram.  The min/max JVM heap size is 
set to 4GB.  

The primary use case for this cluster is faceting/aggregations over the 
documents.  There's almost no full text searching, so everything is pretty 
much based on exact values (which are stored but not analyzed at index time)

When doing some term facets on a few of these indexes (the biggest one 
contains about 8 million documents) I'm seeing really long response times 
(> 5 sec).  There are potentially thousands of distinct values for the term 
I'm faceting on, but I would have still expected faster performance.

So my goal is to speed up these queries to get the responses sub second if 
possible.  To that end I had some questions:
1) Would switching to Linux give me better performance in general?
2) I could collapse almost all of these 400 indexes into a single big 
index and use aliases + filters instead.  Would this be advisable?
3) Would mucking with the field data cache yield any better results?


If I can add any more data to this discussion please let me know!
Thanks!
Eric



Facets & multi-valued, numeric fields

2014-03-13 Thread Eric Jain
For sorting, elasticsearch lets me specify how I want to deal with fields 
that contain multiple numeric values, so I can have elasticsearch use e.g. 
the max value in each document.

Is there a similar option I can use when aggregating documents? For 
example, I might want to get the average of the max value in each document.

  http://stackoverflow.com/questions/22368807/facets-multi-valued-fields
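For illustration, here is the semantics being asked about, computed client-side in Python (the field name and values are made up): take the max of each document's multi-valued field, then average those maxima. A plain avg aggregation instead averages over all values of all documents.

```python
# Client-side illustration of "average of the per-document max" versus a
# plain average over all values. Data is hypothetical.

docs = [
    {"scores": [1, 5]},
    {"scores": [2, 2]},
    {"scores": [7]},
]

per_doc_max = [max(d["scores"]) for d in docs]            # [5, 2, 7]
avg_of_max = sum(per_doc_max) / float(len(per_doc_max))   # 14/3 ~ 4.67

# A plain avg over the flattened values gives a different answer:
flat = [v for d in docs for v in d["scores"]]
avg_of_all = sum(flat) / float(len(flat))                 # 17/5 = 3.4
```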



Re: 3,000 events/sec Architecture

2014-03-12 Thread Eric
Yes, currently logstash is reading files that syslog-ng created. We already 
had the syslog-ng architecture in place so just kept rolling with that.


On Tuesday, March 11, 2014 11:16:42 PM UTC-4, Otis Gospodnetic wrote:
>
> Hi,
>
> Is that Logstash instance reading files that are produces by syslog-ng 
> servers?  Maybe not but if yes, have you considered using Rsyslog with 
> omelasticsearch instead to simplify the architecture?
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tuesday, March 4, 2014 10:11:59 AM UTC-5, Eric wrote:
>>
>> Hello,
>>
>> I've been working on a POC for Logstash/ElasticSearch/Kibana for about 2 
>> months now and everything has worked out pretty good and we are ready to 
>> move it to production. Before building out the infrastructure, I want to 
>> make sure my shard/node/index setup is correct as that is the main part 
>> that I'm still a bit fuzzy on. Overall my setup is this:
>>
>> Servers (Networking Gear, End Points, Security Devices, etc.)
>>   ->  Load Balancer  ->  syslog-ng servers  -->  Logs stored in 5 flat
>>   files on SAN storage
>>
>> I have logstash running on one of the syslog-ng servers and is basically 
>> reading the input of 5 different files and sending them to ElasticSearch. 
>> So within ElasticSearch, I am creating 5 different indexes a day so I can 
>> do granular user access control within Kibana.
>>
>> unix-$date
>> windows-$date
>> networking-$date
>> security-$date
>> endpoint-$date
>>
>> My plan is to have 3 ElasticSearch servers with ~10 gig of RAM each on 
>> them. For my POC I have 2 and it's working fine for 2,000 events/second. My 
>> main concern is how I setup the ElasticSearch servers so they are as 
>> efficient as possible. With my 5 different indexes a day, and I plan on 
>> keeping ~1 month of logs within ES, is 3 servers enough? Should I have 1 
>> master node and the other 2 be just basic setups that are data and 
>> searching? Also, will 1 replica be sufficient for this setup or should I do 
>> 2 to be safe? In my POC, I've had a few issues where I ran out of memory or 
>> something weird happened and I lost data for a while so wanted to try to 
>> limit that as much as possible. We'll also have quite a few users 
>> potentially querying the system so I didn't know if I should setup a 
>> dedicated search node for one of these.
>>
>> Besides the ES cluster, I think everything else should be fine. I have 
>> had a few concerns about logstash keeping up with the amount of entries 
>> coming into syslog-ng but haven't seen much in the way of load balancing 
>> logstash or verifying if it's able to keep up or not. I've spot checked the 
>> files quite a bit and everything seems to be correct but if there is a 
>> better way to do this, I'm all ears.
>>
>> I'm going to have my KIbana instance installed on the master ES node, 
>> which shouldn't be a big deal. I've played with the idea of putting the ES 
>> servers on the syslog-ng servers and just have a separate NIC for the ES 
>> traffic but didn't want to bog down the servers a whole lot. 
>>
>> Any thoughts or recommendations would be greatly appreciated.
>>
>> Thanks,
>> Eric
>>
>>



ES Stops Works Randomly

2014-03-07 Thread Eric
Hello,

I have 2 ES servers that are being fed by 1 logstash server, and I'm viewing 
the logs in Kibana. This is a POC to work out any issues before going into 
production. The system has run for ~1 month, and every few days Kibana will 
stop showing logs at some random time in the middle of the night. Last 
night, the last log entry I received in Kibana was around 18:30. When I 
checked on the ES servers, it showed the master running and the secondary 
not running (from /sbin/service elasticsearch status), but I was able to do 
a curl on the localhost and it returned information. So not sure what's up 
with that. Anyway, when I do a status on the master node, I get this:

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "gis-elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 6,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 186,
  "active_shards" : 194,
  "relocating_shards" : 0,
  "initializing_shards" : 7,
  "unassigned_shards" : 249
}

When I view the indexes, via "ls ...nodes/0/indices/", it shows all indexes 
being modified today for some reason, and there are new files for today's 
date. So I think I'm starting to catch back up after I restarted both 
servers, but I'm not sure why it failed in the first place. When I look at the 
logs on the master, I only see 4 warning errors at 18:57 and then the 
secondary leaving the cluster. I don't see any logs on the secondary (Pistol) 
about why it stopped working or what truly happened.

[2014-03-06 18:57:04,121][WARN ][transport] [ElasticSearch 
Server1] Transport response handler not found of id [64147630]
[2014-03-06 18:57:04,124][WARN ][transport] [ElasticSearch 
Server1] Transport response handler not found of id [64147717]
[2014-03-06 18:57:04,124][WARN ][transport] [ElasticSearch 
Server1] Transport response handler not found of id [64147718]
[2014-03-06 18:57:04,124][WARN ][transport] [ElasticSearch 
Server1] Transport response handler not found of id [64147721]

[2014-03-06 19:56:08,467][INFO ][cluster.service  ] [ElasticSearch 
Server1] removed 
{[Pistol][sIAMHNj6TMCmrMJGW7u97A][inet[/10.1.1.10:9301]]{client=true, 
data=false},}, reason: 
zen-disco-node_failed([Pistol][sIAMHNj6TMCmrMJGW7u97A][inet[/10.13.3.46:9301]]{client=true,
 
data=false}), reason failed to ping, tried [3] times, each with maximum 
[30s] timeout
[2014-03-06 19:56:12,304][INFO ][cluster.service  ] [ElasticSearch 
Server1] added 
{[Pistol][sIAMHNj6TMCmrMJGW7u97A][inet[/10.1.1.10:9301]]{client=true, 
data=false},}, reason: zen-disco-receive(join from 
node[[Pistol][sIAMHNj6TMCmrMJGW7u97A][inet[/10.13.3.46:9301]]{client=true, 
data=false}])

Any idea on additional logging or troubleshooting I can turn on to keep 
this from happening in the future? Since the shards are not caught up yet, 
right now I'm just seeing a lot of debug messages about "failed to parse". I'm 
assuming that will be corrected once we catch up.

[2014-03-07 10:06:52,235][DEBUG][action.search.type   ] [ElasticSearch 
Server1] All shards failed for phase: [query]
[2014-03-07 10:06:52,223][DEBUG][action.search.type   ] [ElasticSearch 
Server1] [windows-2014.03.07][3], node[W6aEFbimR5G712ddG_G5yQ], [P], 
s[STARTED]: Failed to execute 
[org.elasticsearch.action.search.SearchRequest@74ecbbc6] lastShard [true]
org.elasticsearch.search.SearchParseException: [windows-2014.03.07][3]: 
from[-1],size[-1]: Parse Failure [Failed to parse source 
[{"facets":{"0":{"date_histogram":{"field":"@timestamp","interval":"10m"},"global":true,"facet_filter":{"fquery":{"query":{"filtered":{"query":{"query_string":{"query":"(ASA
 
AND 
Deny)"}},"filter":{"bool":{"must":[{"range":{"@timestamp":{"from":1394118412373,"to":"now"}}}],"size":0}]]



Re: 3,000 events/sec Architecture

2014-03-04 Thread Eric Luellen
Zach,

Thanks for the information. With my POC, I have 2 10 gig VMs and I'm 
keeping 7 days of logs with no issues, but that is a fairly large jump and I 
could see where it may pose an issue. 

As far as the 150 indexes, I'm not sure on the shards per index/replicas. 
That is the part that I'm the weakest on in ES setup. I'm not exactly sure 
how I should set up the ES cluster as far as the shards, replicas, master 
node, data node, search node etc.

I fully agree with the logstash directly to ES. I have 1 logstash instance 
right now tailing 5 files and directly feeding into ES, and I've enjoyed 
not having another application to have to worry about.

Eric


On Tuesday, March 4, 2014 10:32:26 AM UTC-5, Zachary Lammers wrote:
>
> Based on my experience, I think you may have an issue with OOM trying to 
> keep a month of logs with ~10gb ram / server.
>
> Say, for instance, 5 indexes a day for 30 days = 150 indexes.  How many 
> shards per index/replicas?
>
> I ran some tests with 8GB assigned to my 20x ES data nodes, and after ~7 
> days of a single index per day of all log data, my cluster would crash due to 
> data nodes going OOM.  I know I can't perfectly compare, and I'm somewhat 
> new to ES myself, but as soon as I removed the 'older' servers from the 
> cluster that had smaller ram, and gave ES 16GB for each data node, I've not 
> gone OOM since.  I was working with higher data rates, but I'm not sure the 
> volume mattered as much as my shard count per index per node.
>
> For reference, my current lab config is 36 data nodes, running single 
> index per day (18 shards/1 replica), and I can index near 40,000 per second 
> at beginning of day, closer to 30,000 per second near end of day when index 
> is much larger.  I used to run 36 shards/1 replica, but I wanted the 
> shards/index/per node to be minimal, as I'd really like to keep 60 days 
> (except I'm running out of disk space on my old servers first!)  To pipe 
> the data in, I'm running 45 separate logstash instances, each monitoring a 
> single FIFO that I have scripts simply catting data into.  Eash LS instance 
> is joining the ES cluster (no redis/etc, I've had too many issues not going 
> direct to ES).  I recently started over after keeping steady with 25B log 
> events over ~12 days (but ran out of disk so had to delete old indexes).  I 
> tried updating to LS1.4b2/ES1.0.1, but it failed miserably, LS1.4b2 was 
> extremely, extremely slow in indexing, so I'm still LS 1.3.3 and ES0.90.9.
>
> As for master question, I can't answer.  I'm only running one right now 
> for this lab cluster, which I know is not recommended, but I have zero idea 
> how many I should truly have.  Like I said, I'm new to this :)
>
> -Zachary
>
> On Tuesday, March 4, 2014 9:11:59 AM UTC-6, Eric Luellen wrote:
>>
>> Hello,
>>
>> I've been working on a POC for Logstash/ElasticSearch/Kibana for about 2 
>> months now and everything has worked out pretty good and we are ready to 
>> move it to production. Before building out the infrastructure, I want to 
>> make sure my shard/node/index setup is correct as that is the main part 
>> that I'm still a bit fuzzy on. Overall my setup is this:
>>
>> Servers (Networking Gear, End Points, Security Devices, etc.)
>>   ->  Load Balancer  ->  syslog-ng servers  -->  Logs stored in 5 flat
>>   files on SAN storage
>>
>> I have logstash running on one of the syslog-ng servers and is basically 
>> reading the input of 5 different files and sending them to ElasticSearch. 
>> So within ElasticSearch, I am creating 5 different indexes a day so I can 
>> do granular user access control within Kibana.
>>
>> unix-$date
>> windows-$date
>> networking-$date
>> security-$date
>> endpoint-$date
>>
>> My plan is to have 3 ElasticSearch servers with ~10 gig of RAM each on 
>> them. For my POC I have 2 and it's working fine for 2,000 events/second. My 
>> main concern is how I setup the ElasticSearch servers so they are as 
>> efficient as possible. With my 5 different indexes a day, and I plan on 
>> keeping ~1 month of logs within ES, is 3 servers enough? Should I have 1 
>> master node and the other 2 be just basic setups that are data and 
>> searching? Also, will 1 replica be sufficient for this setup or should

3,000 events/sec Architecture

2014-03-04 Thread Eric Luellen
Hello,

I've been working on a POC for Logstash/ElasticSearch/Kibana for about 2 
months now and everything has worked out pretty good and we are ready to 
move it to production. Before building out the infrastructure, I want to 
make sure my shard/node/index setup is correct as that is the main part 
that I'm still a bit fuzzy on. Overall my setup is this:

Servers (Networking Gear, End Points, Security Devices, etc.)
  ->  Load Balancer  ->  syslog-ng servers  -->  Logs stored in 5 flat files
      on SAN storage

I have logstash running on one of the syslog-ng servers; it is basically 
reading the input of 5 different files and sending them to ElasticSearch. 
So within ElasticSearch, I am creating 5 different indexes a day so I can 
do granular user access control within Kibana.

unix-$date
windows-$date
networking-$date
security-$date
endpoint-$date

My plan is to have 3 ElasticSearch servers with ~10 gig of RAM each on 
them. For my POC I have 2 and it's working fine for 2,000 events/second. My 
main concern is how I setup the ElasticSearch servers so they are as 
efficient as possible. With my 5 different indexes a day, and I plan on 
keeping ~1 month of logs within ES, is 3 servers enough? Should I have 1 
dedicated master node and make the other 2 basic data/search nodes? Also, 
will 1 replica be sufficient for this setup, or should I do 2 to be safe? In 
my POC, I've had a few issues where I ran out of memory or 
something weird happened and I lost data for a while so wanted to try to 
limit that as much as possible. We'll also have quite a few users 
potentially querying the system so I didn't know if I should setup a 
dedicated search node for one of these.

Besides the ES cluster, I think everything else should be fine. I have had 
a few concerns about logstash keeping up with the amount of entries coming 
into syslog-ng but haven't seen much in the way of load balancing logstash 
or verifying if it's able to keep up or not. I've spot checked the files 
quite a bit and everything seems to be correct but if there is a better way 
to do this, I'm all ears.

I'm going to have my Kibana instance installed on the master ES node, which 
shouldn't be a big deal. I've played with the idea of putting the ES 
servers on the syslog-ng servers and just have a separate NIC for the ES 
traffic but didn't want to bog down the servers a whole lot. 

Any thoughts or recommendations would be greatly appreciated.

Thanks,
Eric



Re: "No module named elasticsearch"

2014-03-03 Thread Eric Greene
Hello Honza, it does work in a virtualenv, which is suitable for my 
purposes. Thanks for your help - Eric
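A quick way to diagnose this class of problem (package installed, yet "No module named ..." on import) is to check which interpreter is actually running: if `pip` targets a different Python than the one you launch, imports fail, and a virtualenv fixes it by making the two agree. A small diagnostic sketch:

```python
# Diagnostic: print the running interpreter and its module search path, then
# compare against what `pip --version` reports. If they refer to different
# Python installations, pip's packages are invisible to this interpreter.

import sys

print(sys.executable)   # the interpreter actually running this script
print(sys.path[:3])     # first few entries of the module search path
```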

On Monday, March 3, 2014 10:37:44 AM UTC-8, Eric Greene wrote:
>
>
> Hi I am getting started using elasticsearch.
>
> I want to try to use the python elasticsearch client 
> <http://elasticsearch-py.readthedocs.org/en/latest/>.
>
> I ran pip install elasticsearch.  Installation seems successful.  But when 
> I try to import elasticsearch, I get an error "No module named 
> elasticsearch".
>
> What could be the trouble here?
>
> Thanks for any help.
>



Re: No indexing with JDBC River plugin

2014-03-03 Thread Eric Greene
I have figured it out.  CHMOD of the mySQL-connector-river jar file to 755 
fixed my issue.  - Eric
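The failure mode here is a jar the ES process user cannot read (e.g. installed as root with restrictive permissions). A sketch of checking and fixing the permission bits, demonstrated on a temp file rather than the real connector jar:

```python
# Verify a file is world-readable, as a plugin jar must be when ES runs as a
# different user than the one who installed it. The path is illustrative.

import os
import stat
import tempfile

def world_readable(path):
    """True if 'other' users have read permission on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)

fd, path = tempfile.mkstemp(suffix=".jar")
os.close(fd)

os.chmod(path, 0o600)            # owner-only: ES user can't read it
assert not world_readable(path)

os.chmod(path, 0o755)            # the fix from the post
assert world_readable(path)

os.remove(path)
```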



On Monday, March 3, 2014 10:17:28 AM UTC-8, Eric Greene wrote:
>
> Hi I am getting started with elasticsearch and the jdbc river plugin.  I 
> want to sync to a mysql database.  I appear to have everything set up 
> correctly... ES starts with no trouble, I have installed the plugin and 
> copied the mysql connector.  However when I begin with a simple test 
> command:
>
> curl -XPUT 'localhost:9200/my_index/videos/_meta' -d '{
> "type" : "jdbc",
> "jdbc" : {
> "url" : "jdbc:mysql://localhost:3306/my_db",
> "user" : "eric",
> "password" : "my_password",
> "sql" : "select title, description, created, active from video",
> "index" : "my_index",
> "type" : "videos"
> }
> }'
>
>
> Nothing from the mysql db is indexed.  Instead, it just indexes the request 
> body above as a document, rather than the mysql data.
>
> So if I follow the above with 
> curl -XGET 'localhost:9200/my_index/_search?pretty&q=*'
>
>
> I get the following for hits:
>
> "hits" : {
> "total" : 1,
> "max_score" : 1.0,
> "hits" : [ {
>   "_index" : "my_index",
>   "_type" : "videos",
>   "_id" : "_meta",
>   "_score" : 1.0, "_source" : {
> "type" : "jdbc",
> "jdbc" : {
> "url" : "jdbc:mysql://localhost:3306/my_db",
> "user" : "eric",
> "password" : "my_password",
> "sql" : "select title, description, created, active from video",
> "index" : "my_index",
> "type" : "videos"
> }
>
>
> Logs show that the index was created.  But it isn't an index from mysql.  
>
> I have also double and tripled checked the mysql credentials are working 
> and that the sql statement works.  I have also verified there is in fact 
> data in the video table.
>
> It just seems like the river is being ignored entirely.
>
> The following is from the log file:
>
> [2014-03-03 10:09:28,565][INFO ][cluster.metadata ] [es_node] 
> [my_index] creating index, cause [auto(index api)], shards [5]/[1], 
> mappings []
> [2014-03-03 10:09:28,800][INFO ][cluster.metadata ] [es_node] 
> [my_index] update_mapping [videos] (dynamic)
>
> What could I be missing here? 
>
> I am using the following versions:
>
> ES version 1.0.1
> JDBC River version 1.0.0.1
> mySQL-connector-java version 5.1.28
>
> on Ubuntu 12.04
>
> Thanks for any help.
>



"No module named elasticsearch"

2014-03-03 Thread Eric Greene



Hi I am getting started using elasticsearch.

I want to try to use the python elasticsearch client 
<http://elasticsearch-py.readthedocs.org/en/latest/>.

I ran pip install elasticsearch.  Installation seems successful.  But when 
I try to import elasticsearch, I get an error "No module named 
elasticsearch".

What could be the trouble here?

Thanks for any help.



No indexing with JDBC River plugin

2014-03-03 Thread Eric Greene
Hi I am getting started with elasticsearch and the jdbc river plugin.  I 
want to sync to a mysql database.  I appear to have everything set up 
correctly... ES starts with no trouble, I have installed the plugin and 
copied the mysql connector.  However when I begin with a simple test 
command:

curl -XPUT 'localhost:9200/my_index/videos/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://localhost:3306/my_db",
"user" : "eric",
"password" : "my_password",
"sql" : "select title, description, created, active from video",
"index" : "my_index",
"type" : "videos"
}
}'


Nothing from the mysql db is indexed.  Instead, it just indexes the request 
body above as a document, rather than the mysql data.

So if I follow the above with 
curl -XGET 'localhost:9200/my_index/_search?pretty&q=*'


I get the following for hits:

"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
  "_index" : "my_index",
  "_type" : "videos",
  "_id" : "_meta",
  "_score" : 1.0, "_source" : {
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://localhost:3306/my_db",
"user" : "eric",
"password" : "my_password",
"sql" : "select title, description, created, active from video",
"index" : "my_index",
"type" : "videos"
}


Logs show that the index was created.  But it isn't an index from mysql.  

I have also double and tripled checked the mysql credentials are working 
and that the sql statement works.  I have also verified there is in fact 
data in the video table.

It just seems like the river is being ignored entirely.

The following is from the log file:

[2014-03-03 10:09:28,565][INFO ][cluster.metadata ] [es_node] 
[my_index] creating index, cause [auto(index api)], shards [5]/[1], 
mappings []
[2014-03-03 10:09:28,800][INFO ][cluster.metadata ] [es_node] 
[my_index] update_mapping [videos] (dynamic)

What could I be missing here? 

I am using the following versions:

ES version 1.0.1
JDBC River version 1.0.0.1
mySQL-connector-java version 5.1.28

on Ubuntu 12.04

Thanks for any help.



"Illegal character in path"

2014-03-01 Thread Eric Jain
Just had this error (elasticsearch 1.0.1), don't recall seeing it before:

  java.net.URISyntaxException: Illegal character in path at index 29: 
/cache/11_/seg0/index12450141[12450141].ts

Should I be concerned?
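The likely cause: `[` and `]` are not legal unencoded characters in a URI path per RFC 3986, and this segment file name contains "[12450141]" (index 29 in the string is exactly the `[`). A sketch showing that percent-encoding makes the path parseable; whether the warning is harmless depends on where that URI is being built, which the log line doesn't show.

```python
# '[' and ']' must be percent-encoded in a URI path; the raw file name below
# (from the exception message) trips strict URI parsing at the bracket.

from urllib.parse import quote

raw = "/cache/11_/seg0/index12450141[12450141].ts"
assert raw[29] == "["                      # matches "at index 29" in the log

encoded = quote(raw, safe="/")             # encodes brackets, keeps slashes
assert "[" not in encoded and "]" not in encoded
```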



debug logs when indexing by bulks

2014-02-19 Thread Eric Lu
Hi,
 
I'm using bulk requests to index billions of docs. I checked the logs and found 
that it keeps logging entries like this:
[2014-02-20 00:00:01,325][DEBUG][action.bulk  ] [Will o' the 
Wisp] [...][21] failed to execute bulk item (index) index {...}
java.lang.ArrayIndexOutOfBoundsException
[2014-02-20 00:00:04,345][DEBUG][action.bulk  ] [Will o' the 
Wisp] [...][29] failed to execute bulk item (index) index {...}
java.lang.ArrayIndexOutOfBoundsException
...

but my indices keep growing anyway. How can I resolve this?
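These DEBUG lines mean individual bulk items are failing while the rest index fine, which is why the indices keep growing. Rather than grepping logs, each bulk response reports per-item status, so failures can be collected programmatically. A sketch (the response shape follows the ES bulk API; the trimmed example response is made up):

```python
# Scan a bulk response for items that failed, instead of relying on server
# DEBUG logs. Each item in "items" carries its own status/error.

def failed_items(bulk_response):
    """Return (position, error) for every item that did not index."""
    failures = []
    for i, item in enumerate(bulk_response.get("items", [])):
        action = item.get("index") or item.get("create") or {}
        if "error" in action:
            failures.append((i, action["error"]))
    return failures

resp = {  # trimmed, hypothetical bulk response
    "items": [
        {"index": {"status": 201}},
        {"index": {"status": 500, "error": "ArrayIndexOutOfBoundsException"}},
    ]
}
assert failed_items(resp) == [(1, "ArrayIndexOutOfBoundsException")]
```

Retrying or logging just the failed positions keeps the client-side view consistent with what actually landed in the index.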



Re: Index Mapping/Routing Help

2014-02-19 Thread Eric Luellen
Thanks! What was throwing me off is that the UNIX logs are still also being 
written to logstash-date, so I was seeing that information in my main 
dashboard. I wasn't thinking about it writing to two different indexes. 
Thanks again.

On Tuesday, February 18, 2014 4:52:38 PM UTC-5, Binh Ly wrote:
>
> Yup, you will need to go into your Kibana dashboard - top right corner - 
> Configure Dashboard | Index and change the settings there to point to your 
> new index(es) instead of the default logstash-* indexes.
>



Re: Index Mapping/Routing Help

2014-02-18 Thread Eric Luellen
Thanks for that information. When I'm looking in Kibana now, it shows 
the correct type, but it still shows the index as the original 
logstash-2014-02-18. Not sure why it isn't showing the unix-date index. If 
I look at ElasticSearch, I can see that it did create the new index I told 
it to, though.


On Tuesday, February 18, 2014 12:53:22 PM UTC-5, Binh Ly wrote:
>
> You should be able to use the input type to direct log events to specific 
> indexes. For example:
>
> input {
>   file { 
> type => "unixlogs"
> path => "/var/log/UNIX/*.log"
>   } 
> }
>
> output {
>   if [type] == "unixlogs" {
> elasticsearch { 
>   host => "localhost"
>   index => "unix-%{+YYYY.MM.dd}"
> }
>   }
> }
>



Index Mapping/Routing Help

2014-02-17 Thread Eric Luellen
Hello,

Currently I have the following setup.

Syslog --> Logstash --> ElasticSearch --> Kibana

Logstash is creating a daily index 
"/etc/elasticsearch/data/test-elasticsearch/nodes/0/indices/logstash-2014.02.04"
 
and I'm viewing all of the logs through Kibana. We want to set up some 
user-based access control using the kibana-authentication-proxy setup because 
it supports:

   - Per-user Kibana indexes: you can use index kibana-int-userA for 
   user A and kibana-int-userB for user B

I'd like to make it so that all logs coming in from logstash with a location 
of "/var/log/UNIX/*.log" get sent to a new index of unix-2014.02.04 instead 
of the logstash one. That way I can use the Kibana auth proxy to give my 
UNIX users access only to their logs. I've read a little about creating the 
mappings, but wasn't sure how to tie it all together. I saw you could do 
various things with API calls, but was curious whether I could set all of this 
up in the elasticsearch.yml file from the start.

Thanks,
Eric



Re: "term_stats" equivalent in aggregations? (ES 1.0)

2014-02-13 Thread Eric Nelson
Perfect! Thanks Binh.

---Eric

On Thursday, February 13, 2014 12:18:37 PM UTC-7, Binh Ly wrote:
>
> You should be able to do a stats sub under terms like this:
>
> {
>   "aggs": {
> "terms": {
>   "terms": {
> "field": "term_field"
>   },
>   "aggs": {
> "stats": {
>   "stats": {
> "field": "stats_field"
>   }
> }
>   }
> }
>   }
> }
>



"term_stats" equivalent in aggregations? (ES 1.0)

2014-02-13 Thread Eric Nelson
Hi all. LOVE the new 1.0 release, and learning about the new aggregation 
framework. Can you tell me if there is an aggregation equivalent to the 
term_stats facet? The term aggregation isn't quite the same. I want to 
bucket on one field, and calculate stats on another. Any help on this would 
be greatly appreciated.

--Eric



Re: facets.total and hits.total dont match

2014-01-20 Thread Eric Rodriguez
Hi,

I don't have the link right now, but IIRC when you have more than 1 shard 
there is no certainty about facet count accuracy.
The best "workaround" is either using 1 shard to get exact counts, or extending 
the number of results requested for the facet to achieve a better (still not 
exact) count.
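The undercount can be simulated locally: each shard returns only its local top-N terms, so a term that narrowly misses the top N on some shard loses those counts in the merge, and asking for a larger N narrows the gap. A small illustration with made-up data:

```python
# Simulate per-shard top-N truncation: each shard reports only its N most
# frequent terms, and the coordinating node sums what it received.

from collections import Counter

shard1 = Counter(a=10, b=9, c=8)
shard2 = Counter(c=10, d=9, a=1)   # 'a' is present but not top-2 here

def merged_top(shards, n):
    """Merge per-shard top-n term counts, as a multi-shard facet does."""
    total = Counter()
    for s in shards:
        for term, count in s.most_common(n):  # each shard truncates to n
            total[term] += count
    return total

approx = merged_top([shard1, shard2], n=2)    # shard2's count for 'a' is lost
exact = shard1 + shard2
assert approx["a"] == 10 and exact["a"] == 11
assert merged_top([shard1, shard2], n=3)["a"] == 11  # larger n recovers it
```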

Eric

Sent from my iPhone

> On 21 Jan 2014, at 07:58, Chetana  wrote:
> 
> I have indexed some records by making test_field 'analyzed'. If the 
> analyzed field is causing this issue, is there any other facet type/workaround 
> which can solve the problem?
>  
> 
>> On Tuesday, January 21, 2014 12:15:45 PM UTC+5:30, Chetana wrote:
>> I have an application where I need both search results and facet 
>> information. Everytime a query is framed based on some filter condition and 
>> query words and it is passed to both facet and search request as given 
>> below. The field (test_field) on which the facet to be applied is present in 
>> all documents.
>>  
>> BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
>> SearchRequestBuilder srb = client.prepareSearch("Test");
>> srb.setSearchType(SearchType.DFS_QUERY_THEN_FETCH).setQuery(boolQueryBuilder);
>> and
>> TermsFacetBuilder facBuilder = FacetBuilders.termsFacet("test_field");
>> facBuilder.facetFilter(FilterBuilders.queryFilter(boolQueryBuilder));
>> facBuilder.fields("test_field");
>> facBuilder.global(true); // I tried commenting this too, but I get the same result
>> srb.addFacet(facBuilder);
>>  
>> "hits" : {
>> "total" : 117,
>> "max_score" : null,
>> "hits" : [ {
>>  }]
>> 
>>   "facets" : {
>> "assettype" : {
>>   "_type" : "terms",
>>   "missing" : 5,
>>   "total" : 119,
>>   "other" : 0,
>>   "terms" : [ {
>> }]
>>  
>> But the hit count is different from the facet count. Can anyone please 
>> explain why this discrepancy occurs?
>>  
>>  
>> Thanks
> 



Re: Bulk indexing slow down when data amount increase

2014-01-14 Thread Eric Lu
I have set the replicas to 0 and the queue to 50, and it can index about 7-8 
million documents per hour now. That's acceptable, though I don't know 
which change made the difference.

Thank you all.

On Monday, January 13, 2014 at 9:04:35 PM UTC+8, Eric Lu wrote:
>
> I observed that GC occurred once every 15 seconds when heap memory was at 75% of 
> the heap size. Is that too frequent? There are no OOMs.
>
> I set refresh interval to 30s. 
>
> I'll try to use a smaller queue and set replica to 0
>
> Thank you.
>
> On Monday, January 13, 2014 at 8:42:56 PM UTC+8, Jörg Prante wrote:
>>
>> 12 hours is an absurdly long time for indexing 10 million docs.
>>
>> queue:1000 is much too high for production. For test it may be ok (it 
>> effectively disables queue rejections) but on production, you play with the 
>> risk of starving your cluster resources.
>>
>> Do you monitor the resource usage of ES, especially the heap? Is GC 
>> starving your cluster? Do you see OOMs?
>>
>> Do you evaluate the bulk responses for errors? Do you throttle bulk 
>> request concurrency? 
>>
>> Do you set refresh interval to -1? 
>>
>> Hint: if 5 nodes is your maximum, you can also bulk index with 5 shards 
>> and replica level 0, after bulk, you can increase replica level to 1.
>>
>> Jörg
>>
>>



Re: Bulk indexing slow down when data amount increase

2014-01-13 Thread Eric Lu
I observed that GC occurred once every 15 seconds when heap memory was at 75% of 
the heap size. Is that too frequent? There are no OOMs.

I set refresh interval to 30s. 

I'll try to use a smaller queue and set replica to 0

Thank you.
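Both changes Jörg suggested can be applied at runtime through the index settings API; the index name here is hypothetical:

```shell
# Before the bulk load: no replicas, no periodic refresh
curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
{ "index": { "number_of_replicas": 0, "refresh_interval": "-1" } }'

# After the bulk load: restore replicas and a refresh interval
curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
{ "index": { "number_of_replicas": 1, "refresh_interval": "30s" } }'
```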

On Monday, January 13, 2014 at 8:42:56 PM UTC+8, Jörg Prante wrote:
>
> 12 hours is an absurdly long time for indexing 10 million docs.
>
> queue:1000 is much too high for production. For test it may be ok (it 
> effectively disables queue rejections) but on production, you play with the 
> risk of starving your cluster resources.
>
> Do you monitor the resource usage of ES, especially the heap? Is GC 
> starving your cluster? Do you see OOMs?
>
> Do you evaluate the bulk responses for errors? Do you throttle bulk 
> request concurrency? 
>
> Do you set refresh interval to -1? 
>
> Hint: if 5 nodes is your maximum, you can also bulk index with 5 shards 
> and replica level 0, after bulk, you can increase replica level to 1.
>
> Jörg
>
>



Bulk indexing slow down when data amount increase

2014-01-13 Thread Eric Lu
Hi, guys
I'm using elasticsearch to index a large number of documents. A document 
is about 0.5KB. 
My elasticsearch cluster has 5 nodes (all data nodes). Each node runs 
Oracle Java 1.7.0_13 and has 16GB RAM, with 8GB allocated to the JVM. The 
index has 50 shards and 1 replica.
I set the bulk thread pool to size: 30 and queue: 1000.
I use one thread to index documents via the bulk API; the bulk size is 1000.
In the beginning, the performance was very good: it could index about 10 
million documents per hour. But as the number of indexed documents grew, it 
slowed down. When the cluster had 500 million documents indexed, I noticed 
that it took about 12 hours to index 10 million documents.

Is this normal? Or what is the bottleneck that is throttling it?
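For reference, a bulk request of the kind described (here with 2 of the 1000 document pairs per request) is a newline-delimited body of action/source lines; the index and type names are hypothetical:

```shell
# Each document is an action line followed by a source line;
# the body must end with a newline.
curl -XPOST 'http://localhost:9200/_bulk' --data-binary '
{ "index": { "_index": "myindex", "_type": "doc", "_id": "1" } }
{ "field1": "value1" }
{ "index": { "_index": "myindex", "_type": "doc", "_id": "2" } }
{ "field1": "value2" }
'
```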

Any help?

Regards
Eric



Is there a help document about bigdesk plugin?

2014-01-09 Thread Eric Lu
Or some detail introduction about the various charts?



Re: Elasticsearch Missing Data

2014-01-09 Thread Eric Luellen
Alexander,

1. The only odd log entry was at 19:00 on 1/7/14, which was about 1 hr. 
before logs stopped. These logs are on the master and She-Hulk is the only 
other node.

[2014-01-07 19:00:02,947][DEBUG][indices.recovery ] [ElasticSearch 
Server1] [logstash-2014.01.08][0] recovery completed from 
[She-Hulk][_MtrVsSmQIaM-BErhEtg9w][inet[/10.1.11.111:9300]], took[333ms]
   phase1: recovered_files [1] with total_size of [71b], took [68ms], 
throttling_wait [0s]
 : reusing_files   [0] with total_size of [0b]
   phase2: start took [13ms]
 : recovered [17] transaction log operations, took [12ms]
   phase3: recovered [0] transaction log operations, took [164ms]
[2014-01-07 19:00:03,375][DEBUG][indices.recovery ] [ElasticSearch 
Server1] [logstash-2014.01.08][2] recovery completed from 
[She-Hulk][_MtrVsSmQIaM-BErhEtg9w][inet[/10.1.11.111:9300]], took[502ms]
   phase1: recovered_files [1] with total_size of [71b], took [30ms], 
throttling_wait [0s]
 : reusing_files   [0] with total_size of [0b]
   phase2: start took [6ms]
 : recovered [6] transaction log operations, took [38ms]
   phase3: recovered [13] transaction log operations, took [20ms]
[2014-01-07 19:00:06,898][INFO ][cluster.metadata ] [ElasticSearch 
Server1] [logstash-2014.01.08] update_mapping [logs] (dynamic)

Also, on She-Hulk I got an error stating that the master_left at 20:52 
because it wasn't pingable, but not sure why.

2. I am not sure. I was thinking that the shard should still be there, just 
unassigned, and that once the node came back up it would start processing it.
3. On both my master and my secondary, the config is in 
/etc/elasticsearch/elasticsearch.yml and it is run by 
/etc/init.d/elasticsearch. On the master, it works fine and sets the 
correct node name, cluster name, data directory, etc. It is an identical 
setup on the secondary, but it only picks up the cluster name; everything else 
defaults to some other location. On the secondary, the only data location is 
/var/lib/elasticsearch/node-name. In the config I tell it to go to 
/etc/elasticsearch/data. On the master it is in the correct location of 
/etc/elasticsearch/data. 

So overall, I guess the first issue was something weird that happened to my 
server, and there's not much I can do about that. I'm more interested in the 3rd 
question now, since I still don't know why it's not reading the full config 
file; it's obviously reading part of it, since the node is part of my cluster.




On Thursday, January 9, 2014 3:30:40 AM UTC-5, Alexander Reelsen wrote:
>
> Hey,
>
> a couple of things:
>
> 1. Did you check the log files? Most likely in /var/log/elasticsearch if 
> you use the packages. Is there anything suspicious at the time of your 
> outage? Please check your master node as well, if you have one (not sure if 
> it is a master or client node from the cluster health).
> 2. Why should elasticsearch pull your data? Any special configuration you 
> didn't mention? Or what exactly do you mean here?
> 3. Happy to debug your issue with the init script. The elasticsearch.yml 
> file should be in /etc/elasticsearch/ and not in /etc - anything manually 
> moved around? Can you still reproduce it?
>
>
> --Alex
>
>
>
>
> On Wed, Jan 8, 2014 at 8:10 PM, Eric Luellen wrote:
>
>> Hello,
>>
>> I've had my elasticsearch instance running for about a week with no 
>> issues, but last night it stopped working. When I went to look in Kibana, 
>> it stopped logging around 20:45 on 1/7/14. I then restarted the service on 
>> both elasticsearch servers and it started logging again and pulled back 
>> some logs from 07:10 that morning, even though I restarted the 
>> service around 10:00. So my questions are:
>>
>> 1. Why did it stop working? I don't see any obvious errors.
>> 2. When I restarted it, why didn't it go back and pull all of the data 
>> and not just some of it? I see that there are no unassigned shards.
>>
>> curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
>> {
>>   "cluster_name" : "my-elasticsearch",
>>   "status" : "green",
>>   "timed_out" : false,
>>   "number_of_nodes" : 3,
>>   "number_of_data_nodes" : 2,
>>   "active_primary_shards" : 40,
>>   "active_shards" : 80,
>>   "relocating_shards" : 0,
>>   "initializing_shards" : 0,
>>   "unassigned_shards" : 0
>>
>> Are there any additional queries or logs I can look at to see what is 
>> going on? 
>>
>> On a slight side note, when I restarted my 2nd elasticsearch server it 
>> isn't reading from the /etc/elasticsearch.yml file like it should. It isn't 
>>

Elasticsearch Missing Data

2014-01-08 Thread Eric Luellen
Hello,

I've had my elasticsearch instance running for about a week with no issues, 
but last night it stopped working. When I went to look in Kibana, it had 
stopped logging around 20:45 on 1/7/14. I then restarted the service on both 
elasticsearch servers and it started logging again and pulled back some 
logs from 07:10 that morning, even though I restarted the service around 
10:00. So my questions are:

1. Why did it stop working? I don't see any obvious errors.
2. When I restarted it, why didn't it go back and pull all of the data and 
not just some of it? I see that there are no unassigned shards.

curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "my-elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 40,
  "active_shards" : 80,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0

Are there any additional queries or logs I can look at to see what is going 
on? 

On a slight side note, when I restarted my 2nd elasticsearch server it 
isn't reading from the /etc/elasticsearch.yml file like it should. It isn't 
creating the node name correctly or putting the data files in the spot I 
have configured. I'm using CentOS and doing everything via 
/etc/init.d/elasticsearch on both servers and the elasticsearch1 server 
reads everything correctly but elasticsearch2 does not.

Thanks for your help.
Eric



Re: ElasticSearch Index Wrong Date

2014-01-02 Thread Eric Luellen
Not sure what happened, but after restarting Logstash everything is working 
fine. I guess it just wasn't a fan of the change in year.

On Thursday, January 2, 2014 10:03:05 AM UTC-5, Eric Luellen wrote:
>
> Hello,
>
> I recently setup my elasticsearch instance and everything has been working 
> fine. However, when I looked at Kibana today I saw that the logs stopped 
> showing up as soon as 2014 hit. When looking at my data on the cluster, I 
> see this:
>
> ls -altr data/my-cluster/nodes/0/indices/
> total 44
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 20 09:39 kibana-int
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 25 14:00 
> logstash-2013.12.26
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 26 14:00 
> logstash-2013.12.27
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 27 14:00 
> logstash-2013.12.28
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 28 14:00 
> logstash-2013.12.29
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 29 14:00 
> logstash-2013.12.30
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 30 14:00 
> logstash-2013.12.31
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 31 14:00 
> logstash-2013.01.01
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 31 14:00 
> logstash-2014.01.01
> drwxr-xr-x  8 elasticsearch elasticsearch 4096 Jan  1 14:00 
> logstash-2013.01.02
>
> As you can see, there is one 2014 folder and two 2013 folders for the new year 
> that shouldn't be there. For some reason, elasticsearch still thinks it's 
> 2013 and is creating folders with the wrong date. I confirmed that all of my 
> servers have the correct time on them. How can I fix this on 
> elasticsearch's end?
>
> Thanks,
> Eric
>



ElasticSearch Index Wrong Date

2014-01-02 Thread Eric Luellen
Hello,

I recently setup my elasticsearch instance and everything has been working 
fine. However, when I looked at Kibana today I saw that the logs stopped 
showing up as soon as 2014 hit. When looking at my data on the cluster, I 
see this:

ls -altr data/my-cluster/nodes/0/indices/
total 44
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 20 09:39 kibana-int
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 25 14:00 
logstash-2013.12.26
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 26 14:00 
logstash-2013.12.27
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 27 14:00 
logstash-2013.12.28
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 28 14:00 
logstash-2013.12.29
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 29 14:00 
logstash-2013.12.30
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 30 14:00 
logstash-2013.12.31
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 31 14:00 
logstash-2013.01.01
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Dec 31 14:00 
logstash-2014.01.01
drwxr-xr-x  8 elasticsearch elasticsearch 4096 Jan  1 14:00 
logstash-2013.01.02

As you can see, there is one 2014 folder and two 2013 folders for the new year 
that shouldn't be there. For some reason, elasticsearch still thinks it's 
2013 and is creating folders with the wrong date. I confirmed that all of my 
servers have the correct time on them. How can I fix this on 
elasticsearch's end?

Thanks,
Eric



Re: Unassigned Shards

2013-12-20 Thread Eric Luellen
I got the initial issue fixed of me getting data back again. However I 
still don't understand how to fix the unassigned shards issue and how to 
properly restart elasticsearch without it complaining.
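To see which shards are unassigned rather than just the count, the cluster health endpoint accepts a level parameter:

```shell
# Per-index and per-shard detail, including unassigned shards
curl -XGET 'http://localhost:9200/_cluster/health?level=shards&pretty=true'
```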

On Friday, December 20, 2013 9:28:53 AM UTC-5, Eric Luellen wrote:
>
> Mark,
>
> I used the rpm install. I'll take a look at the plugins. Thanks.
>
> On Thursday, December 19, 2013 5:07:53 PM UTC-5, Mark Walkom wrote:
>>
>> Did you install ES via a rpm/deb or using the zip? I ask because your 
>> data store directory is custom.
>>
>> Check out these plugins for monitoring - elastichq, kopf, bigdesk. They 
>> will give you an overview of your cluster and might give you insight into 
>> where your problem lies. The other best place to check is the ES logs.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 20 December 2013 08:52, Eric Luellen  wrote:
>>
>>> I think I made my situation even worse. I tried deleting the shards and 
>>> starting over and now elasticsearch isn't even creating the 
>>> /etc/elasticsearch/data/my-cluster/node folder.
>>>
>>>
>>> On Thursday, December 19, 2013 4:04:41 PM UTC-5, Eric Luellen wrote:
>>>>
>>>> Hello,
>>>>
>>>> Currently I have my syslog-ng --> logstash --> elasticsearch1 & 
>>>> elastisearch2 setup working pretty good. It's accepting over 300 events 
>>>> per 
>>>> second and hasn't bogged the systems down at all. However I'm running into 
>>>> 2 issues that I don't quite understand. 
>>>>
>>>> 1. When viewing the information in Kibana, it appears to be anywhere 
>>>> from 15 min to an hr behind on the "all events" view. Sometimes when I 
>>>> search for new logs it shows up correctly but overall it seems like it's 
>>>> lagging behind trying to keep up with what logstash is sending it. That 
>>>> being said, I'm concerned that logs are being dropped and I don't know 
>>>> about it. Are there any commands I can use to validate this type of 
>>>> information or what I can do to make sure elasticsearch/KIbana is keeping 
>>>> up?
>>>>
>>>> 2. I've had to restart elasticsearch a few times and every time I do, 
>>>> it completely breaks things. Once it starts back up it doesn't continue to 
>>>> show the logs in Kibana correctly and when I run a health check, it says 
>>>> there are unassigned shards. I've not been able to fix this and in the 
>>>> past 
>>>> I've always just had to delete them and start from scratch again.
>>>>
>>>> Any idea what is going on with this or how I can more cleanly restart 
>>>> or reboot the servers and recover from it?
>>>>
>>>> Thanks,
>>>> Eric
>>>>
>>>
>>
>>



Re: Unassigned Shards

2013-12-20 Thread Eric Luellen
Mark,

I used the rpm install. I'll take a look at the plugins. Thanks.

On Thursday, December 19, 2013 5:07:53 PM UTC-5, Mark Walkom wrote:
>
> Did you install ES via a rpm/deb or using the zip? I ask because your data 
> store directory is custom.
>
> Check out these plugins for monitoring - elastichq, kopf, bigdesk. They 
> will give you an overview of your cluster and might give you insight into 
> where your problem lies. The other best place to check is the ES logs.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 20 December 2013 08:52, Eric Luellen wrote:
>
>> I think I made my situation even worse. I tried deleting the shards and 
>> starting over and now elasticsearch isn't even creating the 
>> /etc/elasticsearch/data/my-cluster/node folder.
>>
>>
>> On Thursday, December 19, 2013 4:04:41 PM UTC-5, Eric Luellen wrote:
>>>
>>> Hello,
>>>
>>> Currently I have my syslog-ng --> logstash --> elasticsearch1 & 
>>> elastisearch2 setup working pretty good. It's accepting over 300 events per 
>>> second and hasn't bogged the systems down at all. However I'm running into 
>>> 2 issues that I don't quite understand. 
>>>
>>> 1. When viewing the information in Kibana, it appears to be anywhere 
>>> from 15 min to an hr behind on the "all events" view. Sometimes when I 
>>> search for new logs it shows up correctly but overall it seems like it's 
>>> lagging behind trying to keep up with what logstash is sending it. That 
>>> being said, I'm concerned that logs are being dropped and I don't know 
>>> about it. Are there any commands I can use to validate this type of 
>>> information or what I can do to make sure elasticsearch/KIbana is keeping 
>>> up?
>>>
>>> 2. I've had to restart elasticsearch a few times and every time I do, it 
>>> completely breaks things. Once it starts back up it doesn't continue to 
>>> show the logs in Kibana correctly and when I run a health check, it says 
>>> there are unassigned shards. I've not been able to fix this and in the past 
>>> I've always just had to delete them and start from scratch again.
>>>
>>> Any idea what is going on with this or how I can more cleanly restart or 
>>> reboot the servers and recover from it?
>>>
>>> Thanks,
>>> Eric
>>>
>>
>
>



Re: Unassigned Shards

2013-12-19 Thread Eric Luellen
I think I made my situation even worse. I tried deleting the shards and 
starting over and now elasticsearch isn't even creating the 
/etc/elasticsearch/data/my-cluster/node folder.

On Thursday, December 19, 2013 4:04:41 PM UTC-5, Eric Luellen wrote:
>
> Hello,
>
> Currently I have my syslog-ng --> logstash --> elasticsearch1 & 
> elastisearch2 setup working pretty good. It's accepting over 300 events per 
> second and hasn't bogged the systems down at all. However I'm running into 
> 2 issues that I don't quite understand. 
>
> 1. When viewing the information in Kibana, it appears to be anywhere from 
> 15 min to an hr behind on the "all events" view. Sometimes when I search 
> for new logs it shows up correctly but overall it seems like it's lagging 
> behind trying to keep up with what logstash is sending it. That being said, 
> I'm concerned that logs are being dropped and I don't know about it. Are 
> there any commands I can use to validate this type of information or what I 
> can do to make sure elasticsearch/KIbana is keeping up?
>
> 2. I've had to restart elasticsearch a few times and every time I do, it 
> completely breaks things. Once it starts back up it doesn't continue to 
> show the logs in Kibana correctly and when I run a health check, it says 
> there are unassigned shards. I've not been able to fix this and in the past 
> I've always just had to delete them and start from scratch again.
>
> Any idea what is going on with this or how I can more cleanly restart or 
> reboot the servers and recover from it?
>
> Thanks,
> Eric
>



Unassigned Shards

2013-12-19 Thread Eric Luellen
Hello,

Currently I have my syslog-ng --> logstash --> elasticsearch1 & 
elasticsearch2 setup working pretty well. It's accepting over 300 events per 
second and hasn't bogged the systems down at all. However, I'm running into 
2 issues that I don't quite understand. 

1. When viewing the information in Kibana, it appears to be anywhere from 
15 min to an hr behind on the "all events" view. Sometimes when I search 
for new logs it shows up correctly but overall it seems like it's lagging 
behind trying to keep up with what logstash is sending it. That being said, 
I'm concerned that logs are being dropped and I don't know about it. Are 
there any commands I can use to validate this type of information, or what can 
I do to make sure elasticsearch/Kibana is keeping up?

2. I've had to restart elasticsearch a few times and every time I do, it 
completely breaks things. Once it starts back up it doesn't continue to 
show the logs in Kibana correctly and when I run a health check, it says 
there are unassigned shards. I've not been able to fix this and in the past 
I've always just had to delete them and start from scratch again.

Any idea what is going on with this or how I can more cleanly restart or 
reboot the servers and recover from it?

Thanks,
Eric



Re: Help with Cluster

2013-12-17 Thread Eric Luellen
I ran that command and saw some fairly old files that were no longer there 
that it was trying to read. I believe Elasticsearch got behind on indexing 
the files and they were removed before it could finish. I'm not sure but 
that's just a guess. I have removed all of the files and started fresh. 
Currently everything is green across the board. I guess my issue now is how 
to ensure that doesn't happen again and how to make sure syslog-ng --> 
logstash --> elasticsearch doesn't drop any packets or get backed up. 
Thanks.

On Tuesday, December 17, 2013 3:38:49 PM UTC-5, David Pilato wrote:
>
> What gives the following?
>
> curl -XGET 'http://localhost:9200/_cluster/state?pretty'
>
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
On 17 December 2013 at 20:34:43, Eric Luellen 
(eric.l...@gmail.com) 
wrote:
>
> Hmmm. I'm not sure why my status is red then. The only thing I can see 
> from the cluster-health documentation page is that a specific shard is not 
allocated in the cluster. When I look at my cluster health, I do see this: 
>
>"unassigned_shards" : 60
>
Guess I need to figure out why I have so many unassigned shards. I think I 
am feeding too much data into elasticsearch at the moment. I've turned on the 
logstash server shipping to elasticsearch and I'm still getting logs coming 
in, and it's been about 10 minutes. 
>
> As far as the logstash node goes, I have this config on the elasticsearch 
> portion.
>
> output {
>   elasticsearch {
>     embedded => "false"
>     host => "192.168.0.20"
>     cluster => "my-cluster"
>   }
> }
>  
> So there is no reason it should be there. However, as you said, I'm not 
> terribly worried about that now, but I am concerned about my red status.
>
>
> On Tuesday, December 17, 2013 2:07:29 PM UTC-5, David Pilato wrote: 
>>
>>  Yes you can rename it using 
>> http://logstash.net/docs/1.3.1/outputs/elasticsearch#node_name
>>  
>>  You have a real problem here as your cluster should not be red.
>>  But it should not be caused by the logstash node.
>>  
>> Did you set embedded to false (it's default on 1.3.1 but not sure about 
>> previous version)?
>>
>>  -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com* 
>>  @dadoonet <https://twitter.com/dadoonet> | 
>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>  
>>
>> On 17 December 2013 at 19:45:18, Eric Luellen (eric.l...@gmail.com) wrote:
>>
>>  Thanks for the information. I don't mind it being there; I was just 
>> confused about why it was there. If it stays there, will my cluster status 
>> continue to show red on the health check? That was my main concern. Also, if 
>> it stays there, I wish I could rename it from the default (Lupo) to the 
>> name of the server so I can distinguish it better. 
>>
>>
>> On Tuesday, December 17, 2013 10:46:56 AM UTC-5, David Pilato wrote: 
>>>
>>>  I'd not worry of the non data node.
>>>  It's only a node which connect to the cluster to give a client to 
>>> logstash.
>>>  
>>>  If you really don't want it, then you can use 
>>> http://logstash.net/docs/1.3.1/outputs/elasticsearch_http 
>>>  
>>>  HTH
>>>
>>>  -- 
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com* 
>>>  @dadoonet <https://twitter.com/dadoonet> | 
>>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>>  
>>>
>>> Le 17 décembre 2013 at 16:32:33, Eric Luellen (eric.l...@gmail.com) a 
>>> écrit:
>>>
>>>  I am working on building out a small POC for Logstash and 
>>> Elasticsearch. To start, I have a 2 server setup. 
>>>
>>>  
>>>- Server 1 - logstash1 - running "java -jar 
>>>logstash-1.2.2-flatjar.jar agent -f indexer.conf" 
>>>- 
>>>   - This server is tailing logs from a syslog config file and then 
>>>   sending them to an ElasticSearch server. 
>>> - Server 2 - elasticsearch1 - running elasticsearch as a daemon 
>>>(CentOS box that i used an rpm instal - version - 0.90.3.) 
>>>- 
>>>   - This server is also running Kibana."java -jar 
>>>   /etc/logstash/logstash-1.2.2-flatjar.jar web" 
>>> 
>

Re: Help with Cluster

2013-12-17 Thread Eric Luellen
Thanks for the information. I don't mind it being there; I was just 
confused about why it was there. If it stays, will my cluster status 
continue to show red on the health check? That was my main concern. Also, 
if it stays, I wish I could rename it from the default "Lupo" to the name 
of the server so I can distinguish it better.
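
For what it's worth, logstash's elasticsearch output also accepts a node_name 
setting (David links the relevant doc later in the thread), so the 
auto-generated "Lupo" could be replaced. A sketch only — the host and cluster 
values are assumed from elsewhere in this thread:

```
output {
  elasticsearch {
    embedded => "false"
    host => "192.168.0.20"      # assumed from this thread
    cluster => "my-cluster"
    node_name => "logstash1"    # any fixed name; "logstash1" is an assumption
  }
}
```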


On Tuesday, December 17, 2013 10:46:56 AM UTC-5, David Pilato wrote:
>
> I'd not worry of the non data node.
> It's only a node which connect to the cluster to give a client to logstash.
>
> If you really don't want it, then you can use 
> http://logstash.net/docs/1.3.1/outputs/elasticsearch_http 
>
> HTH
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> Le 17 décembre 2013 at 16:32:33, Eric Luellen 
> (eric.l...@gmail.com) 
> a écrit:
>
> I am working on building out a small POC for Logstash and Elasticsearch. 
> To start, I have a 2 server setup. 
>
>  
>- Server 1 - logstash1 - running "java -jar logstash-1.2.2-flatjar.jar 
>agent -f indexer.conf" 
>- 
>   - This server is tailing logs from a syslog config file and then 
>   sending them to an ElasticSearch server. 
> - Server 2 - elasticsearch1 - running elasticsearch as a daemon 
>(CentOS box that i used an rpm instal - version - 0.90.3.) 
>- 
>   - This server is also running Kibana."java -jar 
>   /etc/logstash/logstash-1.2.2-flatjar.jar web" 
> 
> Overall things seem to be working pretty well. I started to do some 
> general diagnostics on the elasticsearch server to see how the cluster was 
> doing, and I saw that it was red.
>  
>  [root@elasticsearch1 elasticsearch]# curl -XGET '
>> http://localhost:9200/_cluster/health?pretty=true'
>> {
>>   "cluster_name" : "my-cluster",
>>   "status" : "red",
>>   "timed_out" : false,
>>   "number_of_nodes" : 2,
>>   "number_of_data_nodes" : 1,
>>   "active_primary_shards" : 35,
>>   "active_shards" : 35,
>>   "relocating_shards" : 0,
>>   "initializing_shards" : 0,
>>   "unassigned_shards" : 55
>
>
> When I saw that it was red and that there were 2 nodes, I was confused as 
> there should only be 1 elasticsearch node. Upon digging further, I see this:
>
>  [root@elasticsearch1 elasticsearch]# curl 
>> localhost:9200/_nodes/process?pretty
>> {
>>   "ok" : true,
>>   "cluster_name" : "my-cluster",
>>   "nodes" : {
>> "ab8COl6pTj-kJSzrXZTE2w" : {
>>   "name" : "Lupo",
>>   "transport_address" : "inet[/192.168.0.10:9300]",
>>   "hostname" : "logstash1",
>>   "version" : "0.90.3",
>>   "attributes" : {
>> "client" : "true",
>> "data" : "false"
>>   },
>>   "process" : {
>> "refresh_interval" : 1000,
>> "id" : 4380,
>> "max_file_descriptors" : 3200
>>   }
>> },
>> "FMgeliZPRdQZwy-IZ9MUIp" : {
>>   "name" : "ElasticSearch Server1",
>>   "transport_address" : "inet[/192.168.0.20:9300]",
>>   "hostname" : "elasticsearch1",
>>   "version" : "0.90.3",
>>   "http_address" : "inet[/192.168.0.20:9200]",
>>   "attributes" : {
>> "master" : "true"
>>   },
>>   "process" : {
>> "refresh_interval" : 1000,
>> "id" : 15653,
>> "max_file_descriptors" : 65535
>>   }
>> }
>>   }
>
>  
> I am confused why server1, logstash1, is showing up in the elasticsearch 
> cluster. I'm only running logstash as an indexer and not the built in 
> elasticsearch feature. How do I get this server to stop showing up in my 
> cluster? When I look on the logstash1 server, I don't see any elasticsearch 
> data or indexes like I do on my elasticsearch1 servers. So I don't think 
> data is truly going to it, but I don't want it to show up. 
>
> Thanks,
> Eric
>
>  --
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/79821bd7-3679-4fb9-b78f-8c4b292357c7%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0b9275fb-8f59-4b59-b532-a153167e8ed1%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Help with Cluster

2013-12-17 Thread Eric Luellen
Hmmm. I'm not sure why my status is red then. The only cause the 
cluster-health documentation page mentions is a specific shard not being 
allocated in the cluster, and when I look at my cluster health I do see this:

  "unassigned_shards" : 60

Guess I need to figure out why I have so many unassigned shards. I think I 
am feeding too much data into elasticsearch at the moment: I turned on the 
logstash server shipping to elasticsearch about 10 minutes ago, and logs 
are still coming in.
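
A back-of-the-envelope check suggests where those unassigned shards come 
from. Assuming the 0.90-era defaults of 5 primary shards and 1 replica per 
index (an assumption — the thread never shows the index settings), a single 
data node can never place the replicas:

```python
# Assumed defaults for 0.90-era Elasticsearch: 5 primaries and 1 replica
# per index. A replica must live on a *different* node than its primary,
# so with one data node every replica stays unassigned.
indices = 7                  # e.g. a week of daily logstash-* indices (assumed)
shards_per_index = 5
replicas_per_shard = 1

primaries = indices * shards_per_index
replica_copies = primaries * replicas_per_shard

# 35 primaries matches the "active_primary_shards" : 35 seen in this
# thread. 35 unassigned replicas alone would only make the cluster
# yellow; an unassigned count *above* that means some primaries are
# unassigned too, which is what turns the status red.
print(primaries, replica_copies)  # 35 35
```

On a one-data-node cluster, dropping number_of_replicas to 0 via the index 
settings API would clear the replica half of that count.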

As far as the logstash node goes, this is the elasticsearch portion of my 
config:

output {
  elasticsearch {
    embedded => "false"
    host => "192.168.0.20"
    cluster => "my-cluster"
  }
}

So there is no reason it should be there. However, as you said, I'm not 
terribly worried about that now, but I am concerned about my red status.


On Tuesday, December 17, 2013 2:07:29 PM UTC-5, David Pilato wrote:
>
> Yes you can rename it using 
> http://logstash.net/docs/1.3.1/outputs/elasticsearch#node_name
>
> You have a real problem here as your cluster should not be red.
> But it should not be caused by the logstash node.
>
> Did you set embedded to false (it's default on 1.3.1 but not sure about 
> previous version)?
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> Le 17 décembre 2013 at 19:45:18, Eric Luellen 
> (eric.l...@gmail.com) 
> a écrit:
>
> Thanks for the information. I don't mind it being there, I would just 
> confused of why it was there. If it stays there, will my cluster status 
> continue to show red on the health? That was my main concern. Also, if it 
> stays there, I wish I could rename it from the default Lupo it is to the 
> name of the server so I can distinguish it better. 
>
>
> On Tuesday, December 17, 2013 10:46:56 AM UTC-5, David Pilato wrote: 
>>
>>  I'd not worry of the non data node.
>>  It's only a node which connect to the cluster to give a client to 
>> logstash.
>>  
>>  If you really don't want it, then you can use 
>> http://logstash.net/docs/1.3.1/outputs/elasticsearch_http 
>>  
>>  HTH
>>
>>  -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com* 
>>  @dadoonet <https://twitter.com/dadoonet> | 
>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>  
>>
>> Le 17 décembre 2013 at 16:32:33, Eric Luellen (eric.l...@gmail.com) a 
>> écrit:
>>
>>  I am working on building out a small POC for Logstash and 
>> Elasticsearch. To start, I have a 2 server setup. 
>>
>>  
>>- Server 1 - logstash1 - running "java -jar 
>>logstash-1.2.2-flatjar.jar agent -f indexer.conf" 
>>- 
>>   - This server is tailing logs from a syslog config file and then 
>>   sending them to an ElasticSearch server. 
>> - Server 2 - elasticsearch1 - running elasticsearch as a daemon 
>>(CentOS box that i used an rpm instal - version - 0.90.3.) 
>>- 
>>   - This server is also running Kibana."java -jar 
>>   /etc/logstash/logstash-1.2.2-flatjar.jar web" 
>> 
>> Overall things seem to be working pretty well. I started to do some 
>> general diagnostics on the elasticsearch server to see how the cluster was 
>> doing, and I saw that it was red.
>>  
>>  [root@elasticsearch1 elasticsearch]# curl -XGET '
>>> http://localhost:9200/_cluster/health?pretty=true'
>>> {
>>>   "cluster_name" : "my-cluster",
>>>   "status" : "red",
>>>   "timed_out" : false,
>>>   "number_of_nodes" : 2,
>>>   "number_of_data_nodes" : 1,
>>>   "active_primary_shards" : 35,
>>>   "active_shards" : 35,
>>>   "relocating_shards" : 0,
>>>   "initializing_shards" : 0,
>>>   "unassigned_shards" : 55
>>
>>
>> When I saw that it was red and that there were 2 nodes, I was confused as 
>> there should only be 1 elasticsearch node. Upon digging further, I see this:
>>
>>  [root@elasticsearch1 elasticsearch]# curl 
>>> localhost:9200/_nodes/process?pretty
>>> {
>>>   "ok" : true,
>>>   "cluster_name" : "my-cluster",
>>>   "nodes" : {
>>> "ab8COl6pTj-kJSzrXZTE2w" : {
>>> "name" : "Lupo",

Help with Cluster

2013-12-17 Thread Eric Luellen
I am working on building out a small POC for Logstash and Elasticsearch. To 
start, I have a 2 server setup.


   - Server 1 - logstash1 - running "java -jar logstash-1.2.2-flatjar.jar 
     agent -f indexer.conf"
      - This server is tailing logs from a syslog config file and then 
        sending them to an ElasticSearch server.
   - Server 2 - elasticsearch1 - running elasticsearch as a daemon (CentOS 
     box where I used an rpm install - version 0.90.3)
      - This server is also running Kibana: "java -jar 
        /etc/logstash/logstash-1.2.2-flatjar.jar web"

Overall things seem to be working pretty well. I started to do some general 
diagnostics on the elasticsearch server to see how the cluster was doing, 
and I saw that it was red.

[root@elasticsearch1 elasticsearch]# curl -XGET 
> 'http://localhost:9200/_cluster/health?pretty=true'
> {
>   "cluster_name" : "my-cluster",
>   "status" : "red",
>   "timed_out" : false,
>   "number_of_nodes" : 2,
>   "number_of_data_nodes" : 1,
>   "active_primary_shards" : 35,
>   "active_shards" : 35,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 55
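
That health output can be read mechanically; a minimal sketch in Python, 
reusing the JSON from this thread verbatim apart from the closing brace the 
quote cuts off:

```python
import json

# Health response copied from this thread; the final "}" is restored.
health = json.loads("""{
  "cluster_name" : "my-cluster",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 35,
  "active_shards" : 35,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 55
}""")

# active_shards == active_primary_shards, so zero replicas are active.
# And since "red" (rather than "yellow") means at least one *primary*
# is unassigned, part of the 55 must be missing primaries.
active_replicas = health["active_shards"] - health["active_primary_shards"]
print(health["status"], active_replicas, health["unassigned_shards"])  # red 0 55
```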


When I saw that it was red and that there were 2 nodes, I was confused as 
there should only be 1 elasticsearch node. Upon digging further, I see this:

[root@elasticsearch1 elasticsearch]# curl 
> localhost:9200/_nodes/process?pretty
> {
>   "ok" : true,
>   "cluster_name" : "my-cluster",
>   "nodes" : {
> "ab8COl6pTj-kJSzrXZTE2w" : {
>   "name" : "Lupo",
>   "transport_address" : "inet[/192.168.0.10:9300]",
>   "hostname" : "logstash1",
>   "version" : "0.90.3",
>   "attributes" : {
> "client" : "true",
> "data" : "false"
>   },
>   "process" : {
> "refresh_interval" : 1000,
> "id" : 4380,
> "max_file_descriptors" : 3200
>   }
> },
> "FMgeliZPRdQZwy-IZ9MUIp" : {
>   "name" : "ElasticSearch Server1",
>   "transport_address" : "inet[/192.168.0.20:9300]",
>   "hostname" : "elasticsearch1",
>   "version" : "0.90.3",
>   "http_address" : "inet[/192.168.0.20:9200]",
>   "attributes" : {
> "master" : "true"
>   },
>   "process" : {
> "refresh_interval" : 1000,
> "id" : 15653,
> "max_file_descriptors" : 65535
>   }
> }
>   }


I am confused about why server1, logstash1, is showing up in the 
elasticsearch cluster. I'm only running logstash as an indexer, not its 
built-in elasticsearch feature. How do I get this server to stop showing up 
in my cluster? When I look on the logstash1 server, I don't see any 
elasticsearch data or indexes like I do on my elasticsearch1 server, so I 
don't think data is truly going to it, but I don't want it to show up. 

Thanks,
Eric
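
As David Pilato's reply earlier in the archive suggests, switching to the 
elasticsearch_http output keeps the logstash process out of the cluster 
entirely: it talks plain HTTP on port 9200 instead of joining as a client 
node. A sketch with values assumed from this thread:

```
output {
  elasticsearch_http {
    host => "192.168.0.20"   # assumed from this thread
    port => 9200
  }
}
```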

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/79821bd7-3679-4fb9-b78f-8c4b292357c7%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.