Re: How many Java clients do I need?

2014-07-31 Thread David Pilato
Yeah. Only one client instance for the JVM.
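
For illustration, a minimal sketch of sharing one 1.x TransportClient per JVM (the cluster name and address are placeholders, not from this thread):

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public final class EsClientHolder {
    private static volatile TransportClient client;

    // Lazily create a single TransportClient and reuse it everywhere in the JVM.
    public static TransportClient client() {
        if (client == null) {
            synchronized (EsClientHolder.class) {
                if (client == null) {
                    client = new TransportClient(ImmutableSettings.settingsBuilder()
                            .put("cluster.name", "my-cluster") // placeholder
                            .build())
                            .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
                }
            }
        }
        return client;
    }
}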

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 31 Jul 2014, at 06:48, Andrew Gaydenko andrew.gayde...@gmail.com wrote:

As far as I understand, a Java client instance is stateless, and its methods are 
pure functions (I mean the operating methods, rather than those related to initial 
configuration just after instantiation). As a result, it is sufficient to have 
only one client per cluster per JVM. Is that true? Or are there any 
benefits in, say, creating a separate client for every index?


Re: How many Java clients do I need?

2014-07-31 Thread Andrew Gaydenko
On Thursday, July 31, 2014 10:18:52 AM UTC+4, David Pilato wrote:

 Yeah. Only one client instance for the JVM.


Now I'm happy to be sure, thanks!



Re: How to get more than 10 terms bucket with Java API?

2014-07-31 Thread Alain Désilets
Thx. That did the trick. I now realize that I was setting the size on the 
return value of addAggregation(), when in fact, I needed to set it on the 
return value of the field() method.
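
For future readers, a corrected sketch (imports included; client and names as in the thread):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;

SearchResponse sr = esClient.prepareSearch()
        .setQuery(QueryBuilders.matchAllQuery())
        .setSize(0) // hits are not needed here, only the aggregation
        .addAggregation(AggregationBuilders.terms("category_names")
                .field("category")
                .size(20)) // size() belongs on the terms builder itself
        .execute()
        .actionGet();

Terms terms = sr.getAggregations().get("category_names");
System.out.println("Number of category buckets found: " + terms.getBuckets().size());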

On Tuesday, 29 July 2014 15:56:29 UTC-4, Adrien Grand wrote:

 Hi,

 You need to set the size on the terms aggregation:

 AggregationBuilders.terms("category_names").field("category").size(20)


 On Mon, Jul 28, 2014 at 9:22 PM, Alain Désilets alainde...@gmail.com wrote:

 I have an ES where I have indexed a bunch of files. Each file is tagged 
 with a category field. I want to write Java code to get a list of all the 
 categories. I am trying to do this using the terms aggregation:

 QueryBuilder qb = QueryBuilders.matchAllQuery();
 SearchResponse sr = esClient.prepareSearch()
     .setQuery(qb)
     .setSize(20)
     .addAggregation(AggregationBuilders.terms("category_names").field("category")).setSize(20)
     .execute()
     .actionGet();

 Terms terms = sr.getAggregations().get("category_names");
 int numCategories = terms.getBuckets().size();
 System.out.println("Number of category buckets found: " + numCategories);

 But this code always prints out that it found 10 category buckets. Yet, 
 if I execute the following query in Sense:

 GET files-index/file_with_category/_search
 {
   "query": {
     "match_all": {}
   },
   "aggs": {
     "categories": {
       "terms": {
         "field": "category",
         "size": 100
       }
     }
   }
 }

 I get 18 category buckets. What am I doing wrong in the Java code?

 Thx.

 Alain






 -- 
 Adrien Grand
  



Yet another OOME: Java heap space thread :S

2014-07-31 Thread Chris Neal
Hi everyone,

First off, apologies for the thread.  I know OOME discussions are somewhat
overdone in the group, but I need to reach out for some help for this one.

I have a 2 node development cluster in EC2 on c3.4xlarge AMIs.  That means
16 vCPUs, 30GB RAM, 1Gb network, and I have 2 500GB EBS volumes for
Elasticsearch data on each AMI.

I'm running Java 1.7.0_55, and using the G1 collector.  The Java args are:

/usr/bin/java -Xms8g -Xmx8g -Xss256k -Djava.awt.headless=true -server
-XX:+UseCompressedOops -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError
The index has 2 shards, each with 1 replica.

I have a daily index being filled with application log data.  The index, on
average, gets to be about:
486M documents
53.1GB (primary size)
106.2GB (total size)

Other than indexing, there really is nothing going on in the cluster.  No
searches, or percolators, just collecting data.

I have:

   - Tweaked the index.merge.policy
   - Tweaked the indices.fielddata.breaker.limit and cache.size
   - Changed the index refresh_interval from 1s to 60s
   - Created a default template for the index such that _all is disabled,
   and all fields in the mapping are set to not_analyzed.

Here is my complete elasticsearch.yml:

action:
  disable_delete_all_indices: true
cluster:
  name: elasticsearch-dev
discovery:
  zen:
minimum_master_nodes: 2
ping:
  multicast:
enabled: false
  unicast:
hosts: 10.0.0.45,10.0.0.41
gateway:
  recover_after_nodes: 2
index:
  merge:
policy:
  max_merge_at_once: 5
  max_merged_segment: 15gb
  number_of_replicas: 1
  number_of_shards: 2
  refresh_interval: 60s
indices:
  fielddata:
breaker:
  limit: 50%
cache:
  size: 30%
node:
  name: elasticsearch-ip-10-0-0-45
path:
  data:
  - /usr/local/ebs01/elasticsearch
  - /usr/local/ebs02/elasticsearch
threadpool:
  bulk:
queue_size: 500
size: 75
type: fixed
  get:
queue_size: 200
size: 100
type: fixed
  index:
queue_size: 1000
size: 100
type: fixed
  search:
queue_size: 200
size: 100
type: fixed

The heap sits at about 13GB used.  I had been battling OOME exceptions for
a while, and thought I had it licked, but one just popped up again.  My
cluster had been up and running fine for 14 days, and I just got this OOME:

=
[2014-07-30 11:52:28,394][INFO ][monitor.jvm  ]
[elasticsearch-ip-10-0-0-41] [gc][young][1158834][109906] duration [770ms],
collections [1]/[1s], total [770ms]/[43.2m], memory
[13.4gb]->[13.4gb]/[16gb], all_pools {[young]
[648mb]->[8mb]/[0b]}{[survivor] [0b]->[0b]/[0b]}{[old]
[12.8gb]->[13.4gb]/[16gb]}
[2014-07-30 15:03:01,070][WARN ][index.engine.internal]
[elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed engine [out of
memory]
[2014-07-30 15:03:10,324][WARN
][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the
selector loop.
java.lang.OutOfMemoryError: Java heap space
[2014-07-30 15:03:10,335][WARN
][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the
selector loop.
java.lang.OutOfMemoryError: Java heap space
[2014-07-30 15:03:10,324][WARN ][index.merge.scheduler]
[elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed to merge
java.lang.OutOfMemoryError: Java heap space
[2014-07-30 15:03:28,595][WARN ][index.translog   ]
[elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed to flush shard
on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException:
[derbysoft-20140730][0] Flush failed
at
org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
at
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:604)
at
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:202)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: this writer hit an
OutOfMemoryError; cannot commit
at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4416)
at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2989)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3096)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3063)
at
org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:797)
... 5 more
[2014-07-30 15:03:28,658][WARN ][cluster.action.shard ]
[elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] sending failed shard
for [derbysoft-20140730][0], node[W-7FsjjZTyOXZdaJhhqxEA], [R], s[STARTED],
indexUUID [QC5Sg0FDSnOGUiFg30qNxA], reason [engine failure, message [out of
memory][IllegalStateException[this writer hit an OutOfMemoryError; cannot
commit]]]
[2014-07-30 15:34:36,418][WARN

Recommendation for using stop words

2014-07-31 Thread sulemanmubarik
Hi
What are good recommendations for using stop words?
Is using the default stop words provided by Elasticsearch good,
or should I use some custom stop words too?
More than 60% of the data is from Twitter.
Thanks
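
A minimal sketch, assuming the 1.x Java admin API and an existing Client named client, of an index whose analyzer combines the built-in English stopword list with a few Twitter-flavoured extras (the index name and the extra terms are purely illustrative):

// Hedged sketch: a standard-type analyzer with a custom stopword list.
client.admin().indices().prepareCreate("tweets")
        .setSettings(ImmutableSettings.settingsBuilder()
                .put("analysis.analyzer.tweet_text.type", "standard")
                .putArray("analysis.analyzer.tweet_text.stopwords",
                        "_english_", "rt", "via") // illustrative extras
                .build())
        .execute().actionGet();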






Re: Something like Wildcard Filter?

2014-07-31 Thread zouxcs
I have the same question now. How did you solve it? Give me a hint, thank you.





Re: sum-aggregation script doesn't allow negative values?

2014-07-31 Thread Colin Goodheart-Smithe
The Elasticsearch log files can be found in the logs directory of your 
node's Elasticsearch directory.  If you re-create the error and have a look 
at the end of the log file, you should see the stacktrace.

Colin

On Wednesday, 30 July 2014 10:53:05 UTC+1, Valentin wrote:

 Hi Colin,

 I tried increasing it up to 40 but nothing changed. I would post the stack 
 trace, but I don't know how to find it.

 Thanks
 Valentin

 On Wednesday, July 30, 2014 10:24:09 AM UTC+2, Colin Goodheart-Smithe 
 wrote:

 Also, your shard_size parameter should always be greater than the size 
 parameter.  So if you are asking for a size of 10, then I would try setting 
 shard_size to 20 or 30.
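
 In the 1.x Java API that would look like this (a minimal sketch; field and aggregation names taken from the query in this thread):

 // Hedged sketch: size vs. shard_size on a terms aggregation builder.
 AggregationBuilders.terms("winners")
         .field("tit")
         .size(10)       // buckets returned to the client
         .shardSize(30); // candidate buckets collected per shard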

 On Wednesday, 30 July 2014 09:22:16 UTC+1, Colin Goodheart-Smithe wrote:

 Would you be able to re-run your query and post the stack trace from the 
 Elasticsearch server logs? This might help to work out what's going on.

 Thanks

 Colin

 On Tuesday, 29 July 2014 12:29:00 UTC+1, Valentin wrote:

 Ok. I think I found the problem. As soon as I try to sort on the script 
 value, it ceases to work.

 Works, but unsorted:
 {
   "size": 0,
   "aggs": {
     "winners": {
       "terms": {
         "field": "tit",
         "size": 10,
         "shard_size": 4
       },
       "aggs": {
         "articles_over_time": {
           "date_histogram": {
             "field": "datetime",
             "interval": "1d"
           }
         },
         "diff": {
           "sum": {
             "script": "(doc['datetime'].value < 140641200) ? -1 : 1",
             "lang": "groovy"
           }
         }
       }
     }
   }
 }

 Does not work:
 {
   "size": 0,
   "aggs": {
     "winners": {
       "terms": {
         "field": "tit",
         "size": 10,
         "order": {
           "diff": "desc"
         },
         "shard_size": 4
       },
       "aggs": {
         "articles_over_time": {
           "date_histogram": {
             "field": "datetime",
             "interval": "1d"
           }
         },
         "diff": {
           "sum": {
             "script": "(doc['datetime'].value < 140641200) ? -1 : 1",
             "lang": "groovy"
           }
         }
       }
     }
   }
 }




 On Tuesday, July 29, 2014 12:40:15 PM UTC+2, Valentin wrote:

 Hi Colin,

 I could figure out the shard_size problem thanks to your help.

 For the 'datetime' error: I checked and it exists in all the indices. 
 It has the correct mappings and therefore probably could not have wrong 
 values, I guess. And using the elasticsearch-head plugin I don't get the 
 error but a wrong result, which really seems strange.

 Thanks
 Valentin

 On Tuesday, July 29, 2014 11:54:08 AM UTC+2, Colin Goodheart-Smithe 
 wrote:

 Firstly, I think the reason you are only getting results from one 
 index when you are asking for a size of 1 in your terms aggregation is 
 that you are asking for the top 1 bucket from each shard on each index. 
 These will then be merged together and only the top bucket will be kept. 
 If the top bucket is not the same on all indices, then you will not get 
 results from all indices.  Setting the shard_size parameter to something 
 like 10 can help with this (see 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
 for more information).

 Second, I wonder if the reason you are getting the error from your 
 script is that you don't have a 'datetime' value for all of your 
 documents 
 in some of your indices?

 Regards,

 Colin

 On Monday, 28 July 2014 16:04:55 UTC+1, Valentin wrote:

 Hi Colin,

 Now it gets really strange. First, my aliases:

 curl 'http://localhost:9200/_alias?pretty'
 {
   "live-2014-07-27" : {
     "aliases" : {
       "aggtest" : { }
     }
   },
   "live-2014-07-26" : {
     "aliases" : {
       "aggtest" : { }
     }
   }
 }


 I tried two different queries:
 curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
   "size": 0,
   "aggs": {
     "winners": {
       "terms": {
         "field": "tit",
         "order": {
           "diff": "desc"
         },
         "size": 1
       },
       "aggs": {
         "articles_over_time": {
           "date_histogram": {
             "field": "datetime",
             "interval": "1d"
           }
         },
         "diff": {
           "sum": {
             "script": "(doc['datetime'].value < 140641200) ? -1 : 1",
             "lang": "groovy"
           }
         }
       }
     }
   }
 }'

 and

 curl -XPOST 'http://localhost:9200/live-2014-07-26,live-2014-07-27/video/_search?pretty=true' ...

 Both give me a result (but a wrong one) when I query using 
 elasticsearch-head, but they result in an error if I use the command line:

 {
   "error" : "SearchPhaseExecutionException[Failed to execute phase 
 [query], all shards failed; shardFailures 
 {[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-26][1]: 
 QueryPhaseExecutionException[[live-2014-07-26][1]: 
 query[ConstantScore(cache(_type:video))], from[0], size[0]: Query Failed 
 [Failed to execute main query]];

EsRejectedExecutionException

2014-07-31 Thread Anand kumar
Hi All,

In my cluster, I have around 500 indices. When I try to start the 
Elasticsearch instance, it shows the following exception.
Why is this happening, and what should be done to resolve it?

Thanks,
Anand


[2014-07-31 11:50:01,551][DEBUG][action.search.type   ] [ESCS_NODE] 
[components][3], node[VQMkI5CBRB2hDEnZ5yVLcw], [P], s[STARTED]: Failed to 
execute [org.elasticsearch.action.search.SearchRequest@c34bbb8] lastShard 
[true]
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: 
rejected execution (queue capacity 1000) on 
org.elasticsearch.search.action.SearchServiceTransportAction$23@60cfc409
at 
org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.execute(SearchServiceTransportAction.java:509)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:203)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:171)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:153)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
at 
org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
at 
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at 
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:101)
at 
org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
at 
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
at 
org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
at 
org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:75)
at 
org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
at 
org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
at 
org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
at 
org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
at 
org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:294)
at 
org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:44)
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at 
org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at 
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at 
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at 
org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at 
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at 
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at 
org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
at 
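
A hedged sketch for later readers: the trace above shows the search thread pool's queue (capacity 1000) rejecting a request. Assuming a 1.x cluster, where thread pool settings are dynamically updatable, and an existing Client named client, the queue size can be raised like this (whether that is the right fix, rather than reducing concurrent searches, depends on the cluster):

// Hedged sketch, 1.x dynamic cluster settings API; the new size is illustrative.
client.admin().cluster().prepareUpdateSettings()
        .setTransientSettings(ImmutableSettings.settingsBuilder()
                .put("threadpool.search.queue_size", 2000)
                .build())
        .execute().actionGet();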

Re: Geo distance filter exceptions

2014-07-31 Thread Joffrey Hercule
city.location was an example ;)
You should use yours.

On Wednesday, 30 July 2014 18:13:53 UTC+2, Madhavan Ramachandran wrote:

 Nope, it did not work. I got an exception: 
 QueryParsingException[[offlocations_geo] failed to find geo_point field 
 [city.location

 Regards
 Madhavan.TR
 On Wednesday, July 30, 2014 10:08:45 AM UTC-5, Joffrey Hercule wrote:

 Hi!
 Use a filtered query, for example:

 {
   "query" : {
     "filtered" : {
       "query" : {
         "match_all" : {}
       },
       "filter" : {
         "geo_distance" : {
           "distance" : "50km",
           "city.location" : {
             "lat" : 43.4,
             "lon" : 5.4
           }
         }
       }
     }
   }
 }

 On Tuesday, 29 July 2014 22:30:12 UTC+2, Madhavan Ramachandran wrote:

 Hi Team,

 I am trying to find a solution for the below:

 1. Geo-boundary-based search. My index has properties for lat and 
 lon as double, not as a geo_point. Here is the mapping for my index.

 How do I use the lon and lat from the below mapping for a geo distance 
 filter / geo distance range filter?

 Name        Type
 data        string
 getpath     string
 id          double
 Region      string
 Submarket   string
 addr1       string
 addr2       string
 city        string
 citymarket  string
 country     string
 countryid   long
 cultureid   long
 data        string
 details     string
 fax         string
 id          string
 language    string
 lat         double
 lon         double

 When I search for the documents, I get the below exception.

 Query:
 {
   "filter": {
     "geo_distance" : {
       "distance" : "300km",
       "location" : {
         "lat" : 45,
         "lon" : -122
       }
     }
   }
 }

 Exception:
 "error": "SearchPhaseExecutionException[Failed to execute phase [query], 
 all shards failed; shardFailures 
 {[o3f66HetT3OSpVw895w0nA][offlocations][4]: 
 SearchParseException[[offlocations][4]: from[-1],size[-1]: Parse Failure 
 [Failed to parse source [{\n\"filter\": {\n \"geo_distance\" : {\n 
 \"distance\" : \"300km\",\n \"location\" : {\n \"lat\" : 45,\n \"lon\" : 
 -122\n }\n } \n }\n}]]]; nested:

 I tried removing the location key, which I don't have in my mapping:

 {"filter": { "geo_distance" : { "distance" : "300km", "lat" : 45, "lon" : -122 } }}

 I got an exception saying lon is not a geo_point field:

 ElasticsearchIllegalArgumentException[the character '-' is not a valid 
 geohash character]; }]

 If I remove the - in front of the lon value, then the exception says:

 QueryParsingException[[offlocations] field [lon] is not a geo_point 
 field];
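
For reference, a minimal sketch (1.x Java API; an existing Client named client is assumed, and the type name "office" and field name "location" are illustrative) of adding the geo_point field that the geo_distance filter needs:

// Hedged sketch: geo_distance requires a field mapped as geo_point.
String mapping = "{\"office\":{\"properties\":{"
        + "\"location\":{\"type\":\"geo_point\"}}}}";
client.admin().indices().preparePutMapping("offlocations_geo")
        .setType("office")
        .setSource(mapping)
        .execute().actionGet();

The existing lat/lon doubles would then need to be reindexed into that field.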






Re: Yet another OOME: Java heap space thread :S

2014-07-31 Thread David Pilato
Why do you start with 8gb HEAP? Can't you give 16gb or so?

/usr/bin/java -Xms8g -Xmx8g 



--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 30 Jul 2014, at 19:47, Chris Neal chris.n...@derbysoft.net wrote:

[...]

Re: ORA-01882: timezone region not found

2014-07-31 Thread George DRAGU
We have multiple Oracle installations, but we tested with 9i and 10g. My OS 
is OS X Mavericks, and the JDK is the latest one (1.7). The only JDBC driver 
which works with JDK 1.7 is for Oracle 12c Release 1 
(http://www.oracle.com/technetwork/database/features/jdbc/jdbc-drivers-12c-download-1958347.html).
As I noted, with a 10g server there is no problem, but with 9i I get the 
error "timezone region not found". As a general rule, in a production 
environment it is recommended to use the latest JDK (or JRE), so version 7 
(http://www.oracle.com/technetwork/java/javasebusiness/downloads/java-archive-downloads-javase14-419411.html).
Here we can find a warning about older versions:

WARNING: These older versions of the JRE and JDK are provided to help 
developers debug issues in older systems. They are not updated with the 
latest security patches and are not recommended for use in production.

But what if our customers still use Oracle 9i? Are we forced to use a JDK 
version 4 (1.4)?

On Wednesday, 30 July 2014, 21:31:48 UTC+3, Jörg Prante wrote:

 Oops, I mean of course, this is *not* ES related ...

 Jörg


 On Wed, Jul 30, 2014 at 7:59 PM, joerg...@gmail.com wrote:

 This is ES related, but, what Oracle JDBC version is this and what Oracle 
 Database Server version?

 Jörg


 On Wed, Jul 30, 2014 at 3:59 PM, George DRAGU george@gmail.com wrote:

 Hello,

 Is there any possibility to specify a parameter value on the java command line 
 behind the JDBC River?
 I'm thinking of -Duser.timezone=Europe/Istanbul, for example.
 When I try to create a JDBC River for an Oracle database (with the jprante 
 plugin) I get this error.

 Thanks







Constant Score Query & Bool Query

2014-07-31 Thread Shawn Ritchie
Hi Guys,

Quick question regarding scoring when ConstantScoreQuery & Bool Query are 
used in conjunction.

So here is the query

{
  "from": 0,
  "size": 10,
  "explain": false,
  "sort": ["_score"],
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [{
            "constant_score": {
              "query": {
                "match": {
                  "_all": {
                    "query": "test"
                  }
                }
              },
              "boost": 1.0
            }
          }, {
            "constant_score": {
              "query": {
                "match": {
                  "_all": {
                    "query": "check"
                  }
                }
              },
              "boost": 1.0
            }
          }],
          "disable_coord": 1
        }
      },
      "filter": [{
        "or": [{
          "query": {
            "match": {
              "_all": {
                "query": "test"
              }
            }
          }
        }, {
          "query": {
            "match": {
              "_all": {
                "query": "check"
              }
            }
          }
        }]
      }]
    }
  }
}


Shouldn't the above query return a score of either 1 or 2? Why is it 
returning a score of, say, 1.4 if I am wrapping the sub-queries with a 
constant_score query?

Regards
Shawn



Re: ORA-01882: timezone region not found

2014-07-31 Thread joergpra...@gmail.com
With an Oracle JDBC client jar of 11gR2 or later, you will receive timezone
issues when connecting to Oracle DB 9i. Check Oracle Metalink ID 1068063.1

I feel a bit odd giving Oracle support on the ES community mailing list. Sorry
for the noise. You could also open an issue at
https://github.com/jprante/elasticsearch-river-jdbc/issues

Jörg


On Thu, Jul 31, 2014 at 10:22 AM, George DRAGU george.gd.dr...@gmail.com
wrote:

 [...]




knapsack use case

2014-07-31 Thread Matteo Moci
Hi All,
I have some questions about the knapsack plugin [1].

My idea is to use the tool to do a backup to a file, starting from a 0.90.x
instance, and then restore it on a different 1.2.x or 1.3.x instance. I see
it can't be done directly by copying to a local/remote cluster.

Would it work doing an intermediate step with a file?
Or does the backup still have metadata about the ES version it was generated
from, making it impossible?

Is the snapshot and restore feature [2] useful in my use case, or not?

Is the knapsack plugin able to back up and restore aliases and mappings too,
or do I have to migrate them manually before restoring data?

Thanks for the patience and the great work!
Matteo

[1] https://github.com/jprante/elasticsearch-knapsack
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html

-- 
Matteo Moci
http://mox.fm



ElasticSearch memory usage on centralized log clusters

2014-07-31 Thread Tim Stoop
Hi all,

We've been running an ElasticSearch cluster of three nodes since last 
December. We're running them on Debian Wheezy. Due to the size of our 
network, we're getting about 600 messages/s (800 at peak times). Using 
logstash and daily indices, we're currently at about 3TB of data spread 
over 158 indices. We use 2 replicas, so all machines have all data.

I'm struggling to understand what expected working limits are for 
ElasticSearch in a centralized log situation and would really appreciate 
input from others. What we eventually try to do is provide about 13 months 
of searchable logs via Kibana, but we're mostly running into RAM 
constraints. Each node has 3TB of data, but the heap usage is almost 
constantly up at 27-29GB.

We did have a problem with garbage collections earlier, which took so long 
that a node would drop from the cluster. To fix this, we switched to the G1 
GC, which seems very well suited for this workload. The machines we're 
using have 12 cores, so the added CPU overhead is negligible (general usage 
of the cores is less than 30%). But I'm not enough of a Java dev to judge 
whether this switch could be the cause of the constant high heap usage. We're 
currently at about 1GB RAM per 100GB of disk usage.

My questions:

1- Is 1GB RAM usage per 100GB disk usage an expected usage pattern for an 
index heavy cluster?
2- Aside from closing indices, are there other ways of lowering this?
2.5- Should I worry about it?
3- Are we approaching this the wrong way and should we change our setup?
4- Would upgrading to 1.3.1 change the usage significantly, due to the fix 
in issue 6856 or is it unrelated?

The numbers:

3 nodes with the following config:
- 92GB of RAM
- 12 cores with HT
- 6x 3.6TB spinning disks (we're contemplating adding CacheCade, since the 
controller supports it)
  - We expose the disks to elasticsearch via 6 mount points
- Debian Wheezy
- Java 1.7.0_65 OpenJDK JRE with IcedTea 2.5.1 (Debian package with version 
7u65-2.5.1-2~deb7u1)
- ElasticSearch 1.2.1 from the ElasticSearch repositories

Config snippets (leaving out ips and the like):

bootstrap:
  mlockall: true
cluster:
  name: logging
  routing:
    allocation:
      node_concurrent_recoveries: 4
discovery:
  zen:
    minimum_master_nodes: 2
    ping:
      unicast:
        hosts:
        [snip]
index:
  number_of_replicas: 2
  number_of_shards: 6
indices:
  memory:
    index_buffer_size: 50%
  recovery:
    concurrent_streams: 5
    max_bytes_per_sec: 100mb
node:
  concurrent_recoveries: 6
  name: stor1-stor
path:
  data:
    - /srv/elasticsearch/a
    - /srv/elasticsearch/b
    - /srv/elasticsearch/c
    - /srv/elasticsearch/d
    - /srv/elasticsearch/f
    - /srv/elasticsearch/e

Java options: -Xms30g -Xmx30g -Xss256k -Djava.awt.headless=true 
-XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError

Thanks for your help! Please let me know if you need more information

--
Kind regards,
Tim Stoop



Re: ORA-01882: timezone region not found

2014-07-31 Thread George DRAGU
Thanks for your answer, and excuse me for bothering you with problems 
unrelated to ES.

On Thursday, 31 July 2014, 11:46:51 UTC+3, Jörg Prante wrote:

 With an Oracle JDBC client jar of 11gR2 or later, you will receive 
 timezone issues when connecting to Oracle DB 9i. Check Oracle Metalink ID 
 1068063.1

 I feel a bit odd to give Oracle support on ES community mailing list. 
 Sorry for the noise. You could also open an issue at 
 https://github.com/jprante/elasticsearch-river-jdbc/issues 

 Jörg





Re: Integration testing a native script

2014-07-31 Thread Nick T
Thanks for the help.

My working solution is:

1. Create the plugins/scripts directory in /tmp for each test node
2. Copy my script jar into it
3. Register the script in the settings:
   .put("script.native.NAME.type", "NAME OF SCRIPT")
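
Put together, a minimal sketch of such a test node (1.x API; the path and the script name/factory class are placeholders, not the thread's actual values):

import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.node.Node;
import org.elasticsearch.node.NodeBuilder;

// Hedged sketch: an in-JVM test node with a native script registered via settings.
Settings settings = ImmutableSettings.settingsBuilder()
        .put("path.data", "/tmp/es-test-node/data")            // placeholder path
        .put("script.native.myscript.type",
                "com.example.MyScriptFactory")                 // placeholder factory
        .build();
Node node = NodeBuilder.nodeBuilder().settings(settings).local(true).node();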

I am super grateful for all your help. Btw, I wasn't creating it as a 
plugin.


On Wednesday, 30 July 2014 11:33:45 UTC+1, Thomas wrote:

 I noticed that you mention a native java script, so have you implemented 
 it as a plugin?
 If so, try the following in your settings:

 final Settings settings = settingsBuilder()
     ...
     .put("plugin.types", YourPlugin.class.getName())

 Thomas


 On Wednesday, 30 July 2014 12:31:06 UTC+3, Nick T wrote:

 Is there a way to have a native java script accessible in integration 
 tests? In my integration tests I am creating a test node in the /tmp 
 folder. 

 I've tried copying the script to /tmp/plugins/scripts but that was quite 
 hopeful and unfortunately does not work.

 Desperate for help.

 Thanks



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0d589406-5c0c-450b-b9b6-901d8278c4dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Understanding tokenization for auto-complete

2014-07-31 Thread Petar Djekic
I'm using ES's auto-completion and I'd like to understand how the prefix 
tokenization works. Example queries and the results they currently return:

1) 'blackb' -> 'Blackberry Q10 Red'
-- Expected

2) 'q' -> 'Blackberry Q10 Red'
-- Expected

3) 'q10' -> No result, expected is 'Blackberry Q10 Red'
-- Why are results returned when typing in 'q' but not 'q10'?

4) 'blackberry q10'
-- Expected

5) 'sam' -> 'Samsung Galaxy S5'
-- Expected

6) 'galax' -> 'Samsung Galaxy S5'
-- Expected

7) 'S5' -> No result, expected is 'Samsung Galaxy S5'

I'm indexing the documents using input: ["blackberry", "Q10 Red"] and input: 
["samsung", "galaxy s5"]; please find the mapping and query below. I thought 
the standard tokenizer would also tokenize on whitespace and hence give a 
result for 'S5', and I also don't understand why 'q' gives results but 'q10' 
doesn't. Can I use the prefix tokenizer for such a use case, or would I 
need to switch to ngrams completely?

My mapping looks as follows:

 "mappings" : {
   "suggestions" : {
     "_timestamp": {
       "enabled": true,
       "path" : "lastTimestamp"
     },
     "properties" : {
       "suggest" : {
         "type" : "completion",
         "index_analyzer" : "standard",
         "search_analyzer" : "simple",
         "payloads" : true,
         "context" : {
           "type" : {
             "type" : "category",
             "path" : "entity"

and query like this:

{
  "suggestions" : {
    "text" : "query",
    "completion" : {
      "size" : 5,
      "field" : "suggest",
      "fuzzy" : {
        "fuzziness" : 1
      },
      "context" : {
        "type" : "internalcategory"
      }
    }
  }
}
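
A hedged sketch of one workaround, assuming the 1.x Java API, an existing Client named client, and the mapping above: since the completion suggester only matches prefixes of the inputs you supply, providing each token ('q10', 's5', ...) as its own input makes mid-title prefixes searchable (the index name and the "entity" value are illustrative):

import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

// Hedged sketch: one explicit input per token, plus the full title.
XContentBuilder doc = XContentFactory.jsonBuilder()
        .startObject()
            .field("entity", "internalcategory") // context path from the mapping
            .startObject("suggest")
                .array("input", "blackberry", "q10", "red", "Blackberry Q10 Red")
                .field("output", "Blackberry Q10 Red")
            .endObject()
        .endObject();
client.prepareIndex("suggestions_index", "suggestions").setSource(doc).execute().actionGet();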







Data too large error

2014-07-31 Thread Rhys Campbell
I occasionally get the following error in Kibana from elasticsearch

Oops! ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreakingException: 
Data too large, data for field [@timestamp] would be larger than limit 
of [639015321/609.4mb]]

I even get this when searching for 5 min of data through the Kibana 
logstash search. It will disappear and reappear for no reason apparent to me. 
My total data size is not that big yet.

Elasticsearch has been left on the default of splitting the indices by 
day. The current size is 168M for today.

Any hints?

Cheers,

Rhys
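
A hedged sketch for anyone hitting this: the message comes from the fielddata circuit breaker, and in 1.x its limit can be raised dynamically (assuming an existing Client named client; the value is illustrative, and reducing fielddata pressure, e.g. with doc_values, is the more durable fix):

// Hedged sketch, 1.x dynamic cluster settings API.
client.admin().cluster().prepareUpdateSettings()
        .setTransientSettings(ImmutableSettings.settingsBuilder()
                .put("indices.fielddata.breaker.limit", "75%") // illustrative
                .build())
        .execute().actionGet();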



Re: knapsack use case

2014-07-31 Thread joergpra...@gmail.com
Snapshot/restore is always recommended, but it is a 1.0+ feature. It is a
standard API of ES and well supported by the ES team. With it, you can
handle all kinds of index data safely on a binary level, fully and
incrementally.
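
For reference, a minimal sketch of that API from Java (1.x; an existing Client named client is assumed, and the repository name, filesystem path and snapshot name are placeholders):

// Hedged sketch: register an fs repository, then snapshot into it.
client.admin().cluster().preparePutRepository("backup_repo")
        .setType("fs")
        .setSettings(ImmutableSettings.settingsBuilder()
                .put("location", "/mnt/es_backups") // placeholder path
                .build())
        .execute().actionGet();

client.admin().cluster().prepareCreateSnapshot("backup_repo", "snapshot_1")
        .setWaitForCompletion(true)
        .execute().actionGet();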

Knapsack plugin is for document export/import only. I wrote it to transport
_source data, harvested over a long time period, from a < 1.0 system to a
production system. It works on _source or stored fields only. It uses the
search/query and bulk indexing APIs without snapshots, so it is up to the
admin to stop index writes while knapsack runs. There is also a lookup of
index settings and mappings; this information is included in the
export archive file and re-applied at import time. But there is no check
that these settings/mappings can be applied successfully on the target; it
is left to the admin to prepare plugins, analyzers, etc. Aliases are not
transported, but that is a good idea for an improvement.

Currently, knapsack plugin does not work on ES 1.3 but I am progressing to
implement this. I am adding a Java-level API. Currently it is REST only.

Jörg


On Thu, Jul 31, 2014 at 11:05 AM, Matteo Moci mox...@gmail.com wrote:

 [...]



Nested documents returned in search results by error

2014-07-31 Thread Jennifer Cumming
I'm trying to use a nested object type to store some additional data, but 
I'm running into some inconsistencies versus the documentation when it comes 
to searching. From the documentation, it sounds like a simple search like

{
  "query": {
    "bool": {
      "must" : {
        "query_string": {
          "query": "jane"
        }
      }
    }
  }
}

should not return any documents where the only instance of "jane" is in a 
nested document. Yet when I run this search against my test data, it does. 
The mystery deepens when I use the explain API to work out why it was 
returned: it says the document was matched on _all, yet I have 
include_in_all set to false in the mapping.

Mapping and data for the two documents: https://gist.github.com/anonymous/febab9c09bdf9ea9849c
Search results: https://gist.github.com/anonymous/74311fab5e2452938505
Explain: https://gist.github.com/anonymous/4fed3653658d70fb0df8
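
For comparison, a hedged sketch of how a nested field is normally queried (1.x Java API; the path "contacts" and index name are illustrative, not taken from the gists):

import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

// Hedged sketch: nested fields are matched through a nested query on their path.
QueryBuilder q = QueryBuilders.nestedQuery("contacts",
        QueryBuilders.queryString("jane"));
SearchResponse resp = client.prepareSearch("test_index") // placeholder index
        .setQuery(q).execute().actionGet();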



Re: ElasticSearch memory usage on centralized log clusters

2014-07-31 Thread Mark Walkom
A lot of the answers for performance and capacity are "it depends".
You'll get *much* better performance from Oracle Java; 1.7u55 is the current
recommendation. Given you're experimenting with the G1 GC (which isn't current
best practice, hence "experimenting"), you might even want to try Java 1.8.

If you want to drop memory use you can disable bloom filtering.
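
A hedged sketch of that, assuming the 1.x setting index.codec.bloom.load, an existing Client named client, and an illustrative index pattern:

// Hedged sketch: stop loading bloom filters on existing 1.x indices.
client.admin().indices().prepareUpdateSettings("logstash-*") // illustrative pattern
        .setSettings(ImmutableSettings.settingsBuilder()
                .put("index.codec.bloom.load", false)
                .build())
        .execute().actionGet();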

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 31 July 2014 19:16, Tim Stoop tim.st...@gmail.com wrote:

 [...]




3rd party scoring service

2014-07-31 Thread Alex S.V.
Hello,

My idea is to use a 3rd-party scoring service (REST); currently I'd like 
to use native scripts and play with NativeScriptFactory.
The approach has many drawbacks. 

Here is my problem: assume we have two entities, products and product 
prices. I should filter by price. 
Price is a complex thing, because it depends on many factors, like the request 
date, remote user information, and custom provided parameters. In the case of 
a regular parent-child relation and a has_child query, it's too complex and 
too slow to implement using scripting (currently mvel).

One more condition: I don't have many products, around 25K, but around 
25M different base price items (which are the basis for future price 
calculation).
I have these ideas:
1. Have a service which returns the exact price for all products given custom 
parameters. The drawback is that there would be 5 identical calls, one from each 
shard (if 5 by default). In this case it doesn't matter where base prices 
are stored: in an elasticsearch index, in a database, or in in-memory storage. 
2. Write code which operates over child price documents on a concrete 
shard. In this case it will generate prices only for the products from 
that particular shard. But I don't know if I can access the shard index, or make 
calls to the index from a concrete shard, in a NativeScriptFactory class. 

Could you point me the right way?

P.S. Initially I was interested in Redis-Elasticsearch 
example http://java.dzone.com/articles/connecting-redis-elasticsearch

Thanks,
Alex
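
A minimal sketch of idea 1 as a native script, assuming the prices are fetched from the REST service once per request (outside the script) and passed in as script params. The class, parameter, and field names below (ExternalPriceScriptFactory, "prices", "product_id") are hypothetical, not from this thread:

import java.util.Map;

import org.elasticsearch.common.Nullable;
import org.elasticsearch.script.AbstractFloatSearchScript;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;

// Registered from a plugin, e.g. module.registerScript("external_price", ExternalPriceScriptFactory.class)
public class ExternalPriceScriptFactory implements NativeScriptFactory {

    @Override
    public ExecutableScript newScript(@Nullable Map<String, Object> params) {
        // "prices" is a hypothetical script param: product id -> price,
        // fetched once from the external service before the query is built.
        @SuppressWarnings("unchecked")
        Map<String, Number> prices = (Map<String, Number>) params.get("prices");
        return new ExternalPriceScript(prices);
    }

    static class ExternalPriceScript extends AbstractFloatSearchScript {
        private final Map<String, Number> prices;

        ExternalPriceScript(Map<String, Number> prices) {
            this.prices = prices;
        }

        @Override
        public float runAsFloat() {
            // Look up the pre-fetched price for the current document.
            String productId = docFieldStrings("product_id").getValue();
            Number price = prices.get(productId);
            return price == null ? 0f : price.floatValue();
        }
    }
}

This keeps the external call at one per search request rather than one per document, though the per-shard duplication described above still applies to anything executed shard-side.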



Re: 3rd party scoring service

2014-07-31 Thread Itamar Syn-Hershko
You should bring the price over to Elasticsearch and not the other way
around. Scoring against an external service is an added friction with huge
performance costs.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/




Re: 3rd party scoring service

2014-07-31 Thread Alex S.V.
I think it's acceptable if the service responds within 20ms, using some Thrift 
protocol for example. That's much better than the current 500ms-5s calculations 
using Elasticsearch scripting. 
If we have 25K products, then it could be around a 300KB data package from 
this service. The risk is possible broken communication or increased latency.

Alex



Re: ElasticSearch memory usage on centralized log clusters

2014-07-31 Thread Tim Stoop
Hi Mark,

Thanks for your reply!

On Thursday, 31 July 2014 12:16:36 UTC+2, Mark Walkom wrote:

 A lot of the answers for performance and capacity are "it depends".


Well, I'm currently not that worried about performance, as long as it can 
keep up with the amount of data we throw at it. It's currently 800 
messages/s at peak times, but we expect to grow to about 2000. Beyond that, 
as long as searches do not time out, I'd be happy. I'm much more worried 
about stability at the moment, hence my question about whether I should be 
worried about the memory usage I'm seeing.

 You'll get *much* better performance from Oracle Java; 1.7u55 is currently 
 recommended. Given you're experimenting with G1 (which isn't current 
 best practice, hence "experimenting"), you might even want to try Java 1.8.


Ok, I will try the Oracle JRE. Regarding G1 being experimental, I assume you 
mean from ES's point of view, right? From what I read, it's fully supported in 
Java 7. I didn't find any other way to solve the 'stop the world' GC pauses the 
CMS ran into every few hours :S I'm not a Java dev, however; I just wanted 
something that didn't crash once a day.

 If you want to drop memory use, you can disable bloom filtering.

Done, and that did indeed help a little.

-- 
Kind regards,
Tim Stoop



Elastic Search losing data not in _source on _update

2014-07-31 Thread Jordan Reiter
Hi, 

Every time I run a POST request using _update, I notice that any indexed 
information I didn't put in _source appears to go missing.

Obviously, it would be ideal if I didn't have to store, for example, the 
contents of a several-megabyte file in _source in order to keep it in my 
record after calling the _update method on my index/mapping. 

To start, here is the version info for Elasticsearch:

{
  "status" : 200,
  "name" : "Feron",
  "version" : {
    "number" : "1.3.1",
    "build_hash" : "2de6dc5268c32fb49b205233c138d93aaf772015",
    "build_timestamp" : "2014-07-28T14:45:15Z",
    "build_snapshot" : false,
    "lucene_version" : "4.9"
  },
  "tagline" : "You Know, for Search"
}


Here's my cluster health: 


{
  "cluster_name" : "my-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}


A script for recreating the issue is attached. In it, I create a mapping and 
save a record using the attachment plugin. The records correctly match searches 
on a field in _source, a field excluded from _source, and within the content 
(attachment) field (also excluded from source).


As soon as I make the POST request to …/_update, searches against fields 
excluded from _source return 0 hits.


Is the only solution to this to store all fields in _source if I plan on 
calling _update on the record?

indexserver=example.org:9200
indexname=sample
curl -XDELETE "http://$indexserver/$indexname/losing_data/_mapping?pretty=1"
curl -XPUT "http://$indexserver/$indexname/losing_data/_mapping?pretty=1" -d '
{
  "losing_data" : {
    "_source" : {
      "enabled" : true,
      "excludes" : [ "content", "not_sourced" ]
    },
    "properties" : {
      "record_counts" : {
        "type" : "nested",
        "include_in_parent" : true,
        "properties" : {
          "first_count" : {
            "type" : "long"
          },
          "second_count" : {
            "type" : "long"
          }
        }
      },
      "description" : {
        "type" : "string"
      },
      "not_sourced" : {
        "type" : "string"
      },
      "blarf" : {
        "type" : "string"
      },
      "content" : {
        "type" : "attachment"
      }
    }
  }
}
'

file_path='test-1.rtf' # test-1.rtf is an RTF file containing the phrase "This contains red"
file_content=`cat $file_path | perl -MMIME::Base64 -ne 'print encode_base64($_)'`
json='
{
   "content" : "'${file_content}'",
   "not_sourced" : "Giraffe",
   "description" : "This is not about bears",
   "record_counts" : {
     "first_count" : 1,
     "second_count" : 100
   }
}
'
echo $json > json.file
curl -XPUT "http://$indexserver/$indexname/losing_data/1" -d @json.file

curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1"
# should return one hit (from the file)
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1"
# should return one hit (from the not_sourced field)
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1"
# should return one hit (from description)

curl "http://$indexserver/$indexname/losing_data/1?pretty=1"
# record_counts > first_count should be 1
curl -XPOST "http://$indexserver/$indexname/losing_data/1/_update" -d '{
  "script" : "ctx._source.record_counts.first_count += 1",
  "lang" : "groovy"
}'
curl "http://$indexserver/$indexname/losing_data/1/?pretty=1"
# record_counts > first_count should be 2

curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1"
# I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1"
# I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1"
# description is in source, so I still get a hit

Re: ElasticSearch memory usage on centralized log clusters

2014-07-31 Thread Mark Walkom
G1 is experimental in the sense that it's not recommended by the ES team, as
you guessed, even if it is supported within Java.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com




Re: Recommendations needed for large ELK system design

2014-07-31 Thread Alex
Hello Mark,

Thank you for your reply, it certainly helps to clarify many things.

Of course I have some new questions for you!

   1. I haven't looked into it much yet, but I'm guessing Curator can handle 
   different index naming schemes, e.g. logs-2014.06.30 and 
   stats-2014.06.30. We'd actually want to store the stats data for 2 
   years and the logs for 90 days, so it would indeed be helpful to split the 
   data into different index sets. Do you use Curator?
   
   2. You say that you have 3 masters that also handle queries... but I 
   thought all masters did was handle queries? What is a master node that 
   *doesn't* handle queries? Should we have search load balancer nodes, i.e. 
   nodes that are neither master nor data nodes?
   
   3. In the interest of reducing the number of node combinations for us to 
   test, would you say that 3 master (and query(??)) only nodes and the 
   6 1TB data-only nodes would be good?
   
   4. Quorum and split brain are new to me. This page about split brain 
   (http://blog.trifork.com/2013/10/24/how-to-avoid-the-split-brain-problem-in-elasticsearch/) 
   recommends setting *discovery.zen.minimum_master_nodes* equal to *N/2 + 1*. 
   That formula is similar to the one given in the documentation for quorum 
   (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency): 
   index operations only succeed if a quorum (replicas/2 + 1) of active shards 
   is available. I completely understand the split brain issue, but not 
   quorum. Is quorum handled automatically, or should I change some settings? 
   (See the sketch below.)
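
For what it's worth, a minimal sketch of the write-side quorum with the Java API (not from the thread; the index and field names are made up). QUORUM is the default consistency level for index operations; it is spelled out here only for clarity:

import org.elasticsearch.action.WriteConsistencyLevel;
import org.elasticsearch.client.Client;

public class QuorumWriteExample {

    // The write succeeds only if a quorum (replicas/2 + 1) of the
    // shard's copies is active; otherwise it fails after a timeout.
    public static void indexWithQuorum(Client client) {
        client.prepareIndex("logs-2014.06.30", "logs")
              .setSource("message", "hello quorum")
              .setConsistencyLevel(WriteConsistencyLevel.QUORUM)
              .execute().actionGet();
    }
}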

Thanks again for your help, we appreciate your time and knowledge!
Regards,
Alex

On Thursday, 31 July 2014 05:57:35 UTC+1, Mark Walkom wrote:

 1 - Looks ok, but why two replicas? You're chewing up disk for what 
 reason? Extra comments below.
 2 - It's personal preference really and depends on how your end points 
 send to redis.
 3 - 4GB for redis will cache quite a lot of data if you're only doing 50 
 events p/s (ie hours or even days based on what I've seen).
 4 - No, spread it out to all the nodes. More on that below though.
 5 - No it will handle that itself. Again, more on that below though.

 Suggestions;
 Set your indexes to (factors of) 6 shards, ie one per node, it spreads 
 query performance. I say factors of in that you can set it to 12 shards 
 per index to start and easily scale the node count and still spread the 
 load.
 Split your stats and your log data into different indexes, it'll make 
 management and retention easier.
 You can consider a master only node or (ideally) three that also handle 
 queries.
 Preferably have an uneven number of master eligible nodes, whether you 
 make them VMs or physicals, that way you can ensure quorum is reached with 
 minimal fuss and stop split brain.
 If you use VMs for master + query nodes then you might want to look at 
 load balancing the queries via an external service.

 To give you an idea, we have a 27 node cluster - 3 masters that also 
 handle queries and 24 data nodes. Masters are 8GB with small disks, data 
 nodes are 60GB (30 heap) and 512GB disk.
 We're running with one replica and have 11TB of logging data. At a high 
 level we're running out of disk more than heap or CPU and we're very write 
 heavy, with an average of 1K events p/s and comparatively minimal reads.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 31 July 2014 01:35, Alex alex@gmail.com wrote:

 Hello,

 We wish to set up an entire ELK system with the following features:

- Input from Logstash shippers located on 400 Linux VMs. Only a 
handful of log sources on each VM. 
- Data retention for 30 days, which is roughly 2TB of data in indexed 
ES JSON form (not including replica shards)
- Estimated input data rate of 50 messages per second at peak hours. 
Mostly short or medium length one-line messages but there will be Java 
traces and very large service responses (in the form of XML) to deal with 
too. 
- The entire system would be on our company LAN.
- The stored data will be a mix of application logs (info, errors 
etc) and server stats (CPU, memory usage etc) and would mostly be 
 accessed 
through Kibana. 

 This is our current plan:

- Have the LS shippers perform minimal parsing (but would do 
multiline). Have them point to two load-balanced servers containing Redis 
and LS indexers (which would do all parsing). 
- 2 replica shards for each index, which ramps the total data storage 
up to 6TB
- ES cluster spread over 6 nodes. Each node is 1TB in size 
- LS indexers pointing to cluster.

 So I have a couple questions regarding the setup and would greatly 
 appreciate the advice of someone with experience!

1. Does the balance between the number of nodes, the number of 
replica 

Failure connecting to x.x.x.x: dial tcp x.x.x.x:x: connection refused

2014-07-31 Thread Indirajith V
Hi all, I am new to Elasticsearch and Logstash. Logstash suddenly stopped 
working. When I looked into the parsed logs, I can only see the following:

Jul 31 14:55:16 logs 2014-07-31T14:55:16+02:00 
logstash-forwarder[22514]: 2014/07/31 14:55:16.595025 Failure connecting to 
x.x.x.x: dial tcp x.x.x.x:x: connection refused.

Can anyone help me rectify this? I cannot find any solution. Thanks in 
advance!



Re: Elastic Search losing data not in _source on _update

2014-07-31 Thread David Pilato
Yes, you need the complete source, so excluding fields won't work as expected.
In that case, you need to send the attachment again, I guess.



-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr




Re: more like this vs. mlt

2014-07-31 Thread Peter Li
Sorry, I copied and pasted the wrong query. It still didn't work with the 
min* clauses in the body:

curl -XGET "$url/ease/RadiologyResult/_search?routing=07009409&pretty" -d '
{
    "query" : {
        "more_like_this" : {
            "fields" : [
                "Observation.Value"
            ],
            "ids" : [ "90642" ],
            "min_term_freq" : 1,
            "min_doc_freq" : 1
        }
    }
}'

This returned nothing either.
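
For comparison, here is the same search expressed with the Java API; this is a sketch, not from the thread, and it assumes an ES version (1.2+) where MoreLikeThisQueryBuilder exposes ids(...):

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

public class MltExample {

    public static SearchResponse findSimilar(Client client) {
        // Same parameters as the curl above: one source doc plus low frequency cutoffs.
        return client.prepareSearch("ease")
                .setTypes("RadiologyResult")
                .setRouting("07009409")
                .setQuery(QueryBuilders.moreLikeThisQuery("Observation.Value")
                        .ids("90642")
                        .minTermFreq(1)
                        .minDocFreq(1))
                .execute().actionGet();
    }
}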

On Wednesday, July 30, 2014 11:03:28 PM UTC-5, vineeth mohan wrote:

 Hello Peter,

 You have set these variables on the API call and not in the query; that is 
 why it works there - min_term_freq=1, min_doc_freq=1.

 Thanks
Vineeth


 On Thu, Jul 31, 2014 at 5:02 AM, Peter Li jenli...@gmail.com wrote:

 I ran a query:

 curl -XGET
 "$url/ease/RadiologyResult/90642/_mlt?routing=07009409&mlt_fields=Observation.Value&min_term_freq=1&min_doc_freq=1&pretty"

 It worked and returned several documents. But if I ran this:

 curl -XGET "$url/ease/RadiologyResult/_search?routing=07009409&pretty" -d '
 {
     "query" : {
         "more_like_this" : {
             "fields" : [
                 "Observation.Value"
             ],
             "ids" : [ "90642" ]
         }
     }
 }'

 It returned nothing. Is there something I am missing ?

 Thanks in advance.


Re: Elastic Search losing data not in _source on _update

2014-07-31 Thread Jordan Reiter
So I guess using updates is not a good idea for records with file 
attachments.



Re: Elastic Search losing data not in _source on _update

2014-07-31 Thread David Pilato
In your case, it's not, because you excluded the attachment field.

If you are a Java developer, you could easily use Tika directly in your own 
code and send Elasticsearch only the extracted content, not the binary 
file.
In that case, you could remove the mapper attachment plugin.

If not, I think you need to send the full JSON document again, including the 
binary file.
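
A minimal sketch of that suggestion, assuming Apache Tika on the classpath and an existing Client. The index, type, and id reuse the script from this thread; "content_text" is a hypothetical plain string field that replaces the attachment-typed field:

import java.io.File;

import org.apache.tika.Tika;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.xcontent.XContentFactory;

public class TikaExtractExample {

    public static void indexExtractedText(Client client) throws Exception {
        // Extract plain text from the binary file on the client side.
        String text = new Tika().parseToString(new File("test-1.rtf"));

        // Index only the extracted text. The full document now fits in
        // _source, so a partial _update no longer loses indexed fields.
        client.prepareIndex("sample", "losing_data", "1")
              .setSource(XContentFactory.jsonBuilder()
                      .startObject()
                          .field("content_text", text)
                          .field("description", "This is not about bears")
                      .endObject())
              .execute().actionGet();
    }
}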

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr




Diagnosing a slow query

2014-07-31 Thread Christopher Ambler
Okay, let's attack this directly. We have a cluster of 6 machines (6 
nodes) and an index of just under 3.5 million documents. Each document 
represents an Internet domain name. We are performing queries against this 
index to see which names exist in it. Most queries come back in 
the sub-50ms range, but a bunch take 600ms to 900ms and thus 
show up in our slow query log. If they ALL were performing at this 
speed, I wouldn't be nearly as confused, but it looks like only about 10% 
to 20% of the queries are slow. That's clearly too much.

Head reports that this index looks like this:

aftermarket-2014-07-31_02-38-19
size: 424Mi (2.47Gi)
docs: 3,428,471 (3,428,471)

Here is the configuration for a typical node (they're all pretty much the 
same). We have 2 machines in a dev data center, 2 machines in a mesa data 
center, and 2 machines in a phx data center. Each of the two machines in a 
data center has a node.zone tag set, and, as you can see, cluster routing 
allocation awareness is set to use "zone" as its attribute. The 
data pipes between the data centers are beefy, and while I acknowledge that 
cross-DC isn't something that's generally smiled upon, it appears to work 
fine.

Each machine has 96G of RAM. We start ES giving it 30G for the heap size. 
File descriptors are set at 64,000. Note that I've selected the memory 
mapped file system.

#
# Server-specific settings for cluster domainiq-es
#
cluster.name: domainiq-es
node.name: Mesa-03
node.zone: es-mesa-prod
discovery.zen.ping.unicast.hosts: [dev2.glbt1.gdg, m1p1.mesa1.gdg, 
m1p4.mesa1.gdg, p3p3.phx3.gdg, p3p4.phx3.gdg]
#
# The following configuration items should be the same for all ES servers
#
node.master: true
node.data: true
index.number_of_shards: 5
index.number_of_replicas: 5
index.store.type: mmapfs
index.memory.index_buffer_size: 30%
index.translog.flush_threshold_ops: 25000
index.refresh_interval: 30s
bootstrap.mlockall: true
cluster.routing.allocation.awareness.attributes: zone
gateway.recover_after_nodes: 4
gateway.recover_after_time: 2m
gateway.expected_nodes: 6
discovery.zen.minimum_master_nodes: 3
discovery.zen.ping.timeout: 10s
discovery.zen.ping.retries: 3
discovery.zen.ping.interval: 15s
discovery.zen.ping.multicast.enabled: false

And here is a typical slow query:

[2014-07-31 07:35:31,530][WARN ][index.search.slowlog.query] [Mesa-03] 
[aftermarket-2014-07-31_02-38-19][2] took[707.6ms], took_millis[707], 
types[premium], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], 
source[], 
extra_source[{"size":35,"query":{"query_string":{"query":"sld:petusies^20.0 
OR tokens:(((pet^1.2 pets^1.0 *^1.0)AND(us^1.2 *^0.8)AND(ie^1.2 
*^0.6)AND(s^1.2 *^0.4)) OR((pet^1.2 pets^1.0)AND(us^1.2)AND(ie^1.2))^3.0) 
AND tld:(com^1.001 OR in^0.99 OR co.in^0.941174367459617 OR 
net.in^0.8848832474555992 OR us^0.85 OR org.in^0.8397882862729736 OR 
gen.in^0.785829669672289 OR firm.in^0.7414549824163524 OR ind.in^0.7 OR 
org^0.6) OR 
_id:petusi.es^5.0 -domaintype:partner","lowercase_expanded_terms":true,"analyze_wildcard":false}}}],
 


So note that I create 5 shards and 5 replicas, so that each node has all 5 
shards at all times. I THOUGHT THIS MEANT BETTER PERFORMANCE. That is, I 
thought having all 5 shards on every node meant that a query to a node 
didn't have to ask another node for data. IS THIS NOT TRUE?

Here's where it also gets interesting: I tried setting the number of shards 
to 2 (with 5 replicas) and my slow queries went to almost 2 seconds 
(2000ms). This is also terribly counter-intuitive! I thought fewer shards 
meant less lookup time.

Clearly, I want to optimize for read here. I don't care if indexing is 
three times as slow, we need our queries to be sub-100ms.

Any help is SERIOUSLY appreciated (and if you're in the Bay Area, I'm not 
above bribes of beer :-))
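
One thing that may be worth testing, given that every node holds a full copy of each shard: the "_local" search preference asks the node that receives the request to execute against its own shard copies. A sketch with the Java API, with a deliberately simplified query:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;

public class LocalPreferenceExample {

    public static SearchResponse searchLocalCopies(Client client) {
        return client.prepareSearch("aftermarket-2014-07-31_02-38-19")
                .setPreference("_local") // prefer shard copies on the receiving node
                .setQuery(QueryBuilders.queryString("sld:petusies^20.0"))
                .setSize(35)
                .execute().actionGet();
    }
}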



Re: Yet another OOME: Java heap space thread :S

2014-07-31 Thread Chris Neal
Oops, sorry. That was a copy/paste error. It is using 16GB. Here are
the correct process arguments:

/usr/bin/java -Xms16g -Xmx16g -Xss256k -Djava.awt.headless=true -server
-XX:+UseCompressedOops -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch
-Des.pidfile=/var/run/elasticsearch/elasticsearch.pid
-Des.path.home=/usr/share/elasticsearch [snip CP]

Thanks!
Chris


On Thu, Jul 31, 2014 at 2:43 AM, David Pilato da...@pilato.fr wrote:

 Why do you start with 8gb HEAP? Can't you give 16gb or so?

 /usr/bin/java -Xms8g -Xmx8g



 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 30 July 2014 at 19:47, Chris Neal chris.n...@derbysoft.net wrote:

 Hi everyone,

 First off, apologies for the thread.  I know OOME discussions are somewhat
 overdone in the group, but I need to reach out for some help for this one.

 I have a 2 node development cluster in EC2 on c3.4xlarge AMIs.  That means
 16 vCPUs, 30GB RAM, 1Gb network, and I have 2 500GB EBS volumes for
 Elasticsearch data on each AMI.

 I'm running Java 1.7.0_55, and using the G1 collector.  The Java args are:

 /usr/bin/java -Xms8g -Xmx8g -Xss256k -Djava.awt.headless=true -server
 -XX:+UseCompressedOops -XX:+UseG1GC -XX:MaxGCPauseMillis=20
 -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError
 The index has 2 shards, each with 1 replica.

 I have a daily index being filled with application log data.  The index,
 on average, gets to be about:
 486M documents
 53.1GB (primary size)
 106.2GB (total size)

 Other than indexing, there really is nothing going on in the cluster.  No
 searches, or percolators, just collecting data.

 I have:

- Tweaked the idex.merge.policy
- Tweaked the indices.fielddata.breaker.limit and cache.size
- change the index refresh_interval from 1s to 60s
- created a default template for the index such that _all is disabled,
and all fields in the mapping are set to not_analyzed.

 Here is my complete elasticsearch.yml:

 action:
   disable_delete_all_indices: true
 cluster:
   name: elasticsearch-dev
 discovery:
   zen:
     minimum_master_nodes: 2
     ping:
       multicast:
         enabled: false
       unicast:
         hosts: 10.0.0.45,10.0.0.41
 gateway:
   recover_after_nodes: 2
 index:
   merge:
     policy:
       max_merge_at_once: 5
       max_merged_segment: 15gb
   number_of_replicas: 1
   number_of_shards: 2
   refresh_interval: 60s
 indices:
   fielddata:
     breaker:
       limit: 50%
     cache:
       size: 30%
 node:
   name: elasticsearch-ip-10-0-0-45
 path:
   data:
   - /usr/local/ebs01/elasticsearch
   - /usr/local/ebs02/elasticsearch
 threadpool:
   bulk:
     queue_size: 500
     size: 75
     type: fixed
   get:
     queue_size: 200
     size: 100
     type: fixed
   index:
     queue_size: 1000
     size: 100
     type: fixed
   search:
     queue_size: 200
     size: 100
     type: fixed

 The heap sits about 13GB used.   I had been batting OOME exceptions for
 awhile, and thought I had it licked, but one just popped up again.  My
 cluster has been up and running fine for 14 days, and I just got this OOME:

 =
 [2014-07-30 11:52:28,394][INFO ][monitor.jvm  ]
 [elasticsearch-ip-10-0-0-41] [gc][young][1158834][109906] duration [770ms],
 collections [1]/[1s], total [770ms]/[43.2m], memory
 [13.4gb]-[13.4gb]/[16gb], all_pools {[young]
 [648mb]-[8mb]/[0b]}{[survivor] [0b]-[0b]/[0b]}{[old]
 [12.8gb]-[13.4gb]/[16gb]}
 [2014-07-30 15:03:01,070][WARN ][index.engine.internal]
 [elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed engine [out of
 memory]
 [2014-07-30 15:03:10,324][WARN
 ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the
 selector loop.
 java.lang.OutOfMemoryError: Java heap space
 [2014-07-30 15:03:10,335][WARN
 ][netty.channel.socket.nio.AbstractNioSelector] Unexpected exception in the
 selector loop.
 java.lang.OutOfMemoryError: Java heap space
 [2014-07-30 15:03:10,324][WARN ][index.merge.scheduler]
 [elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed to merge
 java.lang.OutOfMemoryError: Java heap space
 [2014-07-30 15:03:28,595][WARN ][index.translog   ]
 [elasticsearch-ip-10-0-0-41] [derbysoft-20140730][0] failed to flush shard
 on translog threshold
 org.elasticsearch.index.engine.FlushFailedEngineException:
 [derbysoft-20140730][0] Flush failed
 at
 org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
  at
 org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:604)
 at
 org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:202)
  at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.IllegalStateException: this writer hit an
 

Can the elasticsearch-reindex plugin preserve internal search indices?

2014-07-31 Thread Chris Berry
Greetings,

Recently, as suggested on this mailing list and around the web in several 
posts, we used the elasticsearch-reindex plugin to change the number of 
shards for a given large index, because the shards for that index had 
become too large.

But after the reindex, we could not search against the new index. 
All of the stored data was transferred properly, though.

In essence, none of the internal search indices transferred from the 
old index to the new one.
(The first sign of this was that the new index was about 1/4 the size of 
the old.)

I am assuming that I just did this wrong?
Is it possible to reindex from one index into another, preserving all the 
internal search indices?

BTW: it doesn't seem to matter whether we had the _source saved or 
not.

Thanks,
— Chris 
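
For context: source-based reindexing, which is what such tools generally do, re-sends each document's _source to the target index, where it is re-parsed and re-analyzed; Lucene's internal postings are never copied. Anything that cannot be rebuilt from _source is therefore lost. A minimal scan/scroll reindex sketch with the ES 1.x Java API (index names are placeholders):

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

public class ScrollReindexExample {

    public static void reindex(Client client, String from, String to) {
        SearchResponse scroll = client.prepareSearch(from)
                .setSearchType(SearchType.SCAN)
                .setScroll(TimeValue.timeValueMinutes(2))
                .setQuery(QueryBuilders.matchAllQuery())
                .setSize(500) // hits per shard per scroll round
                .execute().actionGet();
        while (true) {
            scroll = client.prepareSearchScroll(scroll.getScrollId())
                    .setScroll(TimeValue.timeValueMinutes(2))
                    .execute().actionGet();
            if (scroll.getHits().getHits().length == 0) {
                break;
            }
            BulkRequestBuilder bulk = client.prepareBulk();
            for (SearchHit hit : scroll.getHits()) {
                // Re-analysis happens on the target index; fields excluded
                // from _source cannot be rebuilt this way.
                bulk.add(client.prepareIndex(to, hit.getType(), hit.getId())
                        .setSource(hit.getSourceAsString()));
            }
            bulk.execute().actionGet();
        }
    }
}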



Re: Failed to configure logging error on start up.

2014-07-31 Thread Ivan Brusic
That scenario should not happen since FileVisitOption.FOLLOW_LINKS is
enabled.

https://github.com/elasticsearch/elasticsearch/blob/fe86c8bc88a321bf587dd8eb4df52aaed9ed2156/src/main/java/org/elasticsearch/common/logging/log4j/LogConfigurator.java#L107

Seems like a bug somewhere.
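
A standalone sketch of the same walk LogConfigurator performs, handy for checking whether the JDK on the affected machine actually follows the symlink (the path comes from Peter's layout below):

import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;

public class WalkConfigExample {

    public static void main(String[] args) throws IOException {
        Path config = Paths.get("/data/elastic-1/config"); // contains the scripts symlink
        Files.walkFileTree(config, EnumSet.of(FileVisitOption.FOLLOW_LINKS),
                Integer.MAX_VALUE, new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        System.out.println(file); // every file reachable through the walk
                        return FileVisitResult.CONTINUE;
                    }
                });
    }
}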

-- 

Ivan


On Wed, Jul 30, 2014 at 11:54 AM, Peter Li jenli.pe...@gmail.com wrote:

 Did more experiments. If I use a real scripts directory instead of a
 symbolic link, then there is no error message. But does this mean that I
 will have to drop the same script into every server's config/scripts
 directory? It would be nice to use symbolic links for this.

 Any suggestions?

 On Wednesday, July 30, 2014 1:36:24 PM UTC-5, Peter Li wrote:

 I have a setup with multiple servers.
 The file tree for each is like the following:

 /data/
configs/
  elastic-1.yml
  logging-1.yml
  scripts/
(empty)
elastic-core/   (from distribution)
  bin/...
  config/...
  lib/...
  logs/...
  elastic-1/
     bin -> ../elastic-core/bin
     config/
        elasticsearch.yml -> ../../configs/elastic-1.yml
        logging.yml -> ../../configs/logging-1.yml
        scripts -> ../../configs/scripts
     data/...
     lib -> ../elastic-core/lib
     logs/...

 In the elastic-1.yml, I have:

path.conf=/data/elastic-1/config

 When I start the node without the config/scripts symbolic link:

 /data/elastic-1/bin/elasticsearch -Des.config=/data/elastic-1/
 config/elasticsearch.yml

 It runs fine. But if I have the scripts link/directory, it complains of:

 Failed to configure logging...
 org.elasticsearch.ElasticsearchException: Failed to load logging
 configuration
 at org.elasticsearch.common.logging.log4j.LogConfigurator.
 resolveConfig(LogConfigurator.java:117)
 at org.elasticsearch.common.logging.log4j.LogConfigurator.
 configure(LogConfigurator.java:81)
 at org.elasticsearch.bootstrap.Bootstrap.setupLogging(Bootstrap.java:94)
 at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:178)
 at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
 Caused by: java.nio.file.FileSystemException: /data/elastic-1/config:
 Unknown error 1912615013
 at sun.nio.fs.UnixException.translateToIOException(Unknown Source)
 at sun.nio.fs.UnixException.asIOException(Unknown Source)
 at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.readNextEntry(Unknown
 Source)
 at sun.nio.fs.UnixDirectoryStream$UnixDirectoryIterator.hasNext(Unknown
 Source)
 at java.nio.file.FileTreeWalker.walk(Unknown Source)
 at java.nio.file.FileTreeWalker.walk(Unknown Source)
 at java.nio.file.Files.walkFileTree(Unknown Source)
 at org.elasticsearch.common.logging.log4j.LogConfigurator.
 resolveConfig(LogConfigurator.java:107)
 ... 4 more
 log4j:WARN No appenders could be found for logger (common.jna).
 log4j:WARN Please initialize the log4j system properly.
 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
 more info.

 Any guesses as to why it is complaining ?

 Thanks in advance.



Re: ip type and support for a port?

2014-07-31 Thread David Pilato
Yes. :80 should not be part of the IP. It's a port.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 31 July 2014 at 16:56:29, Chris Neal (chris.n...@derbysoft.net) wrote:

Hi Jorg,

Here's an example:

An "ip" field named 'host' in the mapping:
               "host": {
                 "type" : "ip"
               },

Then I add the below to ES:
POST /test_index/test
{
  "host" : "127.0.0.1:80"
}

ES returns this error:
{
   "error" : "MapperParsingException[failed to parse [host]]; nested: 
ElasticsearchIllegalArgumentException[failed to parse ip [127.0.0.1:80]]; 
nested: NumberFormatException[For input string: \"1:80\"]; ",
   "status" : 400
}

If I only post the IP without the port, like this:
POST /test_index/test
{
  "host" : "127.0.0.1"
}
ES returns success.

Does that help explain the issue?
Thanks!
Chris


On Wed, Jul 30, 2014 at 4:35 PM, joergpra...@gmail.com wrote:
Can you give an example of what you mean by IP ports?

Transport protocols like TCP have ports, but IP (Internet addresses) is used to 
address hosts on a network.

Jörg



On Wed, Jul 30, 2014 at 11:02 PM, Chris Neal chris.n...@derbysoft.net wrote:
Hi all,

I'm trying to use the ip type in ES, but my IPs also have ports.  That doesn't 
seem to be supported, which was a bit of a surprise!

Does anyone know of a way to do this?  Or does it sound like a good feature to 
add support for to this type?

Thanks!
Chris


Re: ip type and support for a port?

2014-07-31 Thread joergpra...@gmail.com
Maybe it would help to implement a new ES data type for URI/URL?

Usage example:

POST /test/test
{
    "url" : "http://127.0.0.1:80"
}

Jörg




Re: ip type and support for a port?

2014-07-31 Thread David Pilato
I think you should handle that at index time: separate the IP from the port 
and index each in its own field (a sketch follows the quoted question below).

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 31 July 2014 at 18:38:09, Chris Neal (chris.n...@derbysoft.net) wrote:

Right, but I was more wondering about the use case of supporting a port in 
*some* fashion, in some type?
Or is a string type the best type for holding an ip:port combination?

Thanks for the help :)
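
A sketch of David's suggestion above, splitting at index time with the Java API; it assumes a mapping where "host" has type "ip" and "port" is an integer:

import org.elasticsearch.client.Client;
import org.elasticsearch.common.xcontent.XContentFactory;

public class HostPortSplitExample {

    public static void indexHostAndPort(Client client, String hostAndPort) throws Exception {
        // e.g. "127.0.0.1:80" -> ip "127.0.0.1", port 80
        int colon = hostAndPort.lastIndexOf(':');
        String ip = hostAndPort.substring(0, colon);
        int port = Integer.parseInt(hostAndPort.substring(colon + 1));

        client.prepareIndex("test_index", "test")
              .setSource(XContentFactory.jsonBuilder()
                      .startObject()
                          .field("host", ip)
                          .field("port", port)
                      .endObject())
              .execute().actionGet();
    }
}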





Re: ip type and support for a port?

2014-07-31 Thread David Pilato
I don't see the value we can add here.
For example, does a range query on a URL make sense?

If it's only about breaking the string into two subfields at index time, that 
can be done by a plugin, like the mapper attachment plugin does. Although I'd 
prefer using Logstash, for example, to do that before sending the index 
request to ES.

Just my 0.05 cents here :)

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 31 juillet 2014 à 18:39:33, joergpra...@gmail.com (joergpra...@gmail.com) a 
écrit:

Maybe it would help to implement a new ES data type for URI/URL?

Usage example:

POST /test/test
{
    url : http://127.0.0.1:80;
}

Jörg


On Thu, Jul 31, 2014 at 4:56 PM, Chris Neal chris.n...@derbysoft.net wrote:
Hi Jorg,

Here's an example:

An ip field named as 'host' in mapping.
               host: {
                 type :ip
               },

Then I add below to ES
POST /test_index/test
{
  host:127.0.0.1:80
}

ES returns this error
{
   error: MapperParsingException[failed to parse [host]]; nested: 
ElasticsearchIllegalArgumentException[failed to parse ip [127.0.0.1:80]]; 
nested: NumberFormatException[For input string: \1:80\]; ,
   status: 400
}

If I only post IP without port like this
POST /test_index/test
{
  host:127.0.0.1
}
ES return success.

Does that help explain the issue?
Thanks!
Chris


On Wed, Jul 30, 2014 at 4:35 PM, joergpra...@gmail.com joergpra...@gmail.com 
wrote:
Can you give an example what you mean by IP ports?

Transport protocols like TCP have ports, but IP (Internet addresses) is used to 
address hosts on a network.

Jörg



On Wed, Jul 30, 2014 at 11:02 PM, Chris Neal chris.n...@derbysoft.net wrote:
Hi all,

I'm trying to use the ip type in ES, but my IPs also have ports.  That doesn't 
seem to be supported, which was a bit of a surprise!

Does anyone know of a way to do this?  Or does it sound like a good feature to 
add support for to this type?

Thanks!
Chris


Re: Elasticsearch still scans all types in an index even if I specify a type

2014-07-31 Thread Ivan Brusic
All types eventually belong to the same Lucene index and Lucene cannot
handle different types for the same field name. Avoid using the same name
across types if the field type is different.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping.html#_avoiding_type_gotchas
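
For instance, a minimal sketch of that advice (the index name and the val_num
/ val_str field names below are made up), giving each type its own field name
so the underlying Lucene field keeps one consistent type:

curl -XPUT 'localhost:9200/testindex2' -d '
{
  "mappings": {
    "action1": { "properties": { "val_num": { "type": "long"   } } },
    "action2": { "properties": { "val_str": { "type": "string" } } }
  }
}'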

-- 
Ivan


On Wed, Jul 30, 2014 at 8:58 PM, panfei cnwe...@gmail.com wrote:

 First, put some sample data:

 curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
 {
   "title": "jumping tom",
   "val": 101
 }'

 curl -XPUT 'localhost:9200/testindex/action2/1?pretty' -d '
 {
   "title": "jumping jerry",
   "val": "test"
 }'

 and the resulting mapping is:

 {
   "action1" : {
     "properties" : {
       "val" : { "type" : "long" },
       "title" : { "type" : "string" }
     }
   },
   "action2" : {
     "properties" : {
       "val" : { "type" : "string" },
       "title" : { "type" : "string" }
     }
   }
 }

 But when running an aggs query:

 curl 'http://192.168.2.245:9200/testindex/action1/_search' -d '
 {
   "aggs": {
     "vals": {
       "terms": {
         "field": "val"
       }
     }
   }
 }'


 {
   "took" : 37,
   "timed_out" : false,
   "_shards" : {
     "total" : 5,
     "successful" : 4,
     "failed" : 1,
     "failures" : [ {
       "index" : "testindex",
       "shard" : 2,
       "status" : 500,
       "reason" : "RemoteTransportException[[a00][inet[/192.168.2.246:9300]][search/phase/query]]; nested: ElasticsearchException[java.lang.NumberFormatException: Invalid shift value (84) in prefixCoded bytes (is encoded value really an INT?)]; nested: UncheckedExecutionException[java.lang.NumberFormatException: Invalid shift value (84) in prefixCoded bytes (is encoded value really an INT?)]; nested: NumberFormatException[Invalid shift value (84) in prefixCoded bytes (is encoded value really an INT?)]; "
     } ]
   },
   "hits" : {
     "total" : 0,
     "max_score" : null,
     "hits" : [ ]
   },
   "aggregations" : {
     "vals" : {
       "buckets" : [ ]
     }
   }
 }

 The val field in the action1 type is mapped to long, but it seems that ES
 still scans the action2 type even if I specify the action1 type.

 Any advice to resolve this issue? Thanks.
 --
 不学习,不知道 (if you don't learn, you don't know)



Re: Recommendation for using stop words

2014-07-31 Thread Ivan Brusic
Tuning stop words can be as long a process as you want it to be. Saving
your queries/results and doing some search analytics can help you fine tune
the stop words. In general, the default stop words list is very good for
English, but Twitterspeak is not really English. :)

You can look at all the terms in your inverted index (
https://github.com/jprante/elasticsearch-index-termlist) to see what the
top words are and see if they are relevant.

Have you looked at the common words query?

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-common-terms-query.html
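
A minimal sketch of it (the body field and the cutoff value here are
illustrative):

GET /_search
{
  "query": {
    "common": {
      "body": {
        "query": "this is bonsai cool",
        "cutoff_frequency": 0.001
      }
    }
  }
}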

Cheers,

Ivan


On Wed, Jul 30, 2014 at 4:35 PM, sulemanmubarik sulemanmuba...@gmail.com
wrote:

 Hi
 What are good recommendations for using stop words?
 Is using the default stop words provided by elasticsearch good,
 or should I use some custom stop words too?
 More than 60% of the data is from twitter.
 Thanks




 --
 View this message in context:
 http://elasticsearch-users.115913.n3.nabble.com/Recommendation-for-using-stop-words-tp4060924.html
 Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: EsRejectedExecutionException

2014-07-31 Thread Ivan Brusic
The thread pool will reject any search requests when there are 1000 actions
already queued.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html
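
If the bursts are legitimate and you really need a deeper queue (usually it is
better to fix whatever is flooding the cluster), the size can be raised in
elasticsearch.yml; the value below is only an example:

threadpool.search.queue_size: 2000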

Do you have this many search requests at one time? Do you have warmers
and/or percolators running since you mentioned it occurred at startup?

-- 
Ivan


On Thu, Jul 31, 2014 at 12:36 AM, Anand kumar anandv1...@gmail.com wrote:

 Hi All,

 In my cluster, I have around 500 indices. When I try to start the
 elasticsearch instance, it shows the following exception.
 Why is this happening, and what should be done to resolve it?

 Thanks,
 Anand


 [2014-07-31 11:50:01,551][DEBUG][action.search.type   ] [ESCS_NODE]
 [components][3], node[VQMkI5CBRB2hDEnZ5yVLcw], [P], s[STARTED]: Failed to
 execute [org.elasticsearch.action.search.SearchRequest@c34bbb8] lastShard
 [true]
 org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
 rejected execution (queue capacity 1000) on
 org.elasticsearch.search.action.SearchServiceTransportAction$23@60cfc409
 at
 org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
 at
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
 at
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.execute(SearchServiceTransportAction.java:509)
 at
 org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:203)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:171)
 at
 org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.start(TransportSearchTypeAction.java:153)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:59)
 at
 org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction.doExecute(TransportSearchQueryThenFetchAction.java:49)
 at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
 at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:101)
 at
 org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:43)
 at
 org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
 at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
 at
 org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:212)
 at
 org.elasticsearch.rest.action.search.RestSearchAction.handleRequest(RestSearchAction.java:75)
 at
 org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
 at
 org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:142)
 at
 org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
 at
 org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
 at
 org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:294)
 at
 org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:44)
 at
 org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at
 org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at
 org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at
 org.elasticsearch.common.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
 at
 org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at
 org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at
 org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at
 org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
 at
 org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
 at
 org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
 at
 org.elasticsearch.common.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
 at
 

Re: ip type and support for a port?

2014-07-31 Thread joergpra...@gmail.com
Of course, range queries make much sense: for example, over schemes, IP
ranges/subnets, subdomains, or host names.

URL/URIs must also be canonicalized to make them comparable which is a
non-trivial issue, not resolvable by regex/pattern parsing.

URL/URIs data types are a building block for web crawlers.

I do not see Logstash being a webcrawler such as Nutch, Heritrix, or
Crawler4j.

My 2¢s.

Jörg



Re: Diagnosing a slow query

2014-07-31 Thread Christopher Ambler
It has been suggested that what I'm seeing is a CPU-bound issue: the
large number of OR directives in our query could make many of these queries
take a long time.

As I'm not an expert on crafting queries, any expert opinions?

Because I'm feeling pretty good about my configuration about now...



Re: Memory Explosion: Heap Dump in less than one minute

2014-07-31 Thread Tom Wilson
What exactly do I need to delete and how do I do it?

On Wednesday, July 30, 2014 5:45:03 PM UTC-7, Mark Walkom wrote:

 Unless you are attached to the stats you have in the marvel index for 
 today it might be easier to delete them than try to recover the unavailable 
 shards.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com javascript:
 web: www.campaignmonitor.com
  

 On 31 July 2014 10:36, Tom Wilson twils...@gmail.com javascript: 
 wrote:

 Upping to 1GB, memory usage seems to level off at 750MB, but there's a 
 problem in there somewhere. I'm getting a failure message, and the marvel 
 dashboard isn't able to fetch.


 C:\elasticsearch-1.1.1\binelasticsearch
 Picked up _JAVA_OPTIONS: -Djava.net.preferIPv4Stack=true
 [2014-07-30 17:33:27,138][INFO ][node ] [Mondo] 
 version[1.1.1], pid[10864], build[f1585f0/2014-04-16
 T14:27:12Z]
 [2014-07-30 17:33:27,139][INFO ][node ] [Mondo] 
 initializing ...
 [2014-07-30 17:33:27,163][INFO ][plugins  ] [Mondo] 
 loaded [ldap-river, marvel], sites [marvel]
 [2014-07-30 17:33:30,731][INFO ][node ] [Mondo] 
 initialized
 [2014-07-30 17:33:30,731][INFO ][node ] [Mondo] 
 starting ...
 [2014-07-30 17:33:31,027][INFO ][transport] [Mondo] 
 bound_address {inet[/0.0.0.0:9300]}, publish_address
  {inet[/192.168.0.6:9300]}
 [2014-07-30 17:33:34,202][INFO ][cluster.service  ] [Mondo] 
 new_master [Mondo][liyNQAHAS0-8f-qDDqa5Rg][twilson-T
 HINK][inet[/192.168.0.6:9300]], reason: zen-disco-join (elected_as_master)
 [2014-07-30 17:33:34,239][INFO ][discovery] [Mondo] 
 elasticsearch/liyNQAHAS0-8f-qDDqa5Rg
  [2014-07-30 17:33:34,600][INFO ][http ] [Mondo] 
 bound_address {inet[/0.0.0.0:9200]}, publish_address
  {inet[/192.168.0.6:9200]}
 [2014-07-30 17:33:35,799][INFO ][gateway  ] [Mondo] 
 recovered [66] indices into cluster_state
 [2014-07-30 17:33:35,815][INFO ][node ] [Mondo] 
 started
 [2014-07-30 17:33:39,823][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:39,830][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:39,837][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:39,838][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:43,973][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:44,212][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:44,357][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:44,501][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:53,294][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:53,309][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:53,310][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:33:53,310][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:34:03,281][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:34:03,283][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:34:03,286][DEBUG][action.search.type   ] [Mondo] All 
 shards failed for phase: [query_fetch]
 [2014-07-30 17:34:45,662][ERROR][marvel.agent.exporter] [Mondo] 
 create failure (index:[.marvel-2014.07.31] type: [no
 de_stats]): UnavailableShardsException[[.marvel-2014.07.31][0] [2] 
 shardIt, [0] active : Timeout waiting for [1m], reque
 st: org.elasticsearch.action.bulk.BulkShardRequest@39b65640]



 On Wednesday, July 30, 2014 5:30:29 PM UTC-7, Mark Walkom wrote:

 Up that to 1GB and see if it starts.
 512MB is pretty tiny, you're better off starting at 1/2GB if you can.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 31 July 2014 10:28, Tom Wilson twils...@gmail.com wrote:

  JDK 1.7.0_51

 It has 512MB of heap, which was enough -- I've been running it like 
 that for the past few months, and I only have two indexes and around 
 300-400 documents. This is a development instance I'm running on my local 
 machine. This only happened when I started it today. 

 -tom


 On Wednesday, July 30, 2014 5:16:11 PM UTC-7, Mark Walkom wrote:

 What java version? How much heap have you allocated and how much RAM 
 on the server?

 Basically you have too much data for the heap size, 

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-31 Thread Ivan Brusic
Look into the curator, which should help:
https://github.com/elasticsearch/curator
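
If you only want today's broken Marvel index gone (the index name below is
taken from the error in your log), a plain delete should do it:

curl -XDELETE 'localhost:9200/.marvel-2014.07.31'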

If you have just a single development instance, perhaps Marvel is an
overkill. Do you need historical metrics? If not, just use some other
plugin such as head/bigdesk/hq.

Cheers,

Ivan





Re: ip type and support for a port?

2014-07-31 Thread Chris Neal
Thanks guys for the discussion.  I'm glad we're talking it through!

For an immediate solution it is easy to parse off the port and store it
separately. Longer term I do see value in a url/uri type as Jorg mentions.
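
For example, a rough sketch of that parse in a Logstash filter (the field
names host_ip and host_port are illustrative):

filter {
  grok {
    # split "127.0.0.1:80" into separate ip and port fields
    match => [ "host", "%{IP:host_ip}:%{POSINT:host_port}" ]
  }
}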

Thanks again.  Much appreciated!
Chris



[ANN] Elasticsearch Juju charm

2014-07-31 Thread Jorge Castro
Hello everyone,

We (Ubuntu) have been working on a Juju[1] charm for Elasticsearch so that 
our users can deploy ES easily on as many clouds as possible. 

- https://github.com/charms/elasticsearch/tree/trusty

We use ansible to install the latest packages from the upstream repository, 
though the charm has an option of changing that for those of you that 
maintain your own internal mirrors of the packages. With this charm it's 
possible to stand up ES clusters on just about any cloud where Ubuntu runs. 
This 
is the 2nd iteration of our charm and we're using it in production, so 
we're keen on getting more eyeballs on it. The basic goal is to enable 
every Ubuntu user (and eventually Windows and CentOS) to easily deploy ES 
clusters out of the box in 14.04 (and in 12.04 after that as well). 

I was hoping we could get some community folks to give our code a peer 
review, perhaps point out places where we could improve, and to ensure that 
we're following upstream best practices. 

Any help/guidance would be appreciated!

--
Jorge Castro

PS: On a related note we do have charms for Kibana and Logstash as well:

- https://github.com/charms/kibana/
- https://github.com/charms/logstash-indexer/

And I am also currently working on what we call a bundle, which is a set 
of charms bundled together if anyone is interested in checking it out, the 
idea is for people to be able to drag and drop bundles for deployment: 
http://manage.jujucharms.com/bundle/~jorge/elasticsearch/cluster

[1]: http://juju.ubuntu.com



timestamp format

2014-07-31 Thread Cristian Falcas
Hello all,

I'm currently using rsyslogd to send messages to elasticsearch, with kibana
as a GUI.

Rsyslogd is sending the @timestamp in the following format:

2014-07-31T21:01:16.515922+03:00

I was wondering if elasticsearch is able to understand this format, because
kibana sorting doesn't do anything with it. The sorting is completely random.

Should I send the timestamp in another format? Can I keep the microseconds?

Best regards,
Cristian Falcas



Re: slow filter execution

2014-07-31 Thread Kireet Reddy
Quick update, I found that if I explicitly set _cache to true, things seem 
to work more as expected, i.e. subsequent executions of the query sped up. 
I looked at DateFieldMapper.rangeFilter() and to me it looks like if a 
number is passed, caching will be disabled unless it's explicitly set to 
true. Not sure if this has been fixed in 1.3.x yet or not. This meshes with 
my observed behavior. 
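
For reference, forcing the flag on the same filter as in the query below looks
like this:

{
  "range": {
    "published": {
      "to": 1406064191883
    },
    "_cache": true
  }
}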

On Wednesday, July 30, 2014 8:59:37 AM UTC-7, Kireet Reddy wrote:

 Thanks for the detailed reply. 

 I am a bit confused about and vs bool filter execution. I read this post 
 http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/ 
 on 
 the elasticsearch blog. From that, I thought the bool filter would work by 
 basically creating a bitset for the entire segment(s) being examined. If 
 the filter value changes every time, will this still be cheaper than an AND 
 filter that will just examine the matching docs? My segments can be very 
 big and this query for example on matched one document.

 There is no match_all query filter, there is a match query filter on a 
 field named all. :)

 Based on your feedback, I moved all filters, including the query filter, 
 into the bool filter. However it didn't change things: the query runs an 
 order of magnitude slower with the range filter, unless I set execution to 
 fielddata. I am using 1.2.2, I tried the strategy anyways and it didn't 
 make a difference.

 {
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must": [
             { "terms": { "source_id": ["s1", "s2", "s3"] } },
             { "query": { "match": { "all": { "query": "foo" } } } },
             { "range": { "published": { "to": 1406064191883 } } }
           ]
         }
       }
     }
   },
   "sort": [
     { "crawlDate": { "order": "desc" } }
   ]
 }

 On Wednesday, July 30, 2014 4:30:10 AM UTC-7, Clinton Gormley wrote:

 Don't use the `and` filter - use the `bool` filter instead.  They have 
 different execution modes and the `bool` filter works best with bitset 
 filters (but also knows how to handle non-bitset filters like geo etc).  

 Just remove the `and`, `or` and `not` filters from your DSL vocabulary.

 Also, not sure why you are ANDing with a match_all filter - that doesn't 
 make much sense.

 Depending on which version of ES you're using, you may be encountering a 
 bug in the filtered query which ended up always running the query first, 
 instead of the filter. This was fixed in v1.2.0 
 https://github.com/elasticsearch/elasticsearch/issues/6247 .  If you are 
 on an earlier version you can force filter-first execution manually by 
 specifying a strategy of random_access_100.  See 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html#_filter_strategy

 In summary, (and taking your less granular datetime clause into account) 
 your query would be better written as:

 GET /_search
 {
   "query": {
     "filtered": {
       "strategy": "random_access_100",   <-- pre 1.2 only
       "filter": {
         "bool": {
           "must": [
             { "terms": { "source_id": [ "s1", "s2", "s3" ] } },
             {
               "range": {
                 "published": {
                   "gte": "now-1d/d"   <-- coarse grained, cached
                 }
               }
             },
             {
               "range": {
                 "published": {
                   "gte": "now-30m"   <-- fine grained, not cached, could use fielddata too
                 },
                 "_cache": false
               }
             }
           ]
         }
       }
     }
   }
 }





 On 30 July 2014 10:55, David Pilato da...@pilato.fr wrote:

 Maybe a stupid question: why did you put that filter inside a query and 
 not within the same filter you have at the end?


 For my test case it's the same every time. In the real query it will 
 change every time, but I planned to not cache this filter and have a less 
 granular date filter in the bool filter that would be cached. However 
 while 
 debugging I 

Re: slow filter execution

2014-07-31 Thread Clinton Gormley
On 31 July 2014 20:25, Kireet Reddy kir...@feedly.com wrote:

 Quick update, I found that if I explicitly set _cache to true, things seem
 to work more as expected, i.e. subsequent executions of the query sped up.
 I looked at DateFieldMapper.rangeFilter() and to me it looks like if a
 number is passed, caching will be disabled unless it's explicitly set to
 true. Not sure if this has been fixed in 1.3.x yet or not. This meshes with
 my observed behavior.


Nice catch!!!

That's a notable bug!  Opened here:
https://github.com/elasticsearch/elasticsearch/issues/7114



Complete suggestion Issue

2014-07-31 Thread Madhavan Ramachandran
Hi 

I have created an index for typeahead, aka completion suggestion, in ES. I
fed the data below into ES.

us-en-us-chicago
us-en-us-chicago, central loop
us-en-us-chicago, east loop
us-en-us-chicago, eastern east-west
us-en-us-chicago, i-80 & i-55 corridors
us-en-us-chicago, i-90 corridor
us-en-us-chicago, north (cook county)
us-en-us-chicago, north (lake county)
us-en-us-chicago, north i-94 corridor
us-en-us-chicago, north michigan avenue
us-en-us-chicago, northwest
us-en-us-chicago, northwest indiana
us-en-us-chicago, o'hare
us-en-us-chicago, river north
us-en-us-chicago, west loop
us-en-us-chicago, western east-west

When I run the suggest query below, I get only 5 results out of the
above:

{
    "my-suggest" : {
        "text" : "us-en-us-c",
        "completion" : {
            "field" : "search"
        }
    }
}


{
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "my-suggest": [
        {
            "text": "us-en-us-c",
            "offset": 0,
            "length": 10,
            "options": [
                { "text": "us-en-us-chicago", "score": 1 },
                { "text": "us-en-us-chicago, central loop", "score": 1 },
                { "text": "us-en-us-chicago, east loop", "score": 1 },
                { "text": "us-en-us-chicago, eastern east-west", "score": 1 },
                { "text": "us-en-us-chicago, i-80 & i-55 corridors", "score": 1 }
            ]
        }
    ]
}

But if I do a search, I see all the results. How do I get all the
results using _suggest?
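
One sketch that may help: the completion suggester accepts a size option,
which defaults to 5 (matching the five options above), so something like this
should return more:

{
    "my-suggest" : {
        "text" : "us-en-us-c",
        "completion" : {
            "field" : "search",
            "size" : 20
        }
    }
}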

Regards
Madhavan.TR



Is it possible to not tokenize words with elasticsearch?

2014-07-31 Thread Fedele Mantuano
Hi,

I have a question. I have an elasticsearch server and I use logstash to put
logs into it, but I noticed that email addresses are tokenized:

t...@example-test.com becomes [test, example, test].

Is it possible to keep words from being tokenized in elasticsearch?

I found this guide:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html

but I can't get it to work.

Is this code correct?

curl -XPUT http://localhost:9200/test_index/ -d '
{
  "settings": {
    "analysis" : {
      "filter" : {
        "word_filter" : {
          "type": "word_delimiter",
          "generate_word_parts": false,
          "generate_number_parts": false,
          "split_on_numerics": false,
          "split_on_case_change": false,
          "preserve_original": true
}'


Thanks



Re: Is it possible to not tokenize words with elasticsearch?

2014-07-31 Thread David Pilato
Change the mapping for this field and change index to not_analyzed.
So it will be kept as is.
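
A minimal sketch of such a mapping (the type name logs and field name mail
are illustrative; note that an already-indexed field generally requires a
reindex for the change to take effect):

curl -XPUT 'localhost:9200/test_index/_mapping/logs' -d '
{
  "properties": {
    "mail": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}'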

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs




How to perform a wildcard in phrase search

2014-07-31 Thread Joyce Wu
Hi,

I have a document that contains the phrase interest entities. If I use a
wildcard query to search entit*, that document is returned. However, if I use
a wildcard query to do a phrase search interest*entit*, that document is
not found. I also tried a prefix query as well as a wildcard query with
interest entit*; no document is found either.

I would expect the following query to work, according to the Elasticsearch documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_literal_wildcard_literal_and_literal_regexp_literal_queries.html

  {"wildcard": {"document": "interest*entit*"}}

The document was indexed using the standard tokenizer and standard filter,
so it should not be stemmed.

Does anyone have an idea how to get the wildcard query to work with a wildcard
in a phrase?
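
One possible approach, sketched here untested: the standard tokenizer indexes
interest and entities as separate terms, and a single wildcard never matches
across the space between terms, so one way to express the phrase is to wrap
per-term wildcards in a span_near:

{
  "query": {
    "span_near": {
      "clauses": [
        { "span_multi": { "match": { "wildcard": { "document": "interest*" } } } },
        { "span_multi": { "match": { "wildcard": { "document": "entit*" } } } }
      ],
      "slop": 0,
      "in_order": true
    }
  }
}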

Many thanks!


Joyce






Re: Remote access through SSH

2014-07-31 Thread Chia-Eng Chang
I tried curl -XGET http://IPADDRESS:9200/ but it failed.
Actually I can't even telnet to port 9200 from my machine.
My guess is that elasticsearch port 9200 is hidden behind ssh port 22.

So I use an ssh tunnel, forwarding port 9200 on the server to my machine,
like:

ssh -L my-port:target-machine:target-port user@target-machine
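
For instance, to expose the server's local port 9200 on your own machine's
port 9200 (the host name is illustrative):

ssh -L 9200:localhost:9200 user@my-cloud-server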

Then I can simply run curl -XGET localhost:9200 to query elasticsearch on my
cloud server.
The Java API transport client might need the same setting to make it work.


On Wednesday, July 30, 2014 5:48:28 PM UTC-7, Mark Walkom wrote:

 You can also curl from your local machine to the server, without having to 
 SSH to it - curl -XGET http://IPADDRESS:9200/

 You don't need to provide SSH credentials for that transport client 
 example.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com javascript:
 web: www.campaignmonitor.com


 On 31 July 2014 10:35, Chia-Eng Chang chia...@uci.edu javascript: 
 wrote:

 Thank you for the links. Yeah, I am new to ES. (and http rest)
 What I understand is that if I want to get the index documents on my SSH 
 server, I can SSH log in the server.
 And then rest http get from localhost:9200.

 Could you explain more about "use SSH directly for it"? 
 I think what I want to do is close to this transport client example 
 http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/client.html#transport-client
  
 But I have to provide ssh credential.


 On Wednesday, July 30, 2014 4:47:22 PM UTC-7, Mark Walkom wrote:

 You may want to look at http://www.elasticsearch.
 org/guide/en/elasticsearch/reference/current/search.html

 If you are just learning ES, then check out http://
 exploringelasticsearch.com/

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 31 July 2014 09:35, Chia-Eng Chang chia...@uci.edu wrote:

  Thanks @Mark
 I have a public key on server and I know how to SSH to server then get 
 the index from localhost:9200.
 But what I want to do is remotely obtain the index on the SSH server 
 (which I know its public IP)


 On Wednesday, July 30, 2014 3:56:04 PM UTC-7, Mark Walkom wrote:

 You need to use SSH directly for it, curl won't work.

 ssh user@host -i ~/.ssh/id_rsa.pub

 Assuming you have a public key on the server.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 31 July 2014 08:47, Chia-Eng Chang chia...@uci.edu wrote:

  About the HTTP API, I wonder if I want to remote access a cluster 
 on SSH server, what should I include in my http rest command:

 example as mapping:

 curl -XGET 'http://localhost:9200/{index}/_mapping/{type}' 

 I  tried something like below but got failed:

 curl -XGET -u user_name: --key ~/.ssh/id_rsa --pubkey 
 ~/.ssh/id_rsa.pub  'xx.xxx.xxx.xxx:9200/index/_mapping/type'

 Is there anyone knows the solution?


Re: Remote access through SSH

2014-07-31 Thread Patrick Proniewski
Hi,

Looks like your problem is not related to ES. It's a networking problem. Your ES 
server appears to live behind a firewall (or is bound to 127.0.0.1 only), and 
your only way to contact it is to create an ssh tunnel to the server and to 
pipe your queries into this tunnel so that you can get in touch with the ES 
process.

You might want to get in touch with your hosting provider. Also, if you can run 
a web server (apache, nginx...) on your server, you might want to configure a 
proxy in this web server, so that requests sent to a particular URL will be 
redirected to the ES process. You would have something like this: 

client -> (public IP, port 80 or 443) http-server -> (localhost port 9200) ES process

It allows you to add some kind of access control to the ES process by using 
access control directives from Apache or Nginx.
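
A rough nginx sketch of such a proxy (names and ports illustrative):

server {
    listen 80;
    location / {
        # forward everything to the local Elasticsearch HTTP port
        proxy_pass http://127.0.0.1:9200;
    }
}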


Issue (Bug?) with field_value_factor and Search Templates

2014-07-31 Thread Germán Carrillo
Hi,

I'm getting the following error message while attempting to use the Search
Template (I use ES 1.3.1):

 *nested: ElasticsearchException[Unable to find a field mapper for field
[weight]*


I'm storing the template in the .scripts index as stated here [1]. The
field 'weight' is used inside a field_value_factor (function_score).

The query runs appropriately when not using a Search Template.


You can find a mapping, sample data, a working query, a sample template, and a
non-working template query at [2].
Is that a bug?


I'd appreciate any help,

Germán


[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-template.html#_filling_in_a_query_string_with_a_single_value
 [2] https://titanpad.com/es-fvf-issue

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CANaz7mwZPi5ZpYnBNHpvapyqW6kZfy8zfkuRgqROYjivFdHCnQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Recommendations needed for large ELK system design

2014-07-31 Thread Mark Walkom
1 - Curator FTW.
2 - Masters handle cluster state, shard allocation and a whole bunch of
other stuff around managing the cluster and its members and data. A node
with both master and data set to false is considered a search node. But the
role of being a master is not onerous, so it made sense for us to double up
the roles. We then just round robin any queries to these three masters.
3 - Yes, but it's entirely dependent on your environment. If you're
happy with that and you can get the go-ahead then see where it takes you.
4 - Quorum is automatic and having the n/2+1 means that a majority of
nodes will have to take part in an election, which reduces the possibility
of split brain. If you set the discovery settings then you are also
essentially setting the quorum settings.
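
For example, a minimal sketch in elasticsearch.yml (assuming three
master-eligible nodes, which is not stated above):

# quorum for 3 master-eligible nodes: 3/2 + 1 = 2
discovery.zen.minimum_master_nodes: 2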

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 31 July 2014 22:27, Alex alex.mon...@gmail.com wrote:

 Hello Mark,

 Thank you for your reply, it certainly helps to clarify many things.

 Of course I have some new questions for you!

1. I haven't looked into it much yet but I'm guessing Curator can
handle different index naming schemes. E.g. logs-2014.06.30 and
stats-2014.06.30. We'd actually be wanting to store the stats data for 2
years and logs for 90 days so it would indeed be helpful to split the data
into different index sets. Do you use Curator?

2. You say that you have 3 masters that also handle queries... but I
thought all masters did was handle queries? What is a master node that
*doesn't* handle queries? Should we have search load balancer nodes?
AKA not master and not data nodes.

3. In the interests of reducing the number of node combinations for us
to test out would you say, then, that 3 master (and query(??)) only nodes,
and the 6 1TB data only nodes would be good?

4. Quorum and split brain are new to me. This webpage about split brain
(http://blog.trifork.com/2013/10/24/how-to-avoid-the-split-brain-problem-in-elasticsearch/)
recommends setting *discovery.zen.minimum_master_nodes* equal
to *N/2 + 1*. This formula is similar to the one given in the
documentation for quorum
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency):
"index operations only succeed if a quorum (replicas/2+1) of active shards
are available". I completely understand the split brain issue, but not
quorum. Is quorum handled automatically or should I change some settings?

 Thanks again for your help, we appreciate your time and knowledge!
 Regards,
 Alex


 On Thursday, 31 July 2014 05:57:35 UTC+1, Mark Walkom wrote:

 1 - Looks ok, but why two replicas? You're chewing up disk for what
 reason? Extra comments below.
 2 - It's personal preference really and depends on how your end points
 send to redis.
 3 - 4GB for redis will cache quite a lot of data if you're only doing 50
 events p/s (ie hours or even days based on what I've seen).
 4 - No, spread it out to all the nodes. More on that below though.
 5 - No it will handle that itself. Again, more on that below though.

 Suggestions;
 Set your indexes to (factors of) 6 shards, ie one per node, it spreads
 query performance. I say factors of in that you can set it to 12 shards
 per index to start and easily scale the node count and still spread the
 load.
 Split your stats and your log data into different indexes, it'll make
 management and retention easier.
 You can consider a master only node or (ideally) three that also handle
 queries.
 Preferably have an uneven number of master eligible nodes, whether you
 make them VMs or physicals, that way you can ensure quorum is reached with
 minimal fuss and stop split brain.
 If you use VMs for master + query nodes then you might want to look at
 load balancing the queries via an external service.

 To give you an idea, we have a 27 node cluster - 3 masters that also
 handle queries and 24 data nodes. Masters are 8GB with small disks, data
 nodes are 60GB (30 heap) and 512GB disk.
 We're running with one replica and have 11TB of logging data. At a high
 level we're running out of disk more than heap or CPU and we're very write
 heavy, with an average of 1K events p/s and comparatively minimal reads.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 31 July 2014 01:35, Alex alex@gmail.com wrote:

  Hello,

 We wish to set up an entire ELK system with the following features:

- Input from Logstash shippers located on 400 Linux VMs. Only a
handful of log sources on each VM.
- Data retention for 30 days, which is roughly 2TB of data in
indexed ES JSON form (not including replica shards)
- Estimated input data rate of 50 messages per second at peak hours.
Mostly short or medium length one-line messages but there will be Java
 

Re: It is possibile don't token word with elasticsearch?

2014-07-31 Thread Fedele Mantuano
Thanks.

I do like this:

curl -XPUT http://localhost:9200/index_test/ -d '
{
  "mappings": {
    "type_test1" : {
      "properties" : {
        "sender-address" : {
          "type" : "string",
          "index" : "not_analyzed" },
        "recipient-address" : {
          "type" : "string",
          "index" : "not_analyzed" }}},
    "type_test2" : {
      "properties" : {
        "User-Agent" : {
          "type" : "string",
          "index" : "not_analyzed" }}}}}'

and I solved.
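
A quick way to verify the mapping: since the field is now not_analyzed, a term
query for the full address should match (a sketch; the address is made up):

curl -XGET 'http://localhost:9200/index_test/type_test1/_search' -d '
{
  "query" : {
    "term" : { "sender-address" : "test@example-test.com" }
  }
}'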

Bye bye

On Thursday, July 31, 2014 9:55:42 PM UTC+2, Fedele Mantuano wrote:

 Hi,

 I have a question. I have an elasticsearch server and I use logstash to 
 put logs into it, but I noticed that mail addresses are tokenized:
 
 t...@example-test.com became [test, example, test].
 
 Is it possible to stop elasticsearch from tokenizing a field?
 
 I found this guide:
 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html
 
 but I can't get it to work.

 Is this code correct?

 curl -XPUT http://localhost:9200/test_index/ -d '
 {
   "settings": {
     "analysis" : {
       "filter" : {
         "word_filter" : {
           "type" : "word_delimiter",
           "generate_word_parts" : false,
           "generate_number_parts" : false,
           "split_on_numerics" : false,
           "split_on_case_change" : false,
           "preserve_original" : true
         }
       }
     }
   }
 }'


 Thanks


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9fa76939-84ec-41c4-a47f-a0b3543a77a0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Diagnosing a slow query

2014-07-31 Thread Otis Gospodnetic
ORs certainly tend to be slower than simpler/shorter term queries, but I'd 
suspect the cross-DC part, because your index is tiny and your servers are 
beefy and plentiful.  Maybe you can look at your network and query metrics 
and correlate a drop or spike in traffic or packet loss with slow queries?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

tel: +1 347 480 1610   fax: +1 718 679 9190



On Thursday, July 31, 2014 7:49:43 PM UTC+2, Christopher Ambler wrote:

 It has been suggested that what I'm seeing is a CPU-bound issue in that 
 the large number of OR directives in our query could make many of these 
 queries take a long time.

 As I'm not an expert on crafting queries, any expert opinions?

 Because I'm feeling pretty good about my configuration about now...


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/18093462-aa38-4f6b-835f-0f6578dd69dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Recommendations needed for large ELK system design

2014-07-31 Thread Otis Gospodnetic
You can further simplify your architecture by using rsyslog with 
omelasticsearch instead of LS.

This might be handy: 
http://blog.sematext.com/2013/07/01/recipe-rsyslog-elasticsearch-kibana/
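
For example, a minimal omelasticsearch sketch for rsyslog.conf (the index and
type names are assumptions for illustration; other parameters keep their
defaults):

module(load="omelasticsearch")
# ship every message to a local elasticsearch over the bulk API
*.* action(type="omelasticsearch"
           server="localhost"
           serverport="9200"
           searchIndex="system-logs"
           searchType="events"
           bulkmode="on")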

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

tel: +1 347 480 1610   fax: +1 718 679 9190


On Friday, August 1, 2014 12:57:26 AM UTC+2, Mark Walkom wrote:

 1 - Curator FTW.
 2 - Masters handle cluster state, shard allocation and a whole bunch of 
 other stuff around managing the cluster and its members and data. A node 
 with both master and data set to false is considered a search node. But the 
 role of being a master is not onerous, so it made sense for us to double up 
 the roles. We then just round robin any queries to these three masters.
 3 - Yes, but it's entirely dependent on your environment. If you're 
 happy with that and you can get the go-ahead then see where it takes you.
 4 - Quorum is automatic and having the n/2+1 means that a majority of 
 nodes will have to take part in an election, which reduces the possibility 
 of split brain. If you set the discovery settings then you are also 
 essentially setting the quorum settings.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com javascript:
 web: www.campaignmonitor.com


 On 31 July 2014 22:27, Alex alex@gmail.com javascript: wrote:

 Hello Mark,

 Thank you for your reply, it certainly helps to clarify many things.

 Of course I have some new questions for you!

1. I haven't looked into it much yet but I'm guessing Curator can 
handle different index naming schemes. E.g. logs-2014.06.30 and 
stats-2014.06.30. We'd actually be wanting to store the stats data for 2 
years and logs for 90 days so it would indeed be helpful to split the data 
into different index sets. Do you use Curator?

2. You say that you have 3 masters that also handle queries... but 
I thought all masters did was handle queries? What is a master node that 
*doesn't* handle queries? Should we have "search load balancer" nodes? 
AKA not master and not data nodes.

3. In the interests of reducing the number of node combinations for 
us to test out would you say, then, that 3 master (and query(??)) only 
nodes, and the 6 1TB data only nodes would be good?

4. Quorum and split brain are new to me. This webpage about split brain 
(http://blog.trifork.com/2013/10/24/how-to-avoid-the-split-brain-problem-in-elasticsearch/) 
recommends setting *discovery.zen.minimum_master_nodes* equal 
to *N/2 + 1*. This formula is similar to the one given in the 
documentation for quorum 
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency): 
"index operations only succeed if a quorum (replicas/2+1) of active shards 
are available". I completely understand the split brain issue, but not 
quorum. Is quorum handled automatically or should I change some settings? 

 Thanks again for your help, we appreciate your time and knowledge!
 Regards,
 Alex


 On Thursday, 31 July 2014 05:57:35 UTC+1, Mark Walkom wrote:

 1 - Looks ok, but why two replicas? You're chewing up disk for what 
 reason? Extra comments below.
 2 - It's personal preference really and depends on how your end points 
 send to redis.
 3 - 4GB for redis will cache quite a lot of data if you're only doing 50 
 events p/s (ie hours or even days based on what I've seen).
 4 - No, spread it out to all the nodes. More on that below though.
 5 - No it will handle that itself. Again, more on that below though.

 Suggestions;
 Set your indexes to (factors of) 6 shards, ie one per node, it spreads 
 query performance. I say factors of in that you can set it to 12 shards 
 per index to start and easily scale the node count and still spread the 
 load.
 Split your stats and your log data into different indexes, it'll make 
 management and retention easier.
 You can consider a master only node or (ideally) three that also handle 
 queries.
 Preferably have an uneven number of master eligible nodes, whether you 
 make them VMs or physicals, that way you can ensure quorum is reached with 
 minimal fuss and stop split brain.
 If you use VMs for master + query nodes then you might want to look at 
 load balancing the queries via an external service.

 To give you an idea, we have a 27 node cluster - 3 masters that also 
 handle queries and 24 data nodes. Masters are 8GB with small disks, data 
 nodes are 60GB (30 heap) and 512GB disk.
 We're running with one replica and have 11TB of logging data. At a high 
 level we're running out of disk more than heap or CPU and we're very write 
 heavy, with an average of 1K events p/s and comparatively minimal reads.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: 

Re: Diagnosing a slow query

2014-07-31 Thread Christopher Ambler
Suspecting this, we tried taking things down to a single server and still 
got exactly the same response.

That said, optimizing the query to get rid of some of those ORs has helped, 
so I think that's the path we're taking.
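
As an illustration of that kind of rewrite (a sketch only; the field name and
values are made up, not taken from the actual query), a long chain of OR'd
term queries can often be collapsed into a single terms filter, which skips
scoring and can be cached:

curl -XPOST 'localhost:9200/myindex/_search' -d '
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : { "terms" : { "status" : [ "new", "open", "pending" ] } }
    }
  }
}'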

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/17cdcddd-fa02-40bf-b59c-dfee613c3a84%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Cluster making

2014-07-31 Thread arshpreet singh
Hi, I am new to clusters. At first I was using elasticsearch on a single 4 GB
machine, but now I want to scale it out. I am using Rocks Linux for building
the cluster. I have installed the frontend but am having problems forming the
cluster. I am following the exact tutorial provided on the Rocks Linux website,
but the frontend does not detect the node.
Is there a problem with my DHCP configuration?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAAstK2FJ-MhHyarbxX21OdPh5i5SBdJUyx4aa7YaTgRApQd1gg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Cluster making

2014-07-31 Thread arshpreet singh
On 1 Aug 2014 06:57, Mark Walkom ma...@campaignmonitor.com wrote:

 It will help if you link to the tutorial page, as this doesn't really
make much sense.

Thanks for the quick reply. I have installed Rocks on the frontend and am now
following the documentation for installing the cluster.

http://www.rocksclusters.org/roll-documentation/base/5.5/install-compute-nodes.html

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAAstK2F0iKFgyy_JFL_%3DyQBbpwJc4q3S1-pvVMhKaF1LUVQ63Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Cluster making

2014-07-31 Thread Mark Walkom
I don't think you need this - ES handles clustering by itself.
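
For the ES side, a minimal sketch of elasticsearch.yml (host names are
placeholders): give the nodes the same cluster name and, if multicast is not
available, list them explicitly:

cluster.name: my-cluster
# disable multicast discovery and point the nodes at each other
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["node1.example.com", "node2.example.com"]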

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 1 August 2014 11:38, arshpreet singh arsh...@gmail.com wrote:


 On 1 Aug 2014 06:57, Mark Walkom ma...@campaignmonitor.com wrote:
 
  It will help if you link to the tutorial page, as this doesn't really
 make much sense.

 Thanks for quick reply. I have installed rocks on frontend and now
 following the documentation for cluster installing.


 http://www.rocksclusters.org/roll-documentation/base/5.5/install-compute-nodes.html

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAAstK2F0iKFgyy_JFL_%3DyQBbpwJc4q3S1-pvVMhKaF1LUVQ63Q%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAAstK2F0iKFgyy_JFL_%3DyQBbpwJc4q3S1-pvVMhKaF1LUVQ63Q%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YUOi9cxzeUJ-9tod2x0%2BBpAtg%2ByHG23j%2BcTLqqyygtew%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


index size impact on search performance?

2014-07-31 Thread Wang Yong
Hi folks, I have an index storing lots of time-series data. The data are put
into the index by:

curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
{
  "val": 23,
  "timestamp": 1406822400
}'

And the only thing I search in this index is a histogram facet over a very short
time range, like “the last 5 minutes”. I found that the performance was pretty
good at first, but when the index got bigger, the performance dropped to an
unacceptable level. Checking the output of iostat suggests that IO may be the
bottleneck.

My question is: even though I only facet over a very short time range, why does
the size of the index have such a big impact on the performance of that query?
Do I have to use daily indices, just like logstash?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic Search losing data not in _source on _update

2014-07-31 Thread Jordan Reiter
Is there any way to do this so it can be stored but I don't get it
when I pull in the _source record? Even extracted text is going to be
huge when you're talking about 20-30+ page documents.

On Thu, Jul 31, 2014 at 10:34 AM, David Pilato da...@pilato.fr wrote:
 In your case, it's not. Because you excluded the attachment field.

 If you are a Java developer, you could easily use Tika directly in your own
 code and send to elasticsearch only the extracted content and not the binary
 file.
 In that case, you could remove mapper attachment plugin.

 If not, I think you need to send again the full JSON document, including the
 binary file.
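
As an illustration of that suggestion, a minimal Tika sketch (the file name is
made up; tika-core plus tika-parsers, or tika-app, must be on the classpath):

import org.apache.tika.Tika;
import java.io.File;

public class Extract {
    public static void main(String[] args) throws Exception {
        Tika tika = new Tika();
        // extract plain text from the binary file before indexing
        String content = tika.parseToString(new File("report.pdf"));
        // index `content` as an ordinary string field instead of a base64 blob
        System.out.println(content.length() + " chars extracted");
    }
}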

 --
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr


 On 31 July 2014 at 16:32:04, Jordan Reiter (jordantheco...@gmail.com) wrote:

 So I guess using updates is not a good idea for records with file
 attachments.
 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.

 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/972f1b41-d10b-4ebc-8ac2-c83b80891924%40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.

 --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/26HBTz6XKgM/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/etPan.53da5401.684a481a.f0d0%40MacBook-Air-de-David.local.

 For more options, visit https://groups.google.com/d/optout.



-- 
Jordan Reiter
AACE - Association for the Advancement of Computing in Education
Email: jor...@aace.org | Website: www.aace.org | +1.267.438.2388

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAD4hTsUW6VZ2dhRjqyMaRV5uiTGYuAWiz4Z%3D0y0dtbozJeHMLA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elastic Search losing data not in _source on _update

2014-07-31 Thread Jordan Reiter
Never mind, a little googling answered that question:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/root-object.html#source-field
"In a search request, you can ask for only certain fields by
specifying the _source parameter in the request body."

That neatly resolves my issue!
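
For example, a minimal sketch (index, type, and field name are assumed from
earlier in this thread):

curl -XGET 'localhost:9200/myindex/doc/_search' -d '
{
  "_source" : { "exclude" : [ "attachment" ] },
  "query" : { "match_all" : {} }
}'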

It does mean I'm going to have to change my mapping and, probably,
re-index my entire collection.

Thanks for your help
Jordan

On Thu, Jul 31, 2014 at 10:21 PM, Jordan Reiter jor...@aace.org wrote:
 Is there any way to do this so it can be stored but I don't get it
 when I pull in the _source record? Even extracted text is going to be
 huge when you're talking about 20-30+ page documents.

 On Thu, Jul 31, 2014 at 10:34 AM, David Pilato da...@pilato.fr wrote:
 In your case, it's not. Because you excluded the attachment field.

 If you are a Java developer, you could easily use Tika directly in your own
 code and send to elasticsearch only the extracted content and not the binary
 file.
 In that case, you could remove mapper attachment plugin.

 If not, I think you need to send again the full JSON document, including the
 binary file.

 --
 David Pilato | Technical Advocate | Elasticsearch.com
 @dadoonet | @elasticsearchfr


 On 31 July 2014 at 16:32:04, Jordan Reiter (jordantheco...@gmail.com) wrote:

 So I guess using updates is not a good idea for records with file
 attachments.
 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.

 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/972f1b41-d10b-4ebc-8ac2-c83b80891924%40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.

 --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/26HBTz6XKgM/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/etPan.53da5401.684a481a.f0d0%40MacBook-Air-de-David.local.

 For more options, visit https://groups.google.com/d/optout.



 --
 Jordan Reiter
 AACE - Association for the Advancement of Computing in Education
 Email: jor...@aace.org | Website: www.aace.org | +1.267.438.2388



-- 
Jordan Reiter
AACE - Association for the Advancement of Computing in Education
Email: jor...@aace.org | Website: www.aace.org | +1.267.438.2388

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAD4hTsWbKFb2iA9x-ezsz-EiY8j1gH%2BfkMpGv-khQnyUqv%3DqzA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: index size impact on search performance?

2014-07-31 Thread David Pilato
Well. I guess it depends on your query. What does it look like?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 1 August 2014 at 04:14, Wang Yong cnwangy...@gmail.com wrote:

Hi folks, I have an index storing lots of time-series data. The data are put
into the index by:

curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
{
  "val": 23,
  "timestamp": 1406822400
}'

And the only thing I search in this index is a histogram facet over a very short
time range, like “the last 5 minutes”. I found that the performance was pretty
good at first, but when the index got bigger, the performance dropped to an
unacceptable level. Checking the output of iostat suggests that IO may be the
bottleneck.

My question is: even though I only facet over a very short time range, why does
the size of the index have such a big impact on the performance of that query?
Do I have to use daily indices, just like logstash?
-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0D3FCF70-13F5-4E92-8407-47E907736F79%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Re: index size impact on search performance?

2014-07-31 Thread Mark Walkom
If you're using time-series data then it makes sense to use time-based
indexes.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 1 August 2014 12:43, David Pilato da...@pilato.fr wrote:

 Well. I guess it depends on your query. What does it look like?

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


 On 1 August 2014 at 04:14, Wang Yong cnwangy...@gmail.com wrote:

 Hi folks, I have an index storing lots of time-series data. The data are
 put into the index by:

 curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
 {
   "val": 23,
   "timestamp": 1406822400
 }'

 And the only thing I search in this index is a histogram facet over a very
 short time range, like “the last 5 minutes”. I found that the performance was
 pretty good at first, but when the index got bigger, the performance
 dropped to an unacceptable level. Checking the output of iostat suggests
 that IO may be the bottleneck.

 My question is: even though I only facet over a very short time range, why
 does the size of the index have such a big impact on the performance of that
 query? Do I have to use daily indices, just like logstash?

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com
 https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/0D3FCF70-13F5-4E92-8407-47E907736F79%40pilato.fr
 https://groups.google.com/d/msgid/elasticsearch/0D3FCF70-13F5-4E92-8407-47E907736F79%40pilato.fr?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YEg4XVNGAN%2BL8P6QHKjFn_9JTqkJE7TtMHh8fPqZhFxg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


stop types for tokenizer filter?

2014-07-31 Thread David Weinstein
Hi group,

I want to be able to specify what `types` to use as stop words. So:

curl -XGET '192.168.42.43:9200/_analyze?tokenizer=uax_url_email&char_filter=html_strip&filter=keyword' -d 'http://www.google.com hello'

{"tokens":[{"token":"http://www.google.com","start_offset":0,"end_offset":21,"type":"<URL>","position":1},{"token":"hello","start_offset":22,"end_offset":27,"type":"<ALPHANUM>","position":2}]}
 


I'd like to filter out the <ALPHANUM> tokens and keep the <URL> token.


I didn't see a token filter that would do this... did I miss something?


Regards,


David

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3934b9de-c111-47d6-a204-d918c6beab7f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


extracted text visibility from a Tika-processed attachment

2014-07-31 Thread Erik Paulson
Hello -

I'm using Elasticsearch 1.2.1, with mapper-attachments-2.0.0. I'm a little
baffled by how to surface the text that Tika extracts from a PDF into the
structured document that ES is storing.

Long story short, with a trivial PDF file with one line of text, I'm
getting something like this:

{
  "_index" : "test",
  "_type" : "doc",
  "_id" : "1",
  "_score" : 0.067124054,
  "fields" : {
    "my_attachment.date" : [ "2014-07-31T23:29:45.000Z" ],
    "my_attachment.keywords" : [ "TestKeyword1, TestKeyword2" ],
    "my_attachment.title" : [ "Untitled" ]
  }
}


When what I want is this (with the content of the file included):

{
  "_index" : "test",
  "_type" : "doc",
  "_id" : "1",
  "_score" : 0.067124054,
  "fields" : {
    "my_attachment.date" : [ "2014-07-31T23:29:45.000Z" ],
    "my_attachment.keywords" : [ "TestKeyword1, TestKeyword2" ],
    "my_attachment.title" : [ "Untitled" ],
    "my_attachment.file" : "This is the easiest PDF ever."
  }
}


A somewhat related question: I'm also a bit confused as to the difference
between the fields from the attachment, and other fields in my document
that I'm storing in my _source. If I ask for the attachment fields, I don't
get anything else I stored in the document; if I don't ask for any fields,
I get everything from _source. Is there a way I can make the
my_attachment.* fields and the "Thing" field I store in my document
co-equal? I think what I want is for the my_attachment fields to show up
without having to explicitly ask for them.

My sample PDF documents are here:

http://pages.cs.wisc.edu/~epaulson/simplepdfs/Untitled1.pdf

http://pages.cs.wisc.edu/~epaulson/simplepdfs/Untitled2.pdf

And my curl/shell is below, followed by the sample output of a run.

curl -X DELETE localhost:9200/test
curl -X PUT localhost:9200/test

curl -X PUT localhost:9200/test/doc/_mapping -d '
{
  "doc" : {
    "properties" : {
      "my_attachment" : {
        "type" : "attachment",
        "fields" : {
          "title" : { "store" : "yes" },
          "date" : { "store" : "yes" },
          "author" : { "store" : "yes" },
          "keywords" : { "store" : "yes" },
          "content_type" : { "store" : "yes" },
          "content_length" : { "store" : "yes" },
          "language" : { "store" : "yes" },
          "file" : { "store" : "yes", "term_vector" : "with_positions_offsets" }
        }
      }
    }
  }
}'

echo
echo "Uploading a PDF with 'This is the easiest PDF ever'"
coded=`cat simple/Untitled1.pdf | base64`
json="{\"Thing\":\"first\",\"my_attachment\":\"${coded}\"}"
echo $json > json.file
curl -X PUT 'localhost:9200/test/doc/1?refresh=true' -d @json.file
rm json.file

echo
echo "Uploading a PDF with 'This is the second easiest PDF ever'"
coded=`cat simple/Untitled2.pdf | base64`
json="{\"Thing\": \"followup\", \"my_attachment\":\"${coded}\"}"
echo $json > json.file
curl -X PUT 'localhost:9200/test/doc/2?refresh=true' -d @json.file
rm json.file

echo
echo "Querying: Should get two hits"
curl -X POST 'localhost:9200/test/doc/_search?pretty=true' -d '{
  "fields": ["title", "author", "date", "file", "keywords"],
  "query" : { "match" : { "_all" : "easiest" } }
}'
echo
echo
echo "Querying: Should get one hit"
curl -X POST 'localhost:9200/test/doc/_search?pretty=true' -d '{
  "fields": "*",
  "query" : { "match" : { "_all" : "second" } }
}'
echo
echo
echo "Directly loading object 1"
echo
curl 'localhost:9200/test/doc/1'
echo

And the output:

{"acknowledged":true}{"acknowledged":true}{"acknowledged":true}

Uploading a PDF with 'This is the easiest PDF ever'

{"_index":"test","_type":"doc","_id":"1","_version":1,"created":true}

Uploading a PDF with 'This is the second easiest PDF ever'

{"_index":"test","_type":"doc","_id":"2","_version":1,"created":true}

Querying: Should get two hits

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.067124054,
    "hits" : [ {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "2",
      "_score" : 0.067124054,
      "fields" : {
        "my_attachment.date" : [ "2014-07-31T21:48:21.000Z" ],
        "my_attachment.keywords" : [ "" ],
        "my_attachment.title" : [ "Untitled" ]
      }
    }, {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "1",
      "_score" : 0.067124054,
      "fields" : {
        "my_attachment.date" : [ "2014-07-31T23:29:45.000Z" ],
        "my_attachment.keywords" : [ "TestKeyword1, TestKeyword2" ],
        "my_attachment.title" : [ "Untitled" ]
      }
    } ]
  }
}


Querying: Should get one hit

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.067124054,
    "hits" : [ {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "2",
      "_score" : 0.067124054,
      "fields" : {
        "my_attachment.content_type" : [ "application/pdf" ],
        "my_attachment.keywords" : [ "" ],
        "my_attachment.title" : [ "Untitled" ],

 

Re: index size impact on search performance?

2014-07-31 Thread Wang Yong
Thank you David.

Most of my queries look like:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "range": {
          "timestamp": {
            "from": 1403567280,
            "to": 1403567340,
            "include_lower": true,
            "include_upper": false
          }
        }
      }
    }
  },
  "facets" : {
    "val" : {
      "statistical" : {
        "field" : "val"
      }
    }
  }
}






Sent from Surface





From: David Pilato
Sent: ‎Friday‎, ‎August‎ ‎1‎, ‎2014 ‎10‎:‎43‎ ‎AM
To: elasticsearch@googlegroups.com





Well. I guess it depends on your query. What does it look like?


--

David ;-)

Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs





On 1 August 2014 at 04:14, Wang Yong cnwangy...@gmail.com wrote:





Hi folks, I have an index storing lots of time-series data. The data are put
into the index by:

curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
{
  "val": 23,
  "timestamp": 1406822400
}'

And the only thing I search in this index is a histogram facet over a very short
time range, like “the last 5 minutes”. I found that the performance was pretty
good at first, but when the index got bigger, the performance dropped to an
unacceptable level. Checking the output of iostat suggests that IO may be the
bottleneck.

My question is: even though I only facet over a very short time range, why does
the size of the index have such a big impact on the performance of that query?
Do I have to use daily indices, just like logstash?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0D3FCF70-13F5-4E92-8407-47E907736F79%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/53db1703.a567440a.7a19.36b5%40mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Re: index size impact on search performance?

2014-07-31 Thread Wang Yong
Thank you Mark,

By “time based indexes”, do you mean creating one index every day? If 
I index my data in this way, I have to specify which index to search when 
creating the query in my Java client, based on the “from” and “to”.
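
One common way to handle that (a sketch; index names and dates are made up):
derive the daily index names from the query's time range and list them,
comma-separated, in the search URL:

# a range spanning June 23-24 only needs those two daily indices
curl -XGET 'localhost:9200/testindex-2014.06.23,testindex-2014.06.24/_search' -d '
{
  "query" : {
    "range" : { "timestamp" : { "from" : 1403567280, "to" : 1403567340 } }
  }
}'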






Sent from Surface





From: Mark Walkom
Sent: ‎Friday‎, ‎August‎ ‎1‎, ‎2014 ‎11‎:‎02‎ ‎AM
To: elasticsearch@googlegroups.com





If you're using time series data then it makes sense to use time based indexes.




Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com



On 1 August 2014 12:43, David Pilato da...@pilato.fr wrote:



Well. I guess it depends on your query. What does it look like?


--

David ;-)

Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs





Le 1 août 2014 à 04:14, Wang Yong cnwangy...@gmail.com a écrit :







Hi folks, I have an index storing lots of time-series data. The data are put
into the index by:

curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d '
{
  "val": 23,
  "timestamp": 1406822400
}'

And the only thing I search in this index is a histogram facet over a very short
time range, like “the last 5 minutes”. I found that the performance was pretty
good at first, but when the index got bigger, the performance dropped to an
unacceptable level. Checking the output of iostat suggests that IO may be the
bottleneck.

My question is: even though I only facet over a very short time range, why does
the size of the index have such a big impact on the performance of that query?
Do I have to use daily indices, just like logstash?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/00fd01cfad2e%2457938420%2406ba8c60%24%40gmail.com.
For more options, visit https://groups.google.com/d/optout.



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0D3FCF70-13F5-4E92-8407-47E907736F79%40pilato.fr.


For more options, visit https://groups.google.com/d/optout.



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YEg4XVNGAN%2BL8P6QHKjFn_9JTqkJE7TtMHh8fPqZhFxg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/53db1a2f.8d5b460a.30e5.41d5%40mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Unable to install plugin

2014-07-31 Thread sahil
Hey, I'm new to ES. I'm running elasticsearch as a service and installed it 
using this guide:

Run Elastic Search as a service on linux 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service.html#setup-service

I'm unable to find any bin/plugin script and hence can't install 
plugins.

A little help would be really appreciated.
Thanks
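
For reference, a sketch assuming the DEB/RPM package layout described in that
guide (the plugin name is just an example): the plugin script lives under
/usr/share/elasticsearch rather than a local bin/ directory:

sudo /usr/share/elasticsearch/bin/plugin --install mobz/elasticsearch-head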

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2697075a-ead1-4b78-adc6-9bb86a096b42%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.