Re: A simple example of how to use the Java API?

2015-03-24 Thread David Pilato
Did you look at the answers branch?

 Answers are on answers branch: 
 https://github.com/elasticsearchfr/hands-on/tree/answers
 



David
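
(For anyone who just wants a minimal, self-contained starting point before working
through the hands-on exercises, here is a sketch of indexing and fetching one document
with the 1.x Java API; the cluster name and the "blog"/"post" index and type are made
up for illustration.)

import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class SimpleIndexExample {
    public static void main(String[] args) {
        // Connect to a local node over the transport protocol (port 9300).
        // "elasticsearch" is the default cluster name; adjust to your cluster.
        Client client = new TransportClient(ImmutableSettings.settingsBuilder()
                .put("cluster.name", "elasticsearch").build())
                .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));

        // Index one document into a hypothetical "blog" index, type "post", id "1".
        client.prepareIndex("blog", "post", "1")
                .setSource("{\"title\":\"hello\",\"body\":\"world\"}")
                .execute().actionGet();

        // Fetch it back by id and print the stored source.
        GetResponse got = client.prepareGet("blog", "post", "1")
                .execute().actionGet();
        System.out.println(got.getSourceAsString());

        client.close();
    }
}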

 On 24 March 2015 at 12:58, 'newbo' via elasticsearch 
 elasticsearch@googlegroups.com wrote:
 
 Hey David,
 
 I'm looking at the code now and I see lots of TODOs and nulls, but no actual 
 indexing.  Perhaps that code needs to be cleaned up?
 
 On Thursday, November 15, 2012 at 9:11:02 AM UTC-5, David Pilato wrote:
 
 Hi Ryan,
 
 Have a look at this: https://github.com/elasticsearchfr/hands-on
 Answers are on answers branch: 
 https://github.com/elasticsearchfr/hands-on/tree/answers
 
 Some slides (in french but you can understand code) are here: 
 http://www.slideshare.net/mobile/dadoonet/hands-on-lab-elasticsearch
 
 I'm waiting for ES team to merge my Java tutorial: 
 https://github.com/elasticsearch/elasticsearch.github.com/pull/321
 It will look like this: 
 https://github.com/dadoonet/elasticsearch.github.com/blob/9f6dae922247a81f1c72c77769d1dce3ad09dca5/tutorials/_posts/2012-11-07-using-java-api-basics.markdown
 
 
 HTH
 
 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
 


ElasticSearch aggregation

2015-03-24 Thread Raghav salotra

Hi All,
I have an index which contains details of job openings, with fields like 
[create_date, Open_positions, skils_required, project_start_date, project_end_date].

My goal is to find the top 10 skills per quarter over a year. A skill counts 
as a top skill if it has a higher number of openings in a particular quarter 
and the number of openings does not decrease in consecutive quarters.
This is really urgent; any help will be highly appreciated.

Regards,
Raghav Salotra
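
(For reference, a rough sketch of the usual starting point for this kind of question:
a quarterly date_histogram with a terms sub-aggregation ordered by the sum of openings.
The field names are taken from the post, the "jobs" index name is hypothetical, and the
"not decreasing in consecutive quarters" condition would still have to be checked on
the client side.)

GET /jobs/_search
{
  "size": 0,
  "aggs": {
    "per_quarter": {
      "date_histogram": { "field": "create_date", "interval": "quarter" },
      "aggs": {
        "top_skills": {
          "terms": {
            "field": "skils_required",
            "size": 10,
            "order": { "openings": "desc" }
          },
          "aggs": {
            "openings": { "sum": { "field": "Open_positions" } }
          }
        }
      }
    }
  }
}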



Re: encountered monitor.jvm warning

2015-03-24 Thread Jason Wee
what is the new gen setting? how much is the system memory in total?
how many cpu cores in the node? what query do you use?

jason

On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain
huss...@novacom.mygbiz.com wrote:
 Hi all,

 we today got (for the first time) warning messages which seem to indicate a
 memory problem:
 [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger]
 [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total
 [5m]/[16.7m], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young]
 [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old]
 [6.9gb]-[3.7gb]/[8.5gb]}
 [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger]
 [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total
 [18.4s]/[58s], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young]
 [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old]
 [6.9gb]-[3.7gb]/[8.5gb]}
 [2015-03-24 09:08:15,372][WARN ][monitor.jvm  ] [Danger]
 [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], total
 [1.4s]/[16.7m], memory [3.7gb]-[5gb]/[9.8gb], all_pools {[young]
 [6.1mb]-[2.7mb]/[1.1gb]}{[survivor] [0b]-[149.7mb]/[149.7mb]}{[old]
 [3.7gb]-[4.9gb]/[8.5gb]}
 [2015-03-24 09:08:18,192][WARN ][monitor.jvm  ] [Danger]
 [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], total
 [1.4s]/[16.7m], memory [5.8gb]-[6.2gb]/[9.8gb], all_pools {[young]
 [845.4mb]-[1.2mb]/[1.1gb]}{[survivor] [149.7mb]-[149.7mb]/[149.7mb]}{[old]
 [4.9gb]-[6gb]/[8.5gb]}
 [2015-03-24 09:08:21,506][WARN ][monitor.jvm  ] [Danger]
 [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], total
 [1.2s]/[16.7m], memory [7gb]-[7.3gb]/[9.8gb], all_pools {[young]
 [848.6mb]-[2.1mb]/[1.1gb]}{[survivor] [149.7mb]-[149.7mb]/[149.7mb]}{[old]
 [6gb]-[7.2gb]/[8.5gb]}

 We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g, other
 settings are defaults. From previous posts related to this issue, it is said
 that field data cache may be a problem.

 Requesting /_nodes/stats/indices/fielddata says:
 {
cluster_name: my_cluster,
nodes: {
   ILUggMfTSvix8Kc0nfNVAw: {
  timestamp: 1427188716203,
  name: Danger,
  transport_address: inet[/192.168.110.91:9300],
  host: xxx,
  ip: [
 inet[/192.168.110.91:9300],
 NONE
  ],
  indices: {
 fielddata: {
memory_size_in_bytes: 6484,
evictions: 0
 }
  }
   }
}
 }

 Running top results in:
 PID USER  PR  NI  VIRT  RES  SHR S   %CPU %MEMTIME+  COMMAND
 12735 root   20   0 15.8g  10g0 S 74 13.2   2485:26 java

 Any ideas what to do? If possible I would rather avoid increasing ES_HEAP as
 there isn't that much free memory left on the host.

 Regards,

 Abid



Re: corrupted shard after optimize

2015-03-24 Thread mjdude5
Thanks for the CheckIndex info, that worked!  It looks like only one of the 
segments in that shard has issues:

  1 of 20: name=_1om docCount=216683
codec=Lucene3x
compound=false
numFiles=10
size (MB)=5,111.421
diagnostics = {os=Linux, os.version=3.5.7, mergeFactor=7, source=merge, 
lucene.version=3.6.0 1310449 - rmuir - 2012-04-06 11:31:16, os.arch=amd64, 
mergeMaxNumSegments=-1, java.version=1.6.0_26, java.vendor=Sun Microsystems 
Inc.}
no deletions
test: open reader.OK
test: check integrity.OK
test: check live docs.OK
test: fields..OK [31 fields]
test: field norms.OK [20 fields]
test: terms, freq, prox...ERROR: java.lang.AssertionError: 
index=216690, numBits=216683
java.lang.AssertionError: index=216690, numBits=216683
at org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:252)
at 
org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:932)
at 
org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1325)
at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:631)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)
test: stored fields...OK [3033562 total field count; avg 14 fields 
per doc]
test: term vectorsOK [0 total vector count; avg 0 term/freq 
vector fields per doc]
test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 
0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
FAILED
WARNING: fixIndex() would remove reference to this segment; full 
exception:
java.lang.RuntimeException: Term Index test failed
at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:646)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)

This is on ES 1.3.4, but the index I was running optimize on was likely 
created back in 0.9 or 1.0.

On Tuesday, March 24, 2015 at 5:27:04 AM UTC-4, Michael McCandless wrote:

 Hmm, not good.

 Which version of ES?  Do you have a full stack trace for the exception?

 To run CheckIndex you need to add all ES jars to the classpath.  It's 
 easiest to just use a wildcard for this, e.g.:

   java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex 
 ...

 Make sure you have the double quotes so the shell does not expand that 
 wildcard!

 Mike McCandless

 On Mon, Mar 23, 2015 at 9:50 PM, mjd...@gmail.com javascript: wrote:

 I did an optimize on this index and it looks like it caused a shard to 
 become corrupted.  Or maybe the optimize just brought the shard corruption 
 to light?

 On the node that reported the corrupted shard I tried shutting it down, 
 moving the shard out and then restarting. Unfortunately the next node that 
 got that shard then started with the same corruption issues.  The errors:

 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN 
 ][indices.cluster  ] [Meteorite II] [1-2013][0] failed to start 
 shard
 Mar 24 01:40:17 localhost 
 org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: 
 [1-2013][0] failed to fetch index version after copying it over
 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN 
 ][cluster.action.shard ] [Meteorite II] [1-2013][0] sending failed 
 shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], 
 indexUUID [_na_], reason [Failed to start shard, message 
 [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index 
 version after copying it over]; nested: CorruptIndexException[[1-2013][0] 
 Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: 
 CorruptIndexException[docs out of order (1493829 <= 1493874 ) (docOut: 
 org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]

 I tried using CheckIndex, but had this issue:

 java.lang.IllegalArgumentException: A SPI class of type 
 org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist. 
 You need to add the corresponding JAR file supporting this SPI to your 
 classpath.The current classpath supports the following names: [Pulsing41, 
 SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41, 
 FST41, FSTOrd41, Lucene40, Lucene41]

 When running with:

 java -cp 
 /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar
  
 -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex

 I'm not a java programmer so after I tried other classpath combinations I 
 was out of ideas.


 Any tips?  Looking at _cat/shards the replica is currently marked 
 unassigned while the primary is initializing.  Thanks!


Re: PUT gzipped data into elasticsearch

2015-03-24 Thread joergpra...@gmail.com
Are you using Java API?

Jörg

On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus matusmar...@gmail.com
wrote:

 Hi,
 some of our data are big ones (1 - 10 MB), and if there are milions of
 those, it causes us trouble in our internal network.
 We would like to compress these data in generation time, and send over the
 network to elasticsearch in compressed state. I tried to find some
 solution, but usually I found only retrieving gzipped data. Is there a way,
 how to put compressed data into elasticsearch?

 Thanks,
 Marcel



Install Kibana4 as a plugin

2015-03-24 Thread DK
Hi,

Is it possible to install Kibana 4 as a plugin to Elasticsearch 1.5.0?

If so do I need to configure anything afterwards?




Nested list aggregation

2015-03-24 Thread Vasily Kirichenko
I have documents like this in my index:

{
    "time": "2015-03-24T08:24:55.9056988",
    "msg": {
        "corrupted": false,
        "stat": {
            "fileSize": 10186,
            "stages": [
                {
                    "stage": "queued",
                    "duration": 420
                },
                {
                    "stage": "validate",
                    "duration": 27
                },
                {
                    "stage": "cacheFile",
                    "duration": 87
                },
                {
                    "stage": "sendResult",
                    "duration": 1332
                }
            ]
        }
    }
}

I'd like to calculate sum(msg.stat.stages.duration) grouped by 
msg.stat.stages.stage.
I tried the following:

{
  "size": 0,
  "aggs": {
    "1": {
      "terms": { "field": "msg.stat.stages.stage" },
      "aggs": {
        "2": {
          "nested": { "path": "stat.stages" },
          "aggs": {
            "3": {
              "sum": {
                "field": "stat.stages.duration"
              }
            }
          }
        }
      }
    }
  },
  "query": {
    "match_all": {}
  }
}

and got:

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "1": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "2": {
                        "doc_count": 0
                    },
                    "key": "cachefile",
                    "doc_count": 1
                },
                {
                    "2": {
                        "doc_count": 0
                    },
                    "key": "queued",
                    "doc_count": 1
                },
                {
                    "2": {
                        "doc_count": 0
                    },
                    "key": "sendresult",
                    "doc_count": 1
                },
                {
                    "2": {
                        "doc_count": 0
                    },
                    "key": "validate",
                    "doc_count": 1
                }
            ]
        }
    }
}

which is not what I expected. Any ideas?

Thanks!
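
(For later readers: the empty inner buckets suggest msg.stat.stages is not mapped as a
nested type, and the nested path is also missing the leading "msg.". The usual approach
is to map msg.stat.stages as nested (which requires a reindex) and then open the nested
scope first, with the full path, putting the terms and sum aggregations inside it. A
sketch, with the index name as a placeholder:)

GET /index/_search
{
  "size": 0,
  "aggs": {
    "stages": {
      "nested": { "path": "msg.stat.stages" },
      "aggs": {
        "per_stage": {
          "terms": { "field": "msg.stat.stages.stage" },
          "aggs": {
            "total_duration": { "sum": { "field": "msg.stat.stages.duration" } }
          }
        }
      }
    }
  }
}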



Filter cache and field data caches

2015-03-24 Thread Vladi Feigin
Hi All,

We observe relatively high heap usage: 70-75% per node. Our heap size 
is 32G. Each node has 2 shards (one primary, one replica), and each shard is 
approximately 43G.
The field data cache is ~22G and there are zero evictions from this cache.
The filter cache is ~28G and there are many evictions.
Is it normal to have such large cache sizes? We think it affects our query 
performance. With empty caches, query latency is 2-3 minutes for the first 
run; the second run takes 10 seconds (and after some time it is back to 2 minutes).
Are those 2-3 minutes the time Elasticsearch needs to rebuild/refresh the 
caches? Why is it so long?
Please help us understand how Elasticsearch builds the filter cache and field 
data cache: when are these caches populated, and how do they affect query 
performance?

Thank you in advance,
Vladi
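
(Not from the original thread, but for reference: in 1.x the field data cache is
unbounded by default, which matches the zero evictions reported above, and both caches
can be capped with static node settings in elasticsearch.yml; the values below are
arbitrary examples.)

# elasticsearch.yml - static settings, a node restart is required
indices.fielddata.cache.size: 30%    # cap the field data cache (default: unbounded)
indices.cache.filter.size: 10%       # cap the filter cache (default: 10% of heap)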




Re: PUT gzipped data into elasticsearch

2015-03-24 Thread Marcel Matus
Hi Joe,
no, we post data into elasticsearch using Logstash. Actually, the complete
way is: APP -> Logstash -> RMQ -> Logstash -> ES.
And when I include F5 VIP's, there are 6 services on the way to process
these data...

Marcel


On Tue, Mar 24, 2015 at 1:50 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Are you using Java API?

 Jörg

 On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus matusmar...@gmail.com
 wrote:

 Hi,
 some of our data are big ones (1 - 10 MB), and if there are milions of
 those, it causes us trouble in our internal network.
 We would like to compress these data in generation time, and send over
 the network to elasticsearch in compressed state. I tried to find some
 solution, but usually I found only retrieving gzipped data. Is there a way,
 how to put compressed data into elasticsearch?

 Thanks,
 Marcel





-- 
Marcel Matus



Query containing multiple fields

2015-03-24 Thread 4m7u1
Hi, this is a part of a document _source in my elasticsearch index:

"_source": {
    "InvoiceId": "a",
    "InvoiceNum": "b",
    "LastUpdatedBy": "FINUSER1",
    "SetOfBooksId": 1,
    "LastUpdateDate": "2015-03-20 18:31:16.592",
    "VendorId": 5,
    "InvoiceAmount": 1200,
    "InvoiceCurrencyCode": "USD",
    "AmountPaid": "NotFound",
    "PaymentCurrencyCode": "USD"
}
I need to query on multiple inputs using the Java API; for example, say I 
have provided InvoiceId and InvoiceNum as "a" and "b".
How can I query using these multiple inputs? I want a single response for 
this.
One thing I came across was  

MultiSearchResponse sr = 
    client.prepareMultiSearch().add(srb1).add(srb2).execute().actionGet();

where

SearchRequestBuilder srb1 = 
    client.prepareSearch("dbdata2").setTypes("invoices2").setQuery(QueryBuilders.matchQuery("InvoiceId", "a"));
SearchRequestBuilder srb2 = 
    client.prepareSearch("dbdata2").setTypes("invoices2").setQuery(QueryBuilders.matchQuery("InvoiceNum", "b"));

but these return individual results. Is there any way I can get a single 
output?
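
(One way to get a single result set is to combine both conditions in one bool query
instead of running two separate searches; a sketch using the index, type and field
names from the post, and assuming "client" is the same Client instance as above:)

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

BoolQueryBuilder query = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("InvoiceId", "a"))
        .must(QueryBuilders.matchQuery("InvoiceNum", "b"));

SearchResponse response = client.prepareSearch("dbdata2")
        .setTypes("invoices2")
        .setQuery(query)
        .execute().actionGet();
// Every hit now satisfies both conditions, so a single response covers both inputs.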
   
   



Re: A simple example of how to use the Java API?

2015-03-24 Thread 'newbo' via elasticsearch
Hey David,

I'm looking at the code now and I see lots of TODOs and nulls, but no 
actual indexing.  Perhaps that code needs to be cleaned up?

On Thursday, November 15, 2012 at 9:11:02 AM UTC-5, David Pilato wrote:


 Hi Ryan,

 Have a look at this: https://github.com/elasticsearchfr/hands-on
 Answers are on answers branch: 
 https://github.com/elasticsearchfr/hands-on/tree/answers

 Some slides (in french but you can understand code) are here: 
 http://www.slideshare.net/mobile/dadoonet/hands-on-lab-elasticsearch

 I'm waiting for ES team to merge my Java tutorial: 
 https://github.com/elasticsearch/elasticsearch.github.com/pull/321
 It will look like this: 
 https://github.com/dadoonet/elasticsearch.github.com/blob/9f6dae922247a81f1c72c77769d1dce3ad09dca5/tutorials/_posts/2012-11-07-using-java-api-basics.markdown


 HTH

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs





Re: Change/update of the datatype of latitude and longitude from double to geo_points

2015-03-24 Thread David Pilato
1: you can not « update » an existing field and transform it from double to 
geo_points
2: you want coordinates to be your geo_point, not latitude and longitude each 
being a geo_point.
3: your put mapping request is incorrect. Have a look at 
http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
 
http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
The type should be part of your JSON content.
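
(In other words, something along these lines - a sketch only. It has to go into a new
or re-created index, since the existing double fields cannot be converted in place, and
a geo_point field expects "lat"/"lon" keys in the documents rather than
"latitude"/"longitude".)

PUT /visit.hippo_en/visit:PlaceDocument/_mapping
{
  "visit:PlaceDocument": {
    "properties": {
      "location": {
        "properties": {
          "coordinates": {
            "properties": {
              "coordinates": { "type": "geo_point" }
            }
          }
        }
      }
    }
  }
}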

HTH

-- 
David Pilato - Developer | Evangelist 
elastic.co
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs





 On 24 March 2015 at 10:39, nidhi chawhan nidhi.chaw...@gmail.com wrote:
 
 Hi Team,
 
 I wanted to update the datatype of latitude and longitude fields from double 
 to geo_points .
 
 Index name  :- visit.hippo_en
 Type :- visit:PlaceDocument
 
 PUT /visit.hippo_en/visit:PlaceDocument/_mapping
 { 
   "location": {
     "properties": {
       "coordinates": {
         "properties": {
           "coordinates": {
             "properties": { 
               "latitude": {
                 "type": "geo_point"
               },
               "longitude": {
                 "type": "geo_point"
               }
             }
           }
         }
       }
     }
   }
 }
 
 then getting below error 
 
 
 {
    "error": "MapperParsingException[Root type mapping not empty after 
 parsing! Remaining fields:   [location : 
 {properties={coordinates={properties={coordinates={properties={latitude={type=geo_point}, longitude={type=geo_point}}}]]",
    "status": 400
 }
 
 when trying to update through java API
 below is the code
 
 CreateIndex createIndex = new CreateIndex.Builder(visitHippoIndexName + 
 UNDERSCORE + lang).build();
   jestClient.execute(createIndex);
 
   String geoPointMapping = \n +
   \location\: {\n +
 \properties\ :{\n +
   \coordinates\: {\n +
   \properties\:{\n +
 \coordinates\: {\n +
\properties\:{ \n +
   \latitude\: {\n 
 +
   \type\: 
 \geo_point\\n +
  \n +
},\n +
\longitude\: 
 {\n +
   \type\: 
 \geo_point\\n +
  \n +
}\n +
}\n +
 }\n +
\n +
 }\n +
   }\n +
 }\n +
  };
 
   if(nodeType.equals(visit:PlaceDocument)){
   PutMapping keyMapping = new 
 PutMapping.Builder(visitHippoIndexName + UNDERSCORE + lang, nodeType, 
 geoPointMapping).build();
   jestClient.execute(keyMapping);
   }
 
 
   Index index = new 
 Index.Builder(source).index(visitHippoIndexName + UNDERSCORE + 
 lang).type(nodeType).id(handleNode.getIdentifier()).build();
   jestClient.execute(index);
 
 It gives me below exception
 
 2015-03-24 10:27:13,787][DEBUG][action.admin.indices.mapping.put] [Carnivore] 
 failed to put mappings on indices [[visit.hi
 po_en]], type [visit:PlaceDocument]
 rg.elasticsearch.index.mapper.MapperParsingException: malformed mapping no 
 root object found
at 
 org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:338)
at 
 org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:183)
at 
 org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:444)
at 
 org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:505)
at 
 org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
at 
 org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(Prio
 itizedEsThreadPoolExecutor.java:153)

PUT gzipped data into elasticsearch

2015-03-24 Thread Marcel Matus
Hi,
some of our documents are big (1 - 10 MB), and when there are millions of 
them it causes trouble in our internal network.
We would like to compress the data at generation time and send it over the 
network to elasticsearch in compressed form. I tried to find a solution, but 
usually I only found ways to retrieve gzipped data. Is there a way to put 
compressed data into elasticsearch?

Thanks,
Marcel



Re: Bulk UDP deprecated?

2015-03-24 Thread joergpra...@gmail.com
See https://github.com/elastic/elasticsearch/pull/7595

This feature is rarely used. Removing it will help reduce the moving parts
of Elasticsearch and focus on the core.

If there is demand, I can jump in and move the bulk UDP code to a
community-supported plugin for ES 2.0

For syslog, I have implemented a similar plugin:

https://github.com/jprante/elasticsearch-syslog

Jörg

On Tue, Mar 24, 2015 at 12:14 PM, Steve Freegard s...@fsl.com wrote:

 Hi all,

 Can anyone tell me why the Bulk UDP service is being deprecated?

 I'm using it currently as it's a handy way to send non-critical data to ES
 without the associated overheads of doing it via the standard API and
 without affecting the performance of the application sending the data (as
 it's essentially fire-and-forget).

 Kind regards,
 Steve.



Re: corrupted shard after optimize

2015-03-24 Thread Michael McCandless
Hmm, not good.

Which version of ES?  Do you have a full stack trace for the exception?

To run CheckIndex you need to add all ES jars to the classpath.  It's
easiest to just use a wildcard for this, e.g.:

  java -cp "/path/to/es-install/lib/*" org.apache.lucene.index.CheckIndex
...

Make sure you have the double quotes so the shell does not expand that
wildcard!

Mike McCandless

On Mon, Mar 23, 2015 at 9:50 PM, mjdu...@gmail.com wrote:

 I did an optimize on this index and it looks like it caused a shard to
 become corrupted.  Or maybe the optimize just brought the shard corruption
 to light?

 On the node that reported the corrupted shard I tried shutting it down,
 moving the shard out and then restarting. Unfortunately the next node that
 got that shard then started with the same corruption issues.  The errors:

 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN
 ][indices.cluster  ] [Meteorite II] [1-2013][0] failed to start
 shard
 Mar 24 01:40:17 localhost
 org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException:
 [1-2013][0] failed to fetch index version after copying it over
 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN
 ][cluster.action.shard ] [Meteorite II] [1-2013][0] sending failed
 shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING],
 indexUUID [_na_], reason [Failed to start shard, message
 [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index
 version after copying it over]; nested: CorruptIndexException[[1-2013][0]
 Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by:
 CorruptIndexException[docs out of order (1493829 <= 1493874 ) (docOut:
 org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]

 I tried using CheckIndex, but had this issue:

 java.lang.IllegalArgumentException: A SPI class of type
 org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist.
 You need to add the corresponding JAR file supporting this SPI to your
 classpath.The current classpath supports the following names: [Pulsing41,
 SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41,
 FST41, FSTOrd41, Lucene40, Lucene41]

 When running with:

 java -cp
 /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar
 -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex

 I'm not a java programmer so after I tried other classpath combinations I
 was out of ideas.


 Any tips?  Looking at _cat/shards the replica is currently marked
 unassigned while the primary is initializing.  Thanks!



encountered monitor.jvm warning

2015-03-24 Thread Abid Hussain
Hi all,

we today got (for the first time) warning messages which seem to indicate a 
memory problem:
[2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
[gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total 
[5m]/[16.7m], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] 
[853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] 
[6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
[gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total 
[18.4s]/[58s], memory [7.9gb]->[3.7gb]/[9.8gb], all_pools {[young] 
[853.9mb]->[6.1mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] 
[6.9gb]->[3.7gb]/[8.5gb]}
[2015-03-24 09:08:15,372][WARN ][monitor.jvm  ] [Danger] 
[gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], total 
[1.4s]/[16.7m], memory [3.7gb]->[5gb]/[9.8gb], all_pools {[young] 
[6.1mb]->[2.7mb]/[1.1gb]}{[survivor] [0b]->[149.7mb]/[149.7mb]}{[old] 
[3.7gb]->[4.9gb]/[8.5gb]}
[2015-03-24 09:08:18,192][WARN ][monitor.jvm  ] [Danger] 
[gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], total 
[1.4s]/[16.7m], memory [5.8gb]->[6.2gb]/[9.8gb], all_pools {[young] 
[845.4mb]->[1.2mb]/[1.1gb]}{[survivor] 
[149.7mb]->[149.7mb]/[149.7mb]}{[old] [4.9gb]->[6gb]/[8.5gb]}
[2015-03-24 09:08:21,506][WARN ][monitor.jvm  ] [Danger] 
[gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], total 
[1.2s]/[16.7m], memory [7gb]->[7.3gb]/[9.8gb], all_pools {[young] 
[848.6mb]->[2.1mb]/[1.1gb]}{[survivor] 
[149.7mb]->[149.7mb]/[149.7mb]}{[old] [6gb]->[7.2gb]/[8.5gb]}

We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g, other 
settings are defaults. From previous posts related to this issue, it is 
said that field data cache may be a problem.

Requesting /_nodes/stats/indices/fielddata says:
{
   "cluster_name": "my_cluster",
   "nodes": {
      "ILUggMfTSvix8Kc0nfNVAw": {
         "timestamp": 1427188716203,
         "name": "Danger",
         "transport_address": "inet[/192.168.110.91:9300]",
         "host": "xxx",
         "ip": [
            "inet[/192.168.110.91:9300]",
            "NONE"
         ],
         "indices": {
            "fielddata": {
               "memory_size_in_bytes": 6484,
               "evictions": 0
            }
         }
      }
   }
}

Running top results in:
PID USER  PR  NI  VIRT  RES  SHR S   %CPU %MEMTIME+  COMMAND
12735 root   20   0 15.8g  10g0 S 74 13.2   2485:26 java

Any ideas what to do? If possible I would rather avoid increasing ES_HEAP 
as there isn't that much free memory left on the host.

Regards,

Abid



Change/update of the datatype of latitude and longitude from double to geo_points

2015-03-24 Thread nidhi chawhan
Hi Team,

I wanted to update the datatype of latitude and longitude fields from 
double to geo_points .

Index name  :- visit.hippo_en
Type :- visit:PlaceDocument

PUT /visit.hippo_en/visit:PlaceDocument/_mapping
{ 
  "location": {
    "properties": {
      "coordinates": {
        "properties": {
          "coordinates": {
            "properties": { 
              "latitude": {
                "type": "geo_point"
              },
              "longitude": {
                "type": "geo_point"
              }
            }
          }
        }
      }
    }
  }
}

then I get the error below:

{
   "error": "MapperParsingException[Root type mapping not empty after 
parsing! Remaining fields:   [location : 
{properties={coordinates={properties={coordinates={properties={latitude={type=geo_point}, longitude={type=geo_point}}}]]",
   "status": 400
}

When trying to update through the Java API, below is the code:

CreateIndex createIndex = new CreateIndex.Builder(visitHippoIndexName + 
UNDERSCORE + lang).build();
jestClient.execute(createIndex);

String geoPointMapping = \n +
\location\: {\n +
  \properties\ :{\n +
\coordinates\: {\n +
\properties\:{\n +
  \coordinates\: {\n +
 \properties\:{ \n +
\latitude\: {\n +
\type\: \geo_point\\n +
   \n +
 },\n +
 \longitude\: {\n +
\type\: \geo_point\\n +
   \n +
 }\n +
 }\n +
  }\n +
 \n +
  }\n +
}\n +
  }\n +
   };

if(nodeType.equals(visit:PlaceDocument)){
PutMapping keyMapping = new PutMapping.Builder(visitHippoIndexName + 
UNDERSCORE + lang, nodeType, geoPointMapping).build();
jestClient.execute(keyMapping);
}


Index index = new Index.Builder(source).index(visitHippoIndexName + 
UNDERSCORE + lang).type(nodeType).id(handleNode.getIdentifier()).build();
jestClient.execute(index);

It gives me below exception

[2015-03-24 10:27:13,787][DEBUG][action.admin.indices.mapping.put] 
[Carnivore] failed to put mappings on indices [[visit.hippo_en]], 
type [visit:PlaceDocument]
org.elasticsearch.index.mapper.MapperParsingException: malformed mapping no 
root object found
   at 
org.elasticsearch.index.mapper.DocumentMapperParser.extractMapping(DocumentMapperParser.java:338)
   at 
org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:183)
   at 
org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:444)
   at 
org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:505)
   at 
org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
   at 
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(Prio
itizedEsThreadPoolExecutor.java:153)
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)

I am giving the root object, which is location in this case,
but I still get the error.

Please advise: how can I change the double fields to geo_points?


Thanks,
Nidhi





Re: Unable to preserve special characters in search results of ElasticSearch.

2015-03-24 Thread Muddadi Hemaanusha
If I escape the character '/', I get data that is irrelevant to the search 
term.

e.g. if a/b, a/c, a/d, a(b), a(c), a-b, ab is my data,

my requirement is: if I enter the term a( then I should get only a( as a 
result, but not ab, a-b, ...

I would like to get the results without escaping; the results need to 
preserve the special character '/'. Any query which achieves that would be 
helpful. The search term needs to be searched across different fields, but it 
does not need to match in all fields; a match in any one field is enough.

On Monday, March 23, 2015 at 7:50:08 PM UTC+5:30, Periyandavar wrote:

 Hi 
  I think You need to escape those spl char in search string, like 
 { 
   query: { 
 bool: { 
   must: [ 
 { 
   query_string: { 
 fields: [ 
   msg 
 ], 
 query: a\\/ 
   } 
 } 
   ] 
 } 
   } 
 } 



 -- 
 View this message in context: 
 http://elasticsearch-users.115913.n3.nabble.com/Unable-to-preserve-special-characters-in-search-results-of-ElasticSearch-tp4072409p4072418.html
  
 Sent from the ElasticSearch Users mailing list archive at Nabble.com. 




Bulk UDP deprecated?

2015-03-24 Thread Steve Freegard

Hi all,

Can anyone tell me why the Bulk UDP service is being deprecated?

I'm using it currently as it's a handy way to send non-critical data to 
ES without the associated overheads of doing it via the standard API and 
without affecting the performance of the application sending the data 
(as it's essentially fire-and-forget).


Kind regards,
Steve.



Query String query is not accepting analyzer which defined in settings..

2015-03-24 Thread Muddadi Hemaanusha
Hi Team,


I kept the following settings for an index facebook..

PUT /facebook
{
  "settings": {
    "analysis": {
      "analyzer": {
        "wordAnalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "word_delimiter_for_phone"
          ]
        }
      },
      "filter": {
        "word_delimiter_for_phone": {
          "type": "word_delimiter",
          "catenate_all": true,
          "generate_number_parts": false,
          "split_on_case_change": false,
          "generate_word_parts": false,
          "split_on_numerics": false,
          "preserve_original": true
        }
      }
    }
  },
  "mapping": {
    "face": {
      "properties": {
        "id": {
          "type": "string"
        },
        "name": {
          "type": "string"
        }
      }
    }
  }
}

I would like to retrieve the data while preserving the special characters in 
the search term as they are (the terms contain special characters like -, /, (, ) );
for this I have used the above analyzer and filter settings.

When I am using this query:

GET facebook/face/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": [
              "id", "name"
            ],
            "query": "a/",
            "analyzer": "wordAnalyzer"
          }
        }
      ]
    }
  }
}

It shows this error:

"error": "SearchPhaseExecutionException[Failed to execute phase [query], 
all shards failed; shardFailures {[0IYtu1aLQ2qxfwLnbCFTHg][facebook][0]: 
SearchParseException[[facebook][0]: from[-1],size[-1]: Parse Failure 
[Failed to parse source 

I would like to preserve the special characters while searching. How can I 
achieve this? Do I need to change my query, or is there another query that can 
search the term across different fields while preserving the special characters?
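
(Two usual ways around the parse failure: escape the slash in the query_string query,
as in the earlier thread, or switch to a multi_match query, which has no special query
syntax to trip over. A sketch of the latter with the same fields and analyzer:)

GET facebook/face/_search
{
  "query": {
    "multi_match": {
      "query": "a/",
      "fields": ["id", "name"],
      "analyzer": "wordAnalyzer"
    }
  }
}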











question for tokenizer and synonym

2015-03-24 Thread Prateek Asthana

I am having requirement similar to below:

   - search for chevy should map to search for chevrolet. I am using 
   synonyms to accomplish this. 
   - search for BMW 6 series should map to 61i, 62i, 63i and so on. 
   I am planning to use synonyms to accomplish this too. 

The field that contains one of these values (chevrolet or 61i or 62i or 63i) has 
this index analyzer:

"synonym_analyzer": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": [ "lowercase", "synonym_filter" ]
}

My issue with the above standard tokenizer is that I cannot map "BMW 6 series" to 
61i, 62i, 63i. I cannot use the keyword tokenizer, as the search could be 
"chevy red" or "BMW 6 series v8". 
I am open to changing tokenizers and other elements if required. Please advise.
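
(The synonym token filter can match multi-word inputs even with the standard tokenizer,
so both requirements can be expressed as rules in a single filter; a sketch, with a
hypothetical "cars" index and the expansions guessed from the post:)

PUT /cars
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "chevy => chevrolet",
            "bmw 6 series => 61i, 62i, 63i"
          ]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "synonym_filter"]
        }
      }
    }
  }
}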

Thanks,
Prateek
 





Help with mapping update

2015-03-24 Thread David Kleiner
Greetings,

I am trying to update mapping to make certain fields not_analyzed.  The 
index gets created from default template, now I'm trying to add the 
following mapping to it (long) and get an error.  The index is empty. 

I'm probably missing something very obvious. 

Thank you,

David




$ curl -XPUT localhost:9200/backend/mytype/_mapping -d mytype_mapping.json
{"error":"RemoteTransportException[[ /  Luis 
Buñuel][inet[/10.XX.XX.XX:9333]][indices:admin/mapping/put]]; nested: 
NullPointerException; ","status":500}



updated mapping:
{
  "mappings" : { 

  {"mytype" : {
    "_ttl" : {
      "enabled" : true,
      "default" : 17280
    },
    "properties" : {
      "@timestamp" : {
        "type" : "date",
        "format" : "dateOptionalTime"
      },
      "@version" : {
        "type" : "string"
      },
      "file" : {
        "type" : "string"
      },
      "host" : {
        "type" : "string"
      },
      "json" : {
        "properties" : {
          ".buffer.size" : {
            "type" : "long"
          },
          "action" : {
            "type" : "string", "index" : "not_analyzed"
          },

[ ]
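
(Two things worth checking with a request like the one above: curl only sends a file's
contents when the -d argument starts with @, i.e. -d @mytype_mapping.json, and the body
of a PUT .../_mapping request should start at the type, without a surrounding "mappings"
wrapper. A sketch with the names from the post:)

curl -XPUT 'localhost:9200/backend/mytype/_mapping' -d '{
  "mytype" : {
    "_ttl" : { "enabled" : true, "default" : 17280 },
    "properties" : {
      "action" : { "type" : "string", "index" : "not_analyzed" }
    }
  }
}'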



Re: Help with mapping update

2015-03-24 Thread Mark Walkom
Try deleting the index and recreating using the updated mapping?

Also as a suggestion, avoid using TTL, it's very resource intensive and if
you are using time based data then you should use time based indices.

On 25 March 2015 at 08:28, David Kleiner david.klei...@gmail.com wrote:

 Greetings,

 I am trying to update mapping to make certain fields not_analyzed.  The
 index gets created from default template, now I'm trying to add the
 following mapping to it (long) and get an error.  The index is empty.

 I'm probably missing something very obvious.

 Thank you,

 David


 

 $ curl -XPUT localhost:9200/backend/mytype/_mapping -d mytype_mapping.json
 {error:RemoteTransportException[[ /  Luis
 Buñuel][inet[/10.XX.XX.XX:9333]][indices:admin/mapping/put]]; nested:
 NullPointerException; ,status:500}(dk)dk-

 

 updated mapping:
 {
  mappings : {

  {mytype : {
  _ttl : {
  enabled : true,
  default : 17280
  },
  properties : {
  @timestamp : {
  type : date,
  format : dateOptionalTime
  },
  @version : {
  type : string
  },
  file : {
  type : string
  },
  host : {
  type : string
  },
  json : {
  properties : {
  .buffer.size : {
  type : long
  },
  action : {
  type : string,index : not_analyzed
  },

 [ ]



Re: ElasticSearch data directory on Windows 7

2015-03-24 Thread Ravi Gupta
I understand the links are valid, but that doesn't explain the directory size. 
How can 2.2 GB of data shrink to 77 KB? (This led me to believe that ES is using 
some other location to put its data.)
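
(One way to confirm which directory a running node actually uses, rather than guessing
from the install layout: the node stats and cat APIs report the data path in use and
the on-disk size per index.)

curl 'localhost:9200/_nodes/stats/fs?pretty'   # the fs section lists the data path(s)
curl 'localhost:9200/_cat/indices?v'           # store.size per index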

On Tuesday, 24 March 2015 18:35:30 UTC+11, Mark Walkom wrote:

 Actually as Windows users don't have an installer they need to use the 
 zip, so the path in the link are valid.

 On 24 March 2015 at 17:38, Ravi Gupta gupta...@gmail.com javascript: 
 wrote:

 @Mark the link you posted is for linux paths. I assumed the windows path would 
 be along similar lines and expected the index data to be under the path below:

 C:\ES home\data\elasticsearch\nodes\0\indices\idd

 I see the path looks like it contains all the index data, but I am puzzled 
 why the size of the data directory is only 77 KB.

 There is a file of size 2 KB and it seems to contain some data that looks 
 encoded (it appears to be my application data):
 C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state

 My question is how 2.2GB of data can be compressed into 2KB, unless I am 
 looking at the wrong directory.




 On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote:

 Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/
 current/setup-dir-layout.html#_zip_and_tar_gz

 On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com wrote:

 Hi ,

 I have downloaded and installed ES on windows 7 and imported 2million+ 
 records into it. (from a csv of size 2.2GB). I can search records using 
 fiddler and it works as expected.

 I was hoping to see size of below directory to increase few GB but the 
 size is only 77 kb.

 C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

 This means ES stores data in some other directory on disk.

 Can you please tell me the directory location where ES stores the data?

 Regards,
 Ravi



Re: Help with mapping update

2015-03-24 Thread David Kleiner
Thank you Mark, I'll keep trying, will also try elasticdump to load custom 
mapping or play with templates. 

Cheers,

David

On Tuesday, March 24, 2015 at 3:44:55 PM UTC-7, Mark Walkom wrote:

 Try deleting the index and recreating using the updated mapping?

 Also as a suggestion, avoid using TTL, it's very resource intensive and if 
 you are using time based data then you should use time based indices.

 On 25 March 2015 at 08:28, David Kleiner david@gmail.com 
 javascript: wrote:

 Greetings,

 I am trying to update mapping to make certain fields not_analyzed.  The 
 index gets created from default template, now I'm trying to add the 
 following mapping to it (long) and get an error.  The index is empty. 

 I'm probably missing something very obvious. 

 Thank you,

 David


 

 $ curl -XPUT localhost:9200/backend/mytype/_mapping -d mytype_mapping.json
 {error:RemoteTransportException[[ /  Luis 
 Buñuel][inet[/10.XX.XX.XX:9333]][indices:admin/mapping/put]]; nested: 
 NullPointerException; ,status:500}(dk)dk-

 

 updated mapping:
 {
  mappings : { 

  {mytype : {
  _ttl : {
  enabled : true,
  default : 17280
  },
  properties : {
  @timestamp : {
  type : date,
  format : dateOptionalTime
  },
  @version : {
  type : string
  },
  file : {
  type : string
  },
  host : {
  type : string
  },
  json : {
  properties : {
  .buffer.size : {
  type : long
  },
  action : {
  type : string,index : not_analyzed
  },

 [ ]



Re: ES: Ways to work with frequently updated fields of document

2015-03-24 Thread Mark Walkom
Your best bet is to try to separate hot and cold/warm data so you are only
updating things which *need* to be updated.
It may make sense to split this out into different systems, but this is
really up to you.
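
(For completeness, the usual way to bump such a counter is a partial update via the
Update API - though under the hood Elasticsearch still re-indexes the whole document,
so this saves network traffic rather than indexing work. A sketch with made-up
index/type names, assuming dynamic scripting is enabled:)

POST /forum/topic/123/_update
{
  "script": "ctx._source.viewCount += 1"
}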

On 25 March 2015 at 04:28, Александр Свиридов ooo_satu...@mail.ru wrote:

 I have a forum, and every topic has a field viewCount - how many times the
 topic was viewed by forum users.

 I wanted all fields of a topic to be taken from ES
 (id, date, title, content and viewCount). However, in that case, after every
 topic view ES must reindex the entire document again - I asked a question about
 partial updates on Stack Overflow -
 http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed
 .

 It means that if a topic is viewed 1000 times, ES will index it 1000 times.
 And if I have a lot of users, many documents will be indexed again and
 again. This is the first strategy.

 The second strategy, as I see it, is to take some fields of the topic from the
 index and some from the database. In that case I take viewCount from the DB.
 However, then I could store all fields in the DB and use the index only as an
 index - to get the ids of the matching topics.

 Are there better strategies? What is the best way to solve such a problem?


 --
 Александр Свиридов



Re: Help with mapping update

2015-03-24 Thread David Kleiner
I got the index template to work, turns out it's really straightforward. I 
now do have .raw fields and all the world is green.  

I'll add the template to my other index so it gets geoip mapping when I 
roll to April. 
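
For anyone following along, a minimal sketch of such a template (the index
pattern and the logstash-style .raw sub-field are assumptions, modelled on the
default Logstash template) looks roughly like this:

curl -XPUT 'localhost:9200/_template/raw_strings' -d '{
  "template" : "logstash-*",
  "mappings" : {
    "_default_" : {
      "dynamic_templates" : [
        {
          "string_fields" : {
            "match" : "*",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "string", "index" : "analyzed",
              "fields" : {
                "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
              }
            }
          }
        }
      ]
    }
  }
}'

Because templates only apply when an index is created, it kicks in with the next
(e.g. monthly) index, which matches the "roll to April" plan above.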

Cheers,

David

On Tuesday, March 24, 2015 at 2:28:38 PM UTC-7, David Kleiner wrote:

 Greetings,

 I am trying to update mapping to make certain fields not_analyzed.  The 
 index gets created from default template, now I'm trying to add the 
 following mapping to it (long) and get an error.  The index is empty. 

 I'm probably missing something very obvious. 

 Thank you,

 David


 

 $ curl -XPUT localhost:9200/backend/mytype/_mapping -d mytype_mapping.json
 {error:RemoteTransportException[[ /  Luis 
 Buñuel][inet[/10.XX.XX.XX:9333]][indices:admin/mapping/put]]; nested: 
 NullPointerException; ,status:500}(dk)dk-

 

 updated mapping:
 {
  mappings : { 

  {mytype : {
  _ttl : {
  enabled : true,
  default : 17280
  },
  properties : {
  @timestamp : {
  type : date,
  format : dateOptionalTime
  },
  @version : {
  type : string
  },
  file : {
  type : string
  },
  host : {
  type : string
  },
  json : {
  properties : {
  .buffer.size : {
  type : long
  },
  action : {
  type : string,index : not_analyzed
  },

 [ ]


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4114d717-ae65-493f-8f7f-c2cec8567e2a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: encountered monitor.jvm warning

2015-03-24 Thread Jason Wee
Few filters should be fine but aggregations... hm

You could take a stack trace and/or heap dump if it happens again and
start to analyze from there. Also try
http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
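
A few quick things to capture next time it happens (assuming the default
host/port, and that the pid is whatever ps shows for the ES java process):

curl 'localhost:9200/_nodes/hot_threads'
curl 'localhost:9200/_nodes/stats/jvm,indices?pretty'
jmap -dump:format=b,file=/tmp/es-heap.hprof <es-pid>   # heap dump for offline analysis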

hth

jason

On Tue, Mar 24, 2015 at 11:59 PM, Abid Hussain
huss...@novacom.mygbiz.com wrote:
 All settings except from ES_HEAP (set to 10 GB) are defaults, so I actually
 am not sure about the new gen setting.

 The host has 80 GB memory in total and 24 CPU cores. All ES indices together
 sum up to ~32 GB where the biggest indices are of size ~8 GB.

 We are using queries mostly together with filters and also aggregations.

 We solved the problem with a restart of the cluster. Are there any
 recommended diagnostics to be performed when this problem occurs next time?

 Regards,

 Abid

 Am Dienstag, 24. März 2015 15:24:43 UTC+1 schrieb Jason Wee:

 what is the new gen setting? how much is the system memory in total?
 how many cpu cores in the node? what query do you use?

 jason

 On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain
 hus...@novacom.mygbiz.com wrote:
  Hi all,
 
  we today got (for the first time) warning messages which seem to
  indicate a
  memory problem:
  [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger]
  [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total
  [5m]/[16.7m], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young]
  [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old]
  [6.9gb]-[3.7gb]/[8.5gb]}
  [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger]
  [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total
  [18.4s]/[58s], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young]
  [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old]
  [6.9gb]-[3.7gb]/[8.5gb]}
  [2015-03-24 09:08:15,372][WARN ][monitor.jvm  ] [Danger]
  [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s],
  total
  [1.4s]/[16.7m], memory [3.7gb]-[5gb]/[9.8gb], all_pools {[young]
  [6.1mb]-[2.7mb]/[1.1gb]}{[survivor] [0b]-[149.7mb]/[149.7mb]}{[old]
  [3.7gb]-[4.9gb]/[8.5gb]}
  [2015-03-24 09:08:18,192][WARN ][monitor.jvm  ] [Danger]
  [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s],
  total
  [1.4s]/[16.7m], memory [5.8gb]-[6.2gb]/[9.8gb], all_pools {[young]
  [845.4mb]-[1.2mb]/[1.1gb]}{[survivor]
  [149.7mb]-[149.7mb]/[149.7mb]}{[old]
  [4.9gb]-[6gb]/[8.5gb]}
  [2015-03-24 09:08:21,506][WARN ][monitor.jvm  ] [Danger]
  [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s],
  total
  [1.2s]/[16.7m], memory [7gb]-[7.3gb]/[9.8gb], all_pools {[young]
  [848.6mb]-[2.1mb]/[1.1gb]}{[survivor]
  [149.7mb]-[149.7mb]/[149.7mb]}{[old]
  [6gb]-[7.2gb]/[8.5gb]}
 
  We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g,
  other
  settings are defaults. From previous posts related to this issue, it is
  said
  that field data cache may be a problem.
 
  Requesting /_nodes/stats/indices/fielddata says:
  {
 cluster_name: my_cluster,
 nodes: {
ILUggMfTSvix8Kc0nfNVAw: {
   timestamp: 1427188716203,
   name: Danger,
   transport_address: inet[/192.168.110.91:9300],
   host: xxx,
   ip: [
  inet[/192.168.110.91:9300],
  NONE
   ],
   indices: {
  fielddata: {
 memory_size_in_bytes: 6484,
 evictions: 0
  }
   }
}
 }
  }
 
  Running top results in:
  PID USER  PR  NI  VIRT  RES  SHR S   %CPU %MEMTIME+  COMMAND
  12735 root   20   0 15.8g  10g0 S 74 13.2   2485:26 java
 
  Any ideas what to do? If possible I would rather avoid increasing
  ES_HEAP as
  there isn't that much free memory left on the host.
 
  Regards,
 
  Abid
 
  --
  You received this message because you are subscribed to the Google
  Groups
  elasticsearch group.
  To unsubscribe from this group and stop receiving emails from it, send
  an
  email to elasticsearc...@googlegroups.com.
  To view this discussion on the web visit
 
  https://groups.google.com/d/msgid/elasticsearch/8c0116c2-309f-4daf-ad76-b12b866f9066%40googlegroups.com.
  For more options, visit https://groups.google.com/d/optout.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/4320ec5a-1dc6-4eec-83c2-d9da618bf957%40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.


Elastic{ON} Sessions Are Online

2015-03-24 Thread Mark Walkom
Hi All,
The content from our recent conference is now available!

Please check out https://www.elastic.co/blog/elasticon-sessions-are-online
for all the information.

Cheers,
Mark

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9zu_2HNTH%2B7Oh2AHcfxN6AE1tt4eS8UOVQnVi4HvKiMw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: corrupted shard after optimize

2015-03-24 Thread mjdude5
Quick followup question, is it safe to run -fix while ES is also running on 
the node?  Understanding that some documents will be lost.
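
(For reference, a -fix pass is the same CheckIndex invocation with -fix appended
and pointed at the shard's Lucene directory -- the path below is only an example
for a packaged install. CheckIndex needs exclusive access to the index files, so
the usual practice is to stop Elasticsearch on that node, run the fix, then start
it back up.)

java -cp "/usr/share/elasticsearch/lib/*" -ea:org.apache.lucene... \
    org.apache.lucene.index.CheckIndex \
    /var/lib/elasticsearch/<cluster_name>/nodes/0/indices/1-2013/0/index -fix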

On Tuesday, March 24, 2015 at 10:24:26 AM UTC-4, mjd...@gmail.com wrote:

 Thanks for the CheckIndex info, that worked!  It looks like only one of 
 the segments in that shard has issues:

   1 of 20: name=_1om docCount=216683
 codec=Lucene3x
 compound=false
 numFiles=10
 size (MB)=5,111.421
 diagnostics = {os=Linux, os.version=3.5.7, mergeFactor=7, 
 source=merge, lucene.version=3.6.0 1310449 - rmuir - 2012-04-06 11:31:16, 
 os.arch=amd64, mergeMaxNumSegments=-1, java.version=1.6.0_26, 
 java.vendor=Sun Microsystems Inc.}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [31 fields]
 test: field norms.OK [20 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: 
 index=216690, numBits=216683
 java.lang.AssertionError: index=216690, numBits=216683
 at org.apache.lucene.util.FixedBitSet.set(FixedBitSet.java:252)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:932)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1325)
 at 
 org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:631)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)
 test: stored fields...OK [3033562 total field count; avg 14 fields 
 per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 
 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at 
 org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:646)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2051)

 This is on ES 1.3.4, but the index I was running optimize on was likely 
 created back in 0.9 or 1.0.

 On Tuesday, March 24, 2015 at 5:27:04 AM UTC-4, Michael McCandless wrote:

 Hmm, not good.

 Which version of ES?  Do you have a full stack trace for the exception?

 To run CheckIndex you need to add all ES jars to the classpath.  It's 
 easiest to just use a wildcard for this, e.g.:

   java -cp /path/to/es-install/lib/* org.apache.lucene.index.CheckIndex 
 ...

 Make sure you have the double quotes so the shell does not expand that 
 wildcard!

 Mike McCandless

 On Mon, Mar 23, 2015 at 9:50 PM, mjd...@gmail.com wrote:

 I did an optimize on this index and it looks like it caused a shard to 
 become corrupted.  Or maybe the optimize just brought the shard corruption 
 to light?

 On the node that reported the corrupted shard I tried shutting it down, 
 moving the shard out and then restarting. Unfortunately the next node that 
 got that shard then started with the same corruption issues.  The errors:

 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN 
 ][indices.cluster  ] [Meteorite II] [1-2013][0] failed to start 
 shard
 Mar 24 01:40:17 localhost 
 org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: 
 [1-2013][0] failed to fetch index version after copying it over
 Mar 24 01:40:17 localhost elasticsearch: [bma.0][WARN 
 ][cluster.action.shard ] [Meteorite II] [1-2013][0] sending failed 
 shard for [1-2013][0], node[ZzXsIZCsTyWD2emFuU0idg], [P], s[INITIALIZING], 
 indexUUID [_na_], reason [Failed to start shard, message 
 [IndexShardGatewayRecoveryException[[1-2013][0] failed to fetch index 
 version after copying it over]; nested: CorruptIndexException[[1-2013][0] 
 Corrupted index [corrupted_OahNymObSTyBzCCPu1FuJA] caused by: 
 CorruptIndexException[docs out of order (1493829 = 1493874 ) (docOut: 
 org.apache.lucene.store.RateLimitedIndexOutput@2901a3e1)]]; ]]

 I tried using CheckIndex, but had this issue:

 java.lang.IllegalArgumentException: A SPI class of type 
 org.apache.lucene.codecs.PostingsFormat with name 'es090' does not exist. 
 You need to add the corresponding JAR file supporting this SPI to your 
 classpath.The current classpath supports the following names: [Pulsing41, 
 SimpleText, Memory, BloomFilter, Direct, FSTPulsing41, FSTOrdPulsing41, 
 FST41, FSTOrd41, Lucene40, Lucene41]

 When running with:

 java -cp 
 /usr/share/elasticsearch/lib/lucene-codecs-4.9.1.jar:/usr/share/elasticsearch/lib/lucene-core-4.9.1.jar
  
 -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex

 I'm not a java programmer so after I tried other classpath combinations 
 I was out of ideas.


 Any tips?  Looking at _cat/shards the replica is currently marked 
 unassigned while the primary is initializing.  Thanks!


ActionRequestValidationException when attempting to set mapping

2015-03-24 Thread Blake McBride
I am attempting to define a mapping.  I am first creating a new index by 
adding a document that doesn't contain anything specified in my 
mapping/schema.  I am using a bulk operation to set the mapping but I get 
an ActionRequestValidationException error each time.  Here is my JS code:

var mappings = {
 body: [
 { mappings: {
 component: {
 properties: {
 classifications: {
 type: string,
 index: not_analyzed
 }
 }
 }
 }}
 ]
};
client.bulk(mappings, function(err, resp){
 // error appears here
});


The exact message I get is:  ActionRequestValidationException[Validation 
Failed: 1: no requests added;]


Any help would sure be appreciated.


Blake McBride
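
For what it's worth, bulk is only for document actions (index/create/update/delete),
which is why the client reports "no requests added" -- a mapping change goes through
the indices API instead. A rough sketch with the JavaScript client (the index name
is hypothetical, and this is untested):

// put a mapping for type "component"; the body is the type mapping itself,
// not wrapped in a "mappings" object
client.indices.putMapping({
  index: 'myindex',        // whatever index the component docs live in
  type: 'component',
  body: {
    component: {
      properties: {
        classifications: {
          type: 'string',
          index: 'not_analyzed'
        }
      }
    }
  }
}, function (err, resp) {
  if (err) console.error(err);
  else console.log(resp);
});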


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7e4580c1-708e-4cf8-a39e-a8bcc314d65b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Sample custom cluster action

2015-03-24 Thread Ali Lotfdar
Hello All,

I would like to create a cluster action (for instance, for searching 
documents), but I could not find any sample showing how to implement a
cluster action. I looked at the cookbook explanation and sample; although 
it was useful, I could not figure out how to do it.

I would appreciate a pointer to a sample or an instruction document.

Regards,
Ali

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d8fea8af-0b95-41cf-b6c2-27c1568b642b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Should I use elasticsearch as a core for faceted navigation-heavy website?

2015-03-24 Thread Dmitry
Hello,

I'm evaluating elasticsearch for use in new project. I can see that it has 
all search features we need. Problem is that after reading documentation 
and forum I still can't understand whether elastic is suitable technology 
for us performance-wise. I'd be very grateful to get your opinion on that.


We're building a directory of businesses, similar to Yelp. We have 5m 
businesses, and main feature of our site is faceted search on different 
facets: geography (tens of thousands geo objects), business type (several 
thousand options), additional services offered by that business (hundreds) 
and so on. So for each request (combination of search parameters) we need 
to get search results, but also what options are available in each facet 
(for example, what business types are located in the selected geography) 
so the user can narrow down his search (example: 
http://take.ms/oAZan). Full text search (by business name for example) is 
used in very small percentage of requests, bulk of requests is exact match 
on one or several facets.

Based on a similar project of ours, we expect 1-5m requests per day. All 
requests are highly diversified: no single page (combination of search 
params) constitutes more than 0,1% of total requests. We expect to be able 
to answer request in 200-300ms, so I guess request to elasticsearch should 
take no more than 100ms.

On our similar project we use big lookup table in database with all 
possible combinations of params mapped to search result count. For each 
request we generate all possible combinations of parameters to refine 
current search and then check lookup table to see if they have any results.


My questions are:

Is Elasticsearch suitable for our purposes? Specifically, are aggregations 
meant to be used in a large number of low-latency requests, or are they more 
of an analytical feature, where response time is not that important? I ask 
because in discussions of aggregation and faceting performance here 
and elsewhere, response times are mentioned in the 1-10s range, which is ok for 
analytics and infrequent searches, but obviously not ok for us.

How hard is it to get the performance we need: 50 rps, 100ms response time for 
search+facets, on some reasonable hardware, taking into account big number 
of possible facet combinations and high diversification of requests? What 
kind of hardware should we expect to handle our loads? I understand that 
these are vague questions, but I just need some approximation. Is it more 
like 1 server with commodity hardware and simple configuration, or more 
like cloud of 10 servers and extensive tuning? For example, our lookup 
table solution works on 1 commodity server with 16gb of ram with almost 
default setup.


Thank you for your responses,
Dmitry

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/807801d4-2926-4e52-9161-dc82d3f33a75%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Grandparents aggregations NullPointer

2015-03-24 Thread Paolo Ciccarese
I've posted the issue here:
https://github.com/elastic/elasticsearch/issues/10158

On Wednesday, March 11, 2015 at 4:42:27 PM UTC-4, Paolo Ciccarese wrote:

 I am using Elasticsearch 1.4.4.

 I am defining a parent-child index with three levels (grandparents) 
 following the instructions in the document:

 https://www.elastic.co/guide/en/elasticsearch/guide/current/grandparents.html

 My structure is
 Continent - Country - Region

 MAPPING
 --
 I create an index with the following mapping:

 curl -XPOST 'localhost:9200/geo' -d'
 {
   mappings: {
 continent: {},
 country: {
   _parent: {
 type: continent 
   }
 },
 region: {
   _parent: {
 type: country 
   }
 }
   }
 }' 

 INDEXING
 --
 I index three entities:

 curl -XPOST 'localhost:9200/geo/continent/europe' -d'
 {
 name:Europe
 }'

 curl -XPOST 'localhost:9200/geo/country/italy?parent=europe' -d'
 {
 name:Italy
 }'

 curl -XPOST 
 'localhost:9200/geo/region/lombardy?parent=italyrouting=europe' -d'
 {
 name:Lombardia
 }'

 QUERY THAT WORKS
 
 If I query and aggregate according to the document everything works fine:

 curl -XGET 'localhost:9200/geo/continent/_search?pretty=true' -d '
 {
 query: {
 has_child: {
 type: country,
 query: {
 has_child: {
 type: region,
 query: {
 match: {
 name: Lombardia
 }
 }
 }
 }
 }
 },
 aggs: {
 country: {
 terms: { 
 field: name
 },
 aggs: {
 countries: {
 children: {
 type: country
 },
 aggs: {
 country_names : {
 terms :  {
 field : country.name
 }
 }   
 }
 }
 }
 }
 }
 }'

 QUERY THAT DOES NOT WORK
 ---
 However, if I try with multi-level aggregations like in:

 curl -XGET 'localhost:9200/geo/continent/_search?pretty=true' -d '
 {
 query: {
 has_child: {
 type: country,
 query: {
 has_child: {
 type: region,
 query: {
 match: {
 name: Lombardia
 }
 }
 }
 }
 }
 },
 aggs: {
 continent_names: {
 terms: { 
 field: name
 },
 aggs: {
 countries: {
 children: {
 type: country
 }, 
 aggs: {
 regions: {
 children: {
 type: region
 }, 
 aggs: {
 region_names : {
 terms :  {
 field : region.name
 }
 }
 }
 }
 }
 }
 }
 }
 }
 }'

 I get back the following

 {

   error : SearchPhaseExecutionException[Failed to execute phase 
 [query], all shards failed; shardFailures 
 {[b5CbW5byQdSSW-rIwta0rA][geo][0]: QueryPhaseExecutionException[[geo][0]: 
 query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]:
  
 Query Failed [Failed to execute main query]]; nested: NullPointerException; 
 }{[b5CbW5byQdSSW-rIwta0rA][geo][1]: QueryPhaseExecutionException[[geo][1]: 
 query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]:
  
 Query Failed [Failed to execute main query]]; nested: NullPointerException; 
 }{[b5CbW5byQdSSW-rIwta0rA][geo][2]: QueryPhaseExecutionException[[geo][2]: 
 query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]:
  
 Query Failed [Failed to execute main query]]; nested: NullPointerException; 
 }{[b5CbW5byQdSSW-rIwta0rA][geo][3]: QueryPhaseExecutionException[[geo][3]: 
 query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]:
  
 Query Failed [Failed to execute main query]]; nested: NullPointerException; 
 }{[b5CbW5byQdSSW-rIwta0rA][geo][4]: QueryPhaseExecutionException[[geo][4]: 
 query[filtered(child_filter[country/continent](filtered(child_filter[region/country](filtered(name:lombardia)-cache(_type:region)))-cache(_type:country)))-cache(_type:continent)],from[0],size[10]:
  
 Query Failed [Failed to execute main query]]; nested: NullPointerException; 
 }],

   status : 500

 }

 The 'grandparents' document says: Querying and aggregating across 
 generations works, as long as you step through each generation.

 How do I get the last query and aggregation to work across 

Re: encountered monitor.jvm warning

2015-03-24 Thread Abid Hussain
All settings except from ES_HEAP (set to 10 GB) are defaults, so I actually 
am not sure about the new gen setting.

The host has 80 GB memory in total and 24 CPU cores. All ES indices 
together sum up to ~32 GB where the biggest indices are of size ~8 GB.

We are using queries mostly together with filters and also aggregations.

We solved the problem with a restart of the cluster. Are there any 
recommended diagnostics to be performed when this problem occurs next time?

Regards,

Abid

Am Dienstag, 24. März 2015 15:24:43 UTC+1 schrieb Jason Wee:

 what is the new gen setting? how much is the system memory in total? 
 how many cpu cores in the node? what query do you use? 

 jason 

 On Tue, Mar 24, 2015 at 5:26 PM, Abid Hussain 
 hus...@novacom.mygbiz.com wrote: 
  Hi all, 
  
  we today got (for the first time) warning messages which seem to 
 indicate a 
  memory problem: 
  [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
  [gc][young][413224][18109] duration [5m], collections [1]/[5.3m], total 
  [5m]/[16.7m], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young] 
  [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old] 
  [6.9gb]-[3.7gb]/[8.5gb]} 
  [2015-03-24 09:08:12,960][WARN ][monitor.jvm  ] [Danger] 
  [gc][old][413224][104] duration [18.4s], collections [1]/[5.3m], total 
  [18.4s]/[58s], memory [7.9gb]-[3.7gb]/[9.8gb], all_pools {[young] 
  [853.9mb]-[6.1mb]/[1.1gb]}{[survivor] [149.7mb]-[0b]/[149.7mb]}{[old] 
  [6.9gb]-[3.7gb]/[8.5gb]} 
  [2015-03-24 09:08:15,372][WARN ][monitor.jvm  ] [Danger] 
  [gc][young][413225][18110] duration [1.4s], collections [1]/[2.4s], 
 total 
  [1.4s]/[16.7m], memory [3.7gb]-[5gb]/[9.8gb], all_pools {[young] 
  [6.1mb]-[2.7mb]/[1.1gb]}{[survivor] [0b]-[149.7mb]/[149.7mb]}{[old] 
  [3.7gb]-[4.9gb]/[8.5gb]} 
  [2015-03-24 09:08:18,192][WARN ][monitor.jvm  ] [Danger] 
  [gc][young][413227][18111] duration [1.4s], collections [1]/[1.8s], 
 total 
  [1.4s]/[16.7m], memory [5.8gb]-[6.2gb]/[9.8gb], all_pools {[young] 
  [845.4mb]-[1.2mb]/[1.1gb]}{[survivor] 
 [149.7mb]-[149.7mb]/[149.7mb]}{[old] 
  [4.9gb]-[6gb]/[8.5gb]} 
  [2015-03-24 09:08:21,506][WARN ][monitor.jvm  ] [Danger] 
  [gc][young][413229][18112] duration [1.2s], collections [1]/[2.3s], 
 total 
  [1.2s]/[16.7m], memory [7gb]-[7.3gb]/[9.8gb], all_pools {[young] 
  [848.6mb]-[2.1mb]/[1.1gb]}{[survivor] 
 [149.7mb]-[149.7mb]/[149.7mb]}{[old] 
  [6gb]-[7.2gb]/[8.5gb]} 
  
  We're using ES 1.4.2 as a single node cluster, ES_HEAP is set to 10g, 
 other 
  settings are defaults. From previous posts related to this issue, it is 
 said 
  that field data cache may be a problem. 
  
  Requesting /_nodes/stats/indices/fielddata says: 
  { 
 cluster_name: my_cluster, 
 nodes: { 
ILUggMfTSvix8Kc0nfNVAw: { 
   timestamp: 1427188716203, 
   name: Danger, 
   transport_address: inet[/192.168.110.91:9300], 
   host: xxx, 
   ip: [ 
  inet[/192.168.110.91:9300], 
  NONE 
   ], 
   indices: { 
  fielddata: { 
 memory_size_in_bytes: 6484, 
 evictions: 0 
  } 
   } 
} 
 } 
  } 
  
  Running top results in: 
  PID USER  PR  NI  VIRT  RES  SHR S   %CPU %MEMTIME+  COMMAND 
  12735 root   20   0 15.8g  10g0 S 74 13.2   2485:26 java 
  
  Any ideas what to do? If possible I would rather avoid increasing 
 ES_HEAP as 
  there isn't that much free memory left on the host. 
  
  Regards, 
  
  Abid 
  
  -- 
  You received this message because you are subscribed to the Google 
 Groups 
  elasticsearch group. 
  To unsubscribe from this group and stop receiving emails from it, send 
 an 
 email to elasticsearc...@googlegroups.com. 
  To view this discussion on the web visit 
  
 https://groups.google.com/d/msgid/elasticsearch/8c0116c2-309f-4daf-ad76-b12b866f9066%40googlegroups.com.
  

  For more options, visit https://groups.google.com/d/optout. 


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4320ec5a-1dc6-4eec-83c2-d9da618bf957%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Kibana fails in talking to Tribe Node

2015-03-24 Thread Abigail
Hi,

I have two ElasticSearch clusters and one tribe node for searching across 
the two clusters. And there is a Kibana 4 instance that connects to the tribe node. What 
I have done are:
1. Created .kibana and kibana-int manually in the cluster
2. Launch kibana
3. Got to the screen Configure an index pattern, hit the create button

Kibana loads then throws MasterNotDiscoveredException[waited for [30s]] 
error.

It seems to be a known issue. I also tried adding tribe.blocks.write: false 
to the tribe node config file without any luck. I am sure saving a dashboard 
will throw the same error. Can anyone confirm whether Kibana can 
talk to a tribe node? And has anyone figured out how to do it?

Thank you!
Abigail

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c0b2a218-572e-4e1f-929e-09852b847185%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Should I use elasticsearch as a core for faceted navigation-heavy website?

2015-03-24 Thread Itamar Syn-Hershko
Short answer: yes. With a properly sharded and scaled-out environment, and
using ES 1.4 or newer, you should be able to get those numbers.
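
For a feel of the request shape (all field names below are invented), a typical
page view is a single filtered query plus one terms aggregation per facet, e.g.:

{
  "size" : 20,
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : [
            { "term" : { "geo_id" : 12345 } },
            { "term" : { "business_type" : 678 } }
          ]
        }
      }
    }
  },
  "aggs" : {
    "business_types" : { "terms" : { "field" : "business_type", "size" : 50 } },
    "services"       : { "terms" : { "field" : "services", "size" : 50 } }
  }
}

With not_analyzed or numeric facet fields and enough memory for field data (or
doc values), requests of this shape are low-latency territory rather than a
pure analytics feature.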

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Tue, Mar 24, 2015 at 5:38 PM, Dmitry dmitry.bit...@gmail.com wrote:

 Hello,

 I'm evaluating elasticsearch for use in new project. I can see that it has
 all search features we need. Problem is that after reading documentation
 and forum I still can't understand whether elastic is suitable technology
 for us performance-wise. I'd be very grateful to get your opinion on that.


 We're building a directory of businesses, similar to Yelp. We have 5m
 businesses, and main feature of our site is faceted search on different
 facets: geography (tens of thousands geo objects), business type (several
 thousand options), additional services offered by that business (hundreds)
 and so on. So for each request (combination of search parameters) we need
 to get search results, but also what options are available in each facet
 (for example, what business types are located are in selected geography)
 for user to be able to narrow down his search (example:
 http://take.ms/oAZan). Full text search (by business name for example) is
 used in very small percentage of requests, bulk of requests is exact match
 on one or several facets.

 Based on the similar our project we expect 1-5m requests per day. All
 requests are highly diversified: no single page (combination of search
 params) constitutes more than 0,1% of total requests. We expect to be able
 to answer request in 200-300ms, so I guess request to elasticsearch should
 take no more than 100ms.

 On our similar project we use big lookup table in database with all
 possible combinations of params mapped to search result count. For each
 request we generate all possible combinations of parameters to refine
 current search and then check lookup table to see if they have any results.


 My questions are:

 Is elastic search suitable for our purposes? Specifically, are
 aggregations meant to be used in large number of low-latency requests, or
 are they more like analytical feature, where response time is not that
 important? I ask that because in discussions of aggregation and faceting
 performance here and elsewhere response times are mentioned in 1-10s range,
 which is ok for analytics and infrequent searches, but obviously on ok for
 us.

 How hard it is to get performance we need: 50 rps, 100ms response time for
 search+facets, on some reasonable hardware, taking into account big number
 of possible facet combinations and high diversification of requests? What
 kind of hardware should we expect to handle our loads? I understand that
 these are vague questions, but I just need some approximation. Is it more
 like 1 server with commodity hardware and simple configuration, or more
 like cloud of 10 servers and extensive tuning? For example, our lookup
 table solution works on 1 commodity server with 16gb of ram with almost
 default setup.


 Thank you for your responses,
 Dmitry

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/807801d4-2926-4e52-9161-dc82d3f33a75%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/807801d4-2926-4e52-9161-dc82d3f33a75%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsOOMoJUDmdcH-zsL4EHymR6SUw6T3dKSs_UHfVq0mtCw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Help with mapping update

2015-03-24 Thread Mark Walkom
Are you using KB4 or 3?
If 4 then you need to refresh the fields under the index settings.

On 25 March 2015 at 11:06, David Kleiner david.klei...@gmail.com wrote:

 My confusion just leveled-up :)

 So the mapping is updated, all string fields now have .raw copies but I
 don't see them in Kibana.   Perhaps my understanding of this mechanism is
 incomplete and I need to manually copy fields to their .raw counterparts in
 logstash so that ES query returns documents with .raw fields in them..

 And I did remove the TTL so my cluster is happy again.

 On Tuesday, March 24, 2015 at 4:36:05 PM UTC-7, David Kleiner wrote:

 I got the index template to work, turns out it's really straightforward.
 I now do have .raw fields and all the world is green.

 I'll add the template to my other index so it gets geoip mapping when I
 roll to April.

 Cheers,

 David

 [trimmed]

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/e09155d6-42fd-4342-a53e-201e3f638038%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/e09155d6-42fd-4342-a53e-201e3f638038%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X93PAVFmAfjODzhD93Hn3XJCKuuyco4NU5vXczc3aGyhg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Install Kibana4 as a plugin

2015-03-24 Thread Mark Walkom
No you cannot, it runs on its own port.

On 24 March 2015 at 23:56, DK desmond.kirr...@gmail.com wrote:

 Hi,

 Is it possible to install Kibana 4 as a plugin to Elasticsearch 1.5.0?

 If so do I need to configure anything afterwards?


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/ee0c2a1c-b16f-4f58-929a-8330fc1c4aa4%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/ee0c2a1c-b16f-4f58-929a-8330fc1c4aa4%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9y8fnOrfTCr%3DuULF4C9Js0ntJ_yUxZKLJEL%2ByZA4dOxg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Is java elasticsearch a joke?

2015-03-24 Thread Sai Asuka
The problem is in either case, I see no new index or anything.

On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote:

 I am using elasticsearch java client, indexing a document looks simple.
 I know 100% my cluster is running, and my name is create.

 Node node  = nodeBuilder().clusterName(mycluster).node();

 Client client = node.client();

 String json = { +

 \user\:\kimchy\, +

 \postDate\:\2013-01-30\, +

 \message\:\trying out Elasticsearch\ +

 };


 IndexResponse response = client.prepareIndex(twitter, tweet)

 .setSource(json)

 .execute()

 .actionGet();

 I read somewhere that you should shove .client(true), but I get:


 org.elasticsearch.discovery.MasterNotDiscoveredException

 What gives?


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4c62ee1f-6224-4749-bd28-16e47a200b09%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is java elasticsearch a joke?

2015-03-24 Thread Sai Asuka
I am using elasticsearch java client, indexing a document looks simple.
I know 100% my cluster is running, and my name is create.

Node node  = nodeBuilder().clusterName(mycluster).node();

Client client = node.client();

String json = { +

\user\:\kimchy\, +

\postDate\:\2013-01-30\, +

\message\:\trying out Elasticsearch\ +

};


IndexResponse response = client.prepareIndex(twitter, tweet)

.setSource(json)

.execute()

.actionGet();

I read somewhere that you should shove .client(true), but I get:


org.elasticsearch.discovery.MasterNotDiscoveredException

What gives?

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e6b7b18e-69a5-41c3-a192-27cf37b230fd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Help with mapping update

2015-03-24 Thread David Kleiner
My confusion just leveled-up :) 

So the mapping is updated, all string fields now have .raw copies but I 
don't see them in Kibana.   Perhaps my understanding of this mechanism is 
incomplete and I need to manually copy fields to their .raw counterparts in 
logstash so that ES query returns documents with .raw fields in them..

And I did remove the TTL so my cluster is happy again. 

On Tuesday, March 24, 2015 at 4:36:05 PM UTC-7, David Kleiner wrote:

 I got the index template to work, turns out it's really straightforward. I 
 now do have .raw fields and all the world is green.  

 I'll add the template to my other index so it gets geoip mapping when I 
 roll to April. 

 Cheers,

 David

 [trimmed] 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e09155d6-42fd-4342-a53e-201e3f638038%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Help with mapping update

2015-03-24 Thread David Kleiner
Indeed, it just works in 4!

Thank you again, 

David

On Tuesday, March 24, 2015 at 5:49:16 PM UTC-7, Mark Walkom wrote:

 Are you using KB4 or 3?
 If 4 then you need to refresh the fields under the index settings.

 On 25 March 2015 at 11:06, David Kleiner david@gmail.com wrote:

 My confusion just leveled-up :) 

 So the mapping is updated, all string fields now have .raw copies but I 
 don't see them in Kibana.   Perhaps my understanding of this mechanism is 
 incomplete and I need to manually copy fields to their .raw counterparts in 
 logstash so that ES query returns documents with .raw fields in them..

 And I did remove the TTL so my cluster is happy again. 

 On Tuesday, March 24, 2015 at 4:36:05 PM UTC-7, David Kleiner wrote:

 I got the index template to work, turns out it's really straightforward. 
 I now do have .raw fields and all the world is green.  

 I'll add the template to my other index so it gets geoip mapping when I 
 roll to April. 

 Cheers,

 David

 [trimmed] 

 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/e09155d6-42fd-4342-a53e-201e3f638038%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/e09155d6-42fd-4342-a53e-201e3f638038%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/14929de4-fce5-4a49-aaa3-ebda189632b7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: filter bitsets

2015-03-24 Thread Ashish Mishra
Thanks for the clarification - this was very helpful.


On Friday, March 20, 2015 at 5:45:59 AM UTC-7, Jörg Prante wrote:

 Caching filters are implemented in ES, not in Lucene. E.g. 
 org.elasticsearch.common.lucene.search.CachedFilter is a class that 
 implements cached filters on the base of Lucene filter class.

 The format is not only bitsets. The Lucene filter instance is cached, no 
 matter if it is doc sets or bit sets or whatever. ES code extends Lucene 
 filters by several methods for fast evaluation and traversal.

 ES evaluates the filter in the given filter chain order, from outer to 
 inner (also called top down).

 When a series of boolean filters (i.e. should/must/must_not) is used, they 
 can be evaluated efficiently by composition. See 
 org.elasticsearch.common.lucene.search.XBooleanFilter for the composition 
 algorithm.

 Field data will be loaded when a field is used for operations like filter 
 or sort. The higher the cardinality, the more effort is needed. This is 
 because the index is inverted.

 Jörg


 On Fri, Mar 20, 2015 at 3:30 AM, Ashish Mishra laughin...@gmail.com wrote:

 Not sure I understand the difference between composable vs. cacheable.  
 Can filters be cached without using bitsets?  What format are the results 
 stored in, if not as bitsets?

 In the example below, would the string range field y filter be 
 evaluated on every document in the index, or just on the documents matching 
 the previous field x filter?

 Also, will y field data be loaded for all documents in the index, or 
 just for the documents matching the previous filter.



 On Thursday, March 19, 2015 at 3:21:12 AM UTC-7, Jörg Prante wrote:

 There are several concepts:

 - filter operation (bool, range/geo/script)
 - filter composition (composable or not, composable means bitsets are 
 used)
 - filter caching (ES stores filter results or not, if not cached, ES 
 must walk doc-by-doc to apply filter)

 #1 says you should take care what kind of inner filter the and/or/not 
 filter uses, and then you should arrange filters in the right order to 
 avoid unnecessary complexity
 #2 most of the filters are cacheable, but not by default. These doc try 
 to explain how the and filter consists of inner filter clauses and what 
 is happening because default caching is off. I can not see this is implying 
 bitsets.
 #3 correct interpretation

 The use of bitsets is a pointer for composable filters, these 
 should/must/mustnot filters use an internal Lucene bitset implementation 
 for efficient computation. 

 Jörg


 On Thu, Mar 19, 2015 at 5:58 AM, Ashish Mishra laughin...@gmail.com 
 wrote:

 I'm trying to optimize filter queries for performance and am slightly 
 confused by the online docs.  Looking at:

 1) https://www.elastic.co/blog/all-about-elasticsearch-filter-bitsets
 2) http://www.elastic.co/guide/en/elasticsearch/reference/
 current/query-dsl-and-filter.html
 3) http://www.elastic.co/guide/en/elasticsearch/guide/
 current/_filter_order.html

 #1 says that Bool filter uses bitsets, while And/Or/Not does doc-by-doc 
 matching.
 #2 says that And result is optionally cacheable (implying that it uses 
 bitsets).
 #3 says that Bool does doc-by-doc matching if the inner filters are not 
 cacheable.

 This is confusing, is there a clear guideline on when bitsets are used?

 Let's say I have two high-cardinality fields, x and y.  Field data for 
 y is loaded into memory, while x is not.  What is the optimal way to 
 structure this query?

   filter: {
 and: [
 {
   term: {
 x: F828477AF7,
 _cache: false  // Don't want to cache since query will not be 
 repeated
   }
 },
 {
   range: {
 y: {
 gt: CB70V63BD8AE  // String range query, should only 
 be executed on result of previous filters
 }
   }
 }
 ]
   }

  -- 
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/52dd306b-d229-462b-8b3c-b9cb2fff8c5f%
 40googlegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/52dd306b-d229-462b-8b3c-b9cb2fff8c5f%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/0dbceece-5c74-4867-90df-951f8f0cae8a%40googlegroups.com
  
 

Re: Is java elasticsearch a joke?

2015-03-24 Thread Sai Asuka
Yes, which is why I was baffled. In fact, the code I used was straight out 
of that specific page.
The client code also came straight from the docs. 
(http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/client.html):

So the code I used from the documentation, which appears so basic, does not 
work even though my cluster is running. The docs make it look very simple, so I 
am surprised it does not work out of the box on my local machine with a 
single node (running on OS X, installed using brew, latest ES 1.5; the Java 
Elasticsearch jar is also 1.5):

Node node = nodeBuilder().clusterName(yourclustername).node(); // I 
replaced yourclustername with my cluster name
//Node node = 
nodeBuilder().clusterName(yourclustername).client(true).node(); // this 
does not work either
Client client = node.client();

String json = { +
\user\:\kimchy\, +
\postDate\:\2013-01-30\, +
\message\:\trying out Elasticsearch\ +
};

IndexResponse response = client.prepareIndex(twitter, tweet)
.setSource(json)
.execute()
.actionGet();

What is the problem?





On Wednesday, March 25, 2015 at 12:34:49 AM UTC-4, Mark Walkom wrote:

 I don't mean to be rude, but just because it doesn't work doesn't make it 
 a joke.

 Have you read the docs? 
 http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html

 On 25 March 2015 at 14:44, Sai Asuka asuka...@gmail.com wrote:

 The problem is in either case, I see no new index or anything.

 On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote:

 I am using elasticsearch java client, indexing a document looks simple.
 I know 100% my cluster is running, and my name is create.

 Node node  = nodeBuilder().clusterName(mycluster).node();

 Client client = node.client();

 String json = { +

 \user\:\kimchy\, +

 \postDate\:\2013-01-30\, +

 \message\:\trying out Elasticsearch\ +

 };


 IndexResponse response = client.prepareIndex(twitter, tweet)

 .setSource(json)

 .execute()

 .actionGet();

 I read somewhere that you should shove .client(true), but I get:


 org.elasticsearch.discovery.MasterNotDiscoveredException

 What gives?

  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/4c62ee1f-6224-4749-bd28-16e47a200b09%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/4c62ee1f-6224-4749-bd28-16e47a200b09%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/252076e0-b1d9-4142-b108-72b47f7841dd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re[2]: ES: Ways to work with frequently updated fields of document

2015-03-24 Thread Александр Свиридов
 By different systems do you mean database, ES etc?


Wednesday, 25 March 2015, 8:32 +11:00 from Mark Walkom markwal...@gmail.com:
Best bet is to try to separate hot and cold/warm data so you are only 
updating things which *need* to be updated.
It may make sense to split this out into different systems, but this is really 
up to you.

On 25 March 2015 at 04:28, Александр Свиридов   ooo_satu...@mail.ru  wrote:
I have a forum. And every topic has a field, viewCount - how many times the 
topic has been viewed by forum users. 

I wanted all fields of topics to be taken from ES (id,date,title,content 
and viewCount). However, in this case, after every topic view ES must reindex 
the entire document again - I asked the question about partial update on Stack Overflow 
-  
http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed
 .

It means that if a topic is viewed 1000 times, ES will index it 1000 times. And 
if I have a lot of users, many documents will be indexed again and again. This 
is the first strategy.

The second strategy, as I see it, is to take some fields of the topic from the index 
and some from the database. In this case I take viewCount from the DB. However, then 
I can store all fields in DB and use index only as INDEX - to get ids of 
current topic.

Are there better strategies? What is the best way to solve such a problem?


-- 
Александр Свиридов
-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to  elasticsearch+unsubscr...@googlegroups.com .
To view this discussion on the web visit  
https://groups.google.com/d/msgid/elasticsearch/1427218099.583181766%40f355.i.mail.ru
 .
For more options, visit  https://groups.google.com/d/optout .

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to  elasticsearch+unsubscr...@googlegroups.com .
To view this discussion on the web visit  
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9P6L49jbDpOccA_z-ckKVFjz1o%2BXWWttN%2BYneL0YBfDg%40mail.gmail.com
 .
For more options, visit  https://groups.google.com/d/optout .

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1427261899.358293155%40f317.i.mail.ru.
For more options, visit https://groups.google.com/d/optout.


Re: Is java elasticsearch a joke?

2015-03-24 Thread Mark Walkom
I don't mean to be rude, but just because it doesn't work doesn't make it a
joke.

Have you read the docs?
http://www.elastic.co/guide/en/elasticsearch/client/java-api/current/index.html
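
If the node client cannot join the cluster (multicast discovery not working on
the local machine is a common cause of MasterNotDiscoveredException), a
TransportClient is often the easier way to test -- a minimal sketch for the 1.x
Java API, assuming the node listens on localhost:9300:

import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// connects over the transport port instead of joining the cluster as a node
Settings settings = ImmutableSettings.settingsBuilder()
        .put("cluster.name", "mycluster")   // must match cluster.name exactly
        .build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));

IndexResponse response = client.prepareIndex("twitter", "tweet")
        .setSource(json)                    // same json string as in the snippet above
        .execute()
        .actionGet();
client.close();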

On 25 March 2015 at 14:44, Sai Asuka asuka.s...@gmail.com wrote:

 The problem is in either case, I see no new index or anything.

 On Tuesday, March 24, 2015 at 11:43:49 PM UTC-4, Sai Asuka wrote:

 I am using elasticsearch java client, indexing a document looks simple.
 I know 100% my cluster is running, and my name is create.

 Node node  = nodeBuilder().clusterName(mycluster).node();

 Client client = node.client();

 String json = { +

 \user\:\kimchy\, +

 \postDate\:\2013-01-30\, +

 \message\:\trying out Elasticsearch\ +

 };


 IndexResponse response = client.prepareIndex(twitter, tweet)

 .setSource(json)

 .execute()

 .actionGet();

 I read somewhere that you should shove .client(true), but I get:


 org.elasticsearch.discovery.MasterNotDiscoveredException

 What gives?

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/4c62ee1f-6224-4749-bd28-16e47a200b09%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/4c62ee1f-6224-4749-bd28-16e47a200b09%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9h1na1Y4YHT%2Bug%2BDVEs4k9AstqB4Vx7OQzT9nbeXSVqg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Why _timestamp field is not returned with the format that we have set in the mapping?

2015-03-24 Thread Lee Chuen Ooi
Hi,
I want to make use of the _timestamp field to keep track of the last indexed 
datetime of a doc in Elasticsearch.
I have set the mapping as below:

  mappings: {
 _default_: {
_timestamp: {
   enabled: true,
   store: true,
   format: -MM-dd'T'HH:mm:ss.SSS
}

I have given the _timestamp field a date format (highlighted above). However, 
when I query the doc, the _timestamp field is still returned in unix epoch 
format, as below: 

fields: {
   _timestamp: 1427177681930
}

*Question*
1) I am using Elasticsearch version 1.3.2. Is this a bug? 
2) How can we get the _timestamp field back in the format that we 
want? How can we convert the _timestamp to a readable format? I use Sense to 
query. 


Thanks.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/68f60356-f3de-406b-890c-338e5cc31a64%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


How to merge terms and match_all

2015-03-24 Thread tao hiko
Hi Guys,

I'm trying to merge both statements into one search, as detailed below

*1. query.terms  : I get filter set (A,B,C) from drop down list on 
webpage*

{

  query: {

filtered: {

  query: {

bool: {

  should: [

{

  terms: {

field1.raw: [

A,B,C

]

  }

}

  ]

}

  }

}

  }

}


*2. query.match : I get filter data (D E F) from text box on webpage 
(Universal Search)*

{

  query: {

filtered: {

  query: {

match: {

  _all: D E F

}

  }

}

  }

}


 So I need to take the filter information from both the drop down and the text box 
on the webpage and query data in ES with them at the same time. Is it possible?

If you need more information, please let me know.

Thanks
Hiko
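
For what it's worth, one possible shape (a sketch reusing the field names above)
keeps the universal-search text in the query part and moves the drop-down values
into a terms filter, so both apply at the same time:

{
  "query" : {
    "filtered" : {
      "query" : {
        "match" : { "_all" : "D E F" }
      },
      "filter" : {
        "terms" : { "field1.raw" : [ "A", "B", "C" ] }
      }
    }
  }
}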

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8587e6ca-3035-4b91-9652-a4562902a616%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch Stempel (Polish) Analysis plugin 2.5.0 released

2015-03-24 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Stempel (Polish) 
Analysis plugin, version 2.5.0.

The Stempel (Polish) Analysis plugin integrates Lucene stempel (polish) 
analysis module into elasticsearch.

https://github.com/elastic/elasticsearch-analysis-stempel/

Release Notes - elasticsearch-analysis-stempel - Version 2.5.0



Update:
 * [39] - Update to elasticsearch 1.5.0 
(https://github.com/elastic/elasticsearch-analysis-stempel/issues/39)
 * [36] - Analysis: Fix using PreBuiltXXXFactory 
(https://github.com/elastic/elasticsearch-analysis-stempel/pull/36)
 * [34] - Tests: upgrade randomizedtesting-runner to 2.1.10 
(https://github.com/elastic/elasticsearch-analysis-stempel/issues/34)
 * [33] - Tests: index.version.created must be set 
(https://github.com/elastic/elasticsearch-analysis-stempel/issues/33)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-stempel project repository: 
https://github.com/elastic/elasticsearch-analysis-stempel/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/55119ea2.28e4b40a.744c.07a6SMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch Japanese (kuromoji) Analysis plugin 2.5.0 released

2015-03-24 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Japanese (kuromoji) 
Analysis plugin, version 2.5.0.

The Japanese (kuromoji) Analysis plugin integrates the Lucene Kuromoji analysis 
module into Elasticsearch.

https://github.com/elastic/elasticsearch-analysis-kuromoji/

Release Notes - elasticsearch-analysis-kuromoji - Version 2.5.0



Update:
 * [56] - Update to elasticsearch 1.5.0 
(https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/56)
 * [49] - Tests: upgrade randomizedtesting-runner to 2.1.10 
(https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/49)
 * [47] - Tests: index.version.created must be set 
(https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/47)
 * [34] - Use PreBuiltXXXFactory 
(https://github.com/elastic/elasticsearch-analysis-kuromoji/issues/34)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-kuromoji project repository: 
https://github.com/elastic/elasticsearch-analysis-kuromoji/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5511a528.c807b40a.61ab.7d19SMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Re: Do I have to explicitly exclude the _all field in queries?

2015-03-24 Thread Brian Levine
OK, I think I figured this out. It's the space between Brian and 
Levine. The query:

"query": "Node.author:Brian Levine"

is actually interpreted as Node.author:Brian OR Levine, in which case 
Levine is searched for in the default _all field.  Seems so obvious now!  ;-)
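
Quoting the phrase (or grouping the terms) keeps the whole expression scoped to 
the Node.author field - a sketch, not verified against my data:

{
    "query": {
        "query_string": {
            "query": "Node.author:\"Brian Levine\""
        }
    }
}

The grouped form Node.author:(Brian Levine) also stays on the field, but matches 
either term instead of the exact phrase.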

-b


On Tuesday, March 24, 2015 at 1:50:42 PM UTC-4, Brian Levine wrote:

 Hi all,

 I clearly haven't completely grokked something in how QueryString queries 
 are interpreted.  Consider the following query:

 {
 query: {
 query_string: {
analyze_wildcard: true,
query: Node.author:Brian Levine
 }
 },
 fields: [Node.author],
 explain:true
 }

 Note:  The Node.author field is not_analyzed.

 The results from this query include documents for which the Node.author 
 field contains neither Brian nor Levine.  In examining the 
 explanation, I found that the documents were included because another field 
 in the document contained Levine.  A snippet from the explanation shows 
 that the _all field was considered:

 {
 value: 0.08775233,
 description: weight(_all:levine in 464) [PerFieldSimilarity], 
 result of:,
 ...

 Do I need to explicitly exclude the _all field in the query? 

 Separate question: Because the Node.author field is not_analyzed, I had 
 thought that the value Brian Levine would also not be analyzed and 
 therefore only documents whose Node.author field contained Brian Levine 
 exactly would be matched, yet the explanation shows that the brian and 
 levine tokens were considered. I also noticed that if I change the query 
 to:

 query: Node.author:(Brian Levine)

 then result set changes. Only the documents whose Node.author field 
 contains either brian OR levine are included (which is what I would 
 have expected). According to the explanation, the _all field is not 
 considered in this query.

 So I'm confused.  Clearly, I don't understand how my original query is 
 interpreted.

 Hopefully, someone can enlighten me.

 Thanks.

 -brian



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/63c5fff3-cfd5-4003-89ad-015d8fdb4879%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Search templates: partials

2015-03-24 Thread Glen Smith
Nope. I reached my time limit for trying to get that approach to work and 
finished the job
using one really big template. It's one more thing on my things to explore 
again list.
I figured, though, that by postponing it, I would at least have the benefit 
of using a later version of ES
when I got back to it.
If you're still having issues in 1.4.4, maybe it's time to open an issue.

On Tuesday, March 24, 2015 at 12:46:44 PM UTC-4, Zdeněk Šebl wrote:

 I have the same trouble now on ES 1.4.4.

 Did you find a solution for using partials in Mustache file templates?

 Thanks,
 Zdenek

 Dne čtvrtek 7. srpna 2014 20:53:09 UTC+2 Glen Smith napsal(a):

 Developing on v1.2.2, I'm deploying mustache templates by file system 
 under config/scripts.

 When I do something like this: 

 parent.mustache

 {
     "query": {
         "filtered": {
             "query": {
                 "bool": {
                     "must": [
                         {"match_all": {}}
                         {{#condition}}
                         ,{"match": {"condition": "{{condition}}"}}
                         {{/condition}}
                     ]
                 }
             },
             {{> child}}
         }
     }
 }

 child.mustache

 "filter": {
     "bool": {
         "must": [
             {"query": {"match_all": {}}}
             {{#status}}
             ,{
                 "query": {
                     "match": {
                         "status": "{{status}}"
                     }
                 }
             }
             {{/status}}
         ]
     }
 }

 I get:
 [2014-08-07 14:48:58,454][WARN ][script   ] [x_node] 
 failed to load/compile script [parent]
 org.elasticsearch.common.mustache.MustacheException: Template 'child' not 
 found

 I've tried pathing in parent (scripts/child, config/scripts/child) 
 with the same result.

 Are partials supported?



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/dca89f6c-ca6b-4f2f-8842-708fe822d014%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Search templates: partials

2015-03-24 Thread Zdeněk Šebl
I have the same trouble now on ES 1.4.4.

Did you find a solution for using partials in Mustache file templates?

Thanks,
Zdenek

Dne čtvrtek 7. srpna 2014 20:53:09 UTC+2 Glen Smith napsal(a):

 Developing on v1.2.2, I'm deploying mustache templates by file system 
 under config/scripts.

 When I do something like this: 

 parent.mustache

 {
     "query": {
         "filtered": {
             "query": {
                 "bool": {
                     "must": [
                         {"match_all": {}}
                         {{#condition}}
                         ,{"match": {"condition": "{{condition}}"}}
                         {{/condition}}
                     ]
                 }
             },
             {{> child}}
         }
     }
 }

 child.mustache

 "filter": {
     "bool": {
         "must": [
             {"query": {"match_all": {}}}
             {{#status}}
             ,{
                 "query": {
                     "match": {
                         "status": "{{status}}"
                     }
                 }
             }
             {{/status}}
         ]
     }
 }

 I get:
 [2014-08-07 14:48:58,454][WARN ][script   ] [x_node] 
 failed to load/compile script [parent]
 org.elasticsearch.common.mustache.MustacheException: Template 'child' not 
 found

 I've tried pathing in parent (scripts/child, config/scripts/child) 
 with the same result.

 Are partials supported?



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6a4af502-4232-4c6c-98cf-a4c414fea3d1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch ICU Analysis plugin 2.5.0 released

2015-03-24 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch ICU Analysis 
plugin, version 2.5.0.

The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, 
adding ICU-related analysis components.

https://github.com/elasticsearch/elasticsearch-analysis-icu/

Release Notes - elasticsearch-analysis-icu - Version 2.5.0



Update:
 * [49] - Update to elasticsearch 1.5.0 
(https://github.com/elastic/elasticsearch-analysis-icu/issues/49)
 * [43] - Tests: upgrade randomizedtesting-runner to 2.1.10 
(https://github.com/elastic/elasticsearch-analysis-icu/issues/43)
 * [41] - Tests: index.version.created must be set 
(https://github.com/elastic/elasticsearch-analysis-icu/issues/41)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-icu project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-icu/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5511950b.e662b40a.7ce1.8291SMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


ES: Ways to work with frequently updated fields of document

2015-03-24 Thread Александр Свиридов
I have a forum, and every topic has a field viewCount - how many times the 
topic was viewed by forum users. 

I wanted all fields of a topic to be taken from ES (id, date, title, content 
and viewCount). However, in this case, after every topic view ES must reindex 
the entire document again - I asked a question about partial updates on Stack Overflow: 
http://stackoverflow.com/questions/28937946/partial-update-on-field-that-is-not-indexed

This means that if a topic is viewed 1000 times, ES will reindex it 1000 times, and if 
I have a lot of users, many documents will be reindexed again and again. This is the 
first strategy.
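
For reference, this is the kind of partial update I mean - just a sketch, 
assuming an index/type of forum/topic and a document id of 1:

POST /forum/topic/1/_update
{
  "doc": { "viewCount": 1001 }
}

A scripted increment (ctx._source.viewCount += 1) would avoid sending the 
absolute value, but dynamic scripting has to be enabled for that.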

The second strategy, I think, is to take some fields of a topic from the index and 
some from the database. In this case I take viewCount from the DB. However, then I 
could store all fields in the DB and use the index only as an index - to get the ids 
of the current topics.

Are there better strategies? What is the best way to solve such a problem?


-- 
Александр Свиридов

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1427218099.583181766%40f355.i.mail.ru.
For more options, visit https://groups.google.com/d/optout.


Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread Yogesh Kansal
Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap
memory (I think since Marvel showed memory as 1 GB which corresponds to the
heap, my RAM is 50GB)
I've increased the ES_HEAP_SIZE to 10g. But another problem has appeared
and I'm freaked out because of it!

So, after restarting my ES (curl shutdown) to increase the heap, the Marvel
has stopped showing me my data (it still shows that disk memory is lower so
that means the data is still on disk) and upon searching Sense shows
IndexMissingException[[my_new_twitter_river] missing]

Why is this happening?!?!

On Mon, Mar 23, 2015 at 9:25 PM, mjdu...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the heap?
 If it's 99% of the available heap that's bad and you will have cluster
 instability.  I suggest increasing your JVM heap size if you can, I can't
 find it right now but I remember a blog post that used twitter as a
 benchmark and they also could get to ~50M documents with the default 1G
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter river to
 index tweets. It has been going fine with almost 50M tweets indexed so far
 in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered
 between 10-20%, then started remaining around 30-40% but for the past 3-4
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon as I
 switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is the
 default 990.75MB)

 Thanks
 Yogesh

  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CADM0w%3Dh%3DMgfP%3DDSR0PmhVgXAGcApMDxZE%3D4_jVJXbRwk8CzTHw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread mjdude5
When it restarted did it attach to the wrong data directory?  Take a look 
at _nodes/_local/stats?pretty and check the 'data' directory location.  Has 
the cluster recovered after the restart?  Check _cluster/health?pretty as 
well.

On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap 
 memory (I think since Marvel showed memory as 1 GB which corresponds to the 
 heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has appeared 
 and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the 
 Marvel has stopped showing me my data (it still shows that disk memory is 
 lower so that means the data is still on disk) and upon searching Sense 
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com javascript: wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the heap?  
 If it's 99% of the available heap that's bad and you will have cluster 
 instability.  I suggest increasing your JVM heap size if you can, I can't 
 find it right now but I remember a blog post that used twitter as a 
 benchmark and they also could get to ~50M documents with the default 1G 
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter river 
 to index tweets. It has been going fine with almost 50M tweets indexed so 
 far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered 
 between 10-20%, then started remaining around 30-40% but for the past 3-4 
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon as 
 I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is the 
 default 990.75MB)

 Thanks
 Yogesh

  -- 
 You received this message because you are subscribed to a topic in the 
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit 
 https://groups.google.com/d/topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to 
 elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Search templates: partials

2015-03-24 Thread Zdeněk Šebl
Thanks for info. I am going to open new issue.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/338d2bb4-7bf2-4794-abc1-c6aa4a40ad55%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Kibana: how to show values directly without aggregation?

2015-03-24 Thread Jason
ELK experts, I desperately need your help. I've spent hours and days on 
this simple thing but still can't figure it out.

My setup is Elasticsearch 1.4.4 and Kibana 4.0.1. My program feeds data 
into Elasticsearch. The data contain a timestamp and a value. I want to 
create a bar chart with the timestamp as the X-axis and the value as the Y-axis, 
directly - no aggregation. How can I do that? I'd appreciate any hints. Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7dedd653-c168-4f4f-ac06-30bd3a9b155c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ElasticSearch data directory on Windows 7

2015-03-24 Thread Ravi Gupta
@Mark the link you posted covers Linux paths. I assumed the Windows path would 
be along similar lines and expected the index data to be under the path below:

C:\ES home\data\elasticsearch\nodes\0\indices\idd

The path looks like it contains all the index data, but I am puzzled 
why the size of the data directory is only 77 KB.

There is a file of size 2 KB and it seems to contain some data that looks 
encoded (it appears to be my application data):
C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state

My question is: how can 2.2 GB of data be compressed into 2 KB, unless I am 
looking at the wrong directory?
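
For what it's worth, this is how I am trying to confirm which data path the 
node actually uses - assuming the node stats API reports it, as suggested 
elsewhere on this list:

curl "localhost:9200/_nodes/_local/stats?pretty"

and then checking the 'data' path entries under 'fs' in the response.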




On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote:

 Take a look at 
 http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz

 On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com javascript: 
 wrote:

 Hi ,

 I have downloaded and installed ES on windows 7 and imported 2million+ 
 records into it. (from a csv of size 2.2GB). I can search records using 
 fiddler and it works as expected.

 I was hoping to see size of below directory to increase few GB but the 
 size is only 77 kb.

 C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

 This means ES stores data in some other directory on disk.

 Can you please tell me the directory location where ES stores the data?

 Regards,
 Ravi

  -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/62cc4c9e-2846-4c73-bb75-f683419b0439%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ElasticSearch data directory on Windows 7

2015-03-24 Thread Mark Walkom
Take a look at
http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-dir-layout.html#_zip_and_tar_gz

On 24 March 2015 at 16:34, Ravi Gupta guptarav...@gmail.com wrote:

 Hi ,

 I have downloaded and installed ES on windows 7 and imported 2million+
 records into it. (from a csv of size 2.2GB). I can search records using
 fiddler and it works as expected.

 I was hoping to see size of below directory to increase few GB but the
 size is only 77 kb.

 C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

 This means ES stores data in some other directory on disk.

 Can you please tell me the directory location where ES stores the data?

 Regards,
 Ravi

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_GsyAuw_qQDwHZUY2PevFesfLymY0yPmZ5tQyaenxHzw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


[Spark] SchemaRdd saveToEs produces Bad JSON errors

2015-03-24 Thread Yana Kadiyska
Hi,

I'm a very new user to ES so Im hoping someone can point me to what I'm 
doing wrong.

I did the following:

I downloaded a fresh copy of Spark 1.2.1, switched to the bin folder and ran:

wget https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/2.1.0.BUILD-SNAPSHOT/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar
./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar

import org.apache.spark.sql.SQLContext

case class KeyValue(key: Int, value: String)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext._

sc.parallelize(1 to 50).map(i => KeyValue(i, i.toString))
  .saveAsParquetFile("large.parquet")
parquetFile("large.parquet").registerTempTable("large")

val schemaRDD = sql("SELECT * FROM large")
import org.elasticsearch.spark._

schemaRDD.saveToEs("test/spark")


At this point I get errors like this:

15/03/24 11:02:35 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have 
all completed, from pool
15/03/24 11:02:35 INFO DAGScheduler: Job 2 failed: runJob at EsSpark.scala:51, 
took 0.302236 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in 
stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 
10, localhost): org.apache.spark.util.TaskComplet
ionListenerException: Found unrecoverable error [Bad Request(400) - Invalid 
JSON fragment received[[38,38]][MapperParsingException[failed to parse]; 
nested: ElasticsearchParseException[Failed to derive x
content from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 
123, 125, 125, 10, 91, 51, 56, 44, 34, 51, 56, 34, 93, 10, 123, 34, 105, 110, 
100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 51
, 57, 44, 34, 51, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 
123, 125, 125, 10, 91, 52, 48, 44, 34, 52, 48, 34, 93, 10, 123, 34, 105, 110, 
100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 4
9, 44, 34, 52, 49, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 
125, 125, 10, 91, 52, 50, 44, 34, 52, 50, 34, 93, 10, 123, 34, 105, 110, 100, 
101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 51,
44, 34, 52, 51, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 
125, 10, 91, 52, 52, 44, 34, 52, 52, 34, 93, 10, 123, 34, 105, 110, 100, 101, 
120, 34, 58, 123, 125, 125, 10, 91, 52, 53, 44,
 34, 52, 53, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 
125, 10, 91, 52, 54, 44, 34, 52, 54, 34, 93, 10, 123, 34, 105, 110, 100, 101, 
120, 34, 58, 123, 125, 125, 10, 91, 52, 55, 44, 34
, 52, 55, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 
10, 91, 52, 56, 44, 34, 52, 56, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 
34, 58, 123, 125, 125, 10, 91, 52, 57, 44, 34, 5
2, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 
91, 53, 48, 44, 34, 53, 48, 34, 93, 10]]; ]]; Bailing out..
at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)


Any idea what I'm doing wrong? (I started with the Beta3 jar before trying 
the nightly, also with no luck.) I am running against Elasticsearch 1.4.4:

/spark_1.2.1/bin$ curl localhost:9200
{
  "status" : 200,
  "name" : "Wizard",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c9c63ee8-0683-46b3-b023-de348d34b560%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Using scroll and different results sizes

2015-03-24 Thread Todd Nine
Hey all,
  We use ES as our indexing subsystem. Our canonical record store is 
Cassandra. Due to the denormalization we perform for faster query speed, 
the state of documents in ES can occasionally lag behind the state of our 
Cassandra instance. To accommodate this eventually consistent system, our 
query path is the following:

Query ES (returns scrollId and document ids) -> Load entities from 
Cassandra -> Drop stale documents from the results and asynchronously remove 
them.


If the user requested 10 entities, and only 8 were current, we will be 
missing 2 results and need to make another trip to ES to satisfy the result 
size.  Is it possible to do the following?


//get first page with 10 results
POST /testindex/mytype/_search?scroll=1m
{ "from": 0, "size": 10 }

POST /testing/mytype/_search?scroll_id=<scroll_id from previous response>
{ "size": 2 }


I always seem to get back 10 on my second request.  If no other option is 
available, I'd rather truncate our result set to fewer than the user's 
requested size than take the performance hit of using from and size to drop 
results.  Our document counts can be 50 million+, so I'm concerned about the 
performance implications of not using scroll.


Thoughts?


Thanks,
Todd




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4a6cf1a9-9b96-47c2-a3f6-0955b3e74283%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch with JSON-array, causing serialize -error

2015-03-24 Thread sebastian
same issue here. Any clues?
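
Would disabling that field in the mapping be a way around it? Something like 
this is what I have in mind - untested, and using a hypothetical type name of 
mediacontent:

{
  "mappings": {
    "mediacontent": {
      "properties": {
        "random_point": {
          "type": "object",
          "enabled": false
        }
      }
    }
  }
}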

On Sunday, April 20, 2014 at 2:47:40 PM UTC-3, PyrK wrote:

 I'm using elasticsearch with mongodb -collection using elmongo 
 https://github.com/usesold/elmongo. I have a collection (elasticsearch 
 index's point of view json-array), that contains for example field: 
 random_point: [  0.10007477086037397,  0 ]

 That's most likely the reason I get this error, when trying to index my 
 collection.
 [2014-04-20 16:48:51,228][DEBUG][action.bulk  ] [Emma Frost] [
 mediacontent-2014-04-20t16:48:44.116z][4] failed to execute bulk item (
 index) index {[mediacontent-2014-04$

 org.elasticsearch.index.mapper.MapperParsingException: object mapping [
 random_point] trying to serialize a value with no field associated with it
 , current value [0.1000747708603739$

 at org.elasticsearch.index.mapper.object.ObjectMapper.
 serializeValue(ObjectMapper.java:595)

 at org.elasticsearch.index.mapper.object.ObjectMapper.parse(
 ObjectMapper.java:467)

 at org.elasticsearch.index.mapper.object.ObjectMapper.
 serializeValue(ObjectMapper.java:599)

 at org.elasticsearch.index.mapper.object.ObjectMapper.
 serializeArray(ObjectMapper.java:587)

 at org.elasticsearch.index.mapper.object.ObjectMapper.parse(
 ObjectMapper.java:459)

 at org.elasticsearch.index.mapper.DocumentMapper.parse(
 DocumentMapper.java:506)

 at org.elasticsearch.index.mapper.DocumentMapper.parse(
 DocumentMapper.java:450)

 at org.elasticsearch.index.shard.service.InternalIndexShard.
 prepareIndex(InternalIndexShard.java:327)

 at org.elasticsearch.action.bulk.TransportShardBulkAction.
 shardIndexOperation(TransportShardBulkAction.java:381)

 at org.elasticsearch.action.bulk.TransportShardBulkAction.
 shardOperationOnPrimary(TransportShardBulkAction.java:155)

 at org.elasticsearch.action.support.replication.
 TransportShardReplicationOperationAction$AsyncShardOperationAction.
 performOnPrimary(TransportShardReplicationOperationAction$

 at org.elasticsearch.action.support.replication.
 TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(
 TransportShardReplicationOperationAction.java:430)

 at java.util.concurrent.ThreadPoolExecutor.runWorker(
 ThreadPoolExecutor.java:1146)

 at java.util.concurrent.ThreadPoolExecutor$Worker.run(
 ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:701)

 [2014-04-20 16:48:54,129][INFO ][cluster.metadata ] [Emma Frost] [
 mediacontent-2014-04-20t16:39:09.348z] deleting index


 Is there any ways to bypass this? That array is a needed value in my 
 collection. Is there anyways to give some option in elasticsearch to not to 
 index that JSON-field, tho it's not going to be searchable field at all?


 Best regards,

 PK


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/bfd442a0-ef52-4ae9-be6c-5e98475cbff4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread Yogesh Kansal
Thanks. I will try starting it again.
By the way, what would be the impact if I just delete node/1 (then set
node.max_local_storage_nodes to 1 and start Elasticsearch)?
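
To be explicit, this is the elasticsearch.yml change I have in mind - only a
sketch (the path.data line is a placeholder showing where I would pin the data
directory, if that is supported):

node.max_local_storage_nodes: 1
path.data: /path/to/elasticsearch/data/tool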

On Wed, Mar 25, 2015 at 1:36 AM, mjdu...@gmail.com wrote:

 As long as it's the first ES instance starting on that node it'll grab 0
 instead of 1.  I don't know if you can explicitly set the data node
 directory in the config.

 On Tuesday, March 24, 2015 at 2:16:57 PM UTC-4, Yogesh wrote:

 Thanks. Mine is a one node cluster so I am simply using the XCURL to shut
 it down and the doing bin/elasticsearch -d to start it up.
 To check if it has shutdown, I try to hit it using http. So, now how do I
 start it with the node/0 data directory?
 There is nothing there in node/1 data directory but I don't suppose
 deleting it would be the solution? (Sorry for the basic questions I am new
 to this!)

 On Tue, Mar 24, 2015 at 11:32 PM, mjd...@gmail.com wrote:

 Sometimes that happens when the new node starts up via monit or other
 automated thing before the old node is fully shutdown.  I'd suggest
 shutting down the node and verify it's done via ps before allowing the new
 node to start up.  In the case of monit if the check is hitting the http
 port then it'll think it's down before it actually fully quits.

 On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote:

 Thanks a lot mjdude! It does seem like it attached to the wrong data
 directory.
 In elasticsearch/data/tool/nodes there are two 0 and 1. My data is in 0
 but node stats shows the data directory as elasticsearch/data/tool/nodes/
 1.
 Now, how do I change this?

 On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com wrote:

 When it restarted did it attach to the wrong data directory?  Take a
 look at _nodes/_local/stats?pretty and check the 'data' directory
 location.  Has the cluster recovered after the restart?  Check
 _cluster/health?pretty as well.

 On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the
 heap memory (I think since Marvel showed memory as 1 GB which corresponds
 to the heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has
 appeared and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the
 Marvel has stopped showing me my data (it still shows that disk memory is
 lower so that means the data is still on disk) and upon searching Sense
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the
 heap?  If it's 99% of the available heap that's bad and you will have
 cluster instability.  I suggest increasing your JVM heap size if you 
 can, I
 can't find it right now but I remember a blog post that used twitter as 
 a
 benchmark and they also could get to ~50M documents with the default 1G
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter
 river to index tweets. It has been going fine with almost 50M tweets
 indexed so far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel)
 hovered between 10-20%, then started remaining around 30-40% but for 
 the
 past 3-4 days it has continuously been above 90%, reaching 99% at 
 times!
 I restarted elasticsearch thinking it might get resolved but as
 soon as I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is
 the default 990.75MB)

 Thanks
 Yogesh

  --
 You received this message because you are subscribed to a topic in
 the Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/to
 pic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa2
 0-4bba-881f-3f378985c8c9%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/to
 pic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40goo
 glegroups.com
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 

Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread mjdude5
As long as it's the first ES instance starting on that node it'll grab 0 
instead of 1.  I don't know if you can explicitly set the data node 
directory in the config.

On Tuesday, March 24, 2015 at 2:16:57 PM UTC-4, Yogesh wrote:

 Thanks. Mine is a one node cluster so I am simply using the XCURL to shut 
 it down and the doing bin/elasticsearch -d to start it up.
 To check if it has shutdown, I try to hit it using http. So, now how do I 
 start it with the node/0 data directory?
 There is nothing there in node/1 data directory but I don't suppose 
 deleting it would be the solution? (Sorry for the basic questions I am new 
 to this!)

 On Tue, Mar 24, 2015 at 11:32 PM, mjd...@gmail.com javascript: wrote:

 Sometimes that happens when the new node starts up via monit or other 
 automated thing before the old node is fully shutdown.  I'd suggest 
 shutting down the node and verify it's done via ps before allowing the new 
 node to start up.  In the case of monit if the check is hitting the http 
 port then it'll think it's down before it actually fully quits.

 On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote:

 Thanks a lot mjdude! It does seem like it attached to the wrong data 
 directory.
 In elasticsearch/data/tool/nodes there are two 0 and 1. My data is in 0 
 but node stats shows the data directory as elasticsearch/data/tool/nodes/
 1.
 Now, how do I change this?

 On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com wrote:

 When it restarted did it attach to the wrong data directory?  Take a 
 look at _nodes/_local/stats?pretty and check the 'data' directory 
 location.  Has the cluster recovered after the restart?  Check 
 _cluster/health?pretty as well.

 On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the 
 heap memory (I think since Marvel showed memory as 1 GB which corresponds 
 to the heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has 
 appeared and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the 
 Marvel has stopped showing me my data (it still shows that disk memory is 
 lower so that means the data is still on disk) and upon searching Sense 
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the 
 heap?  If it's 99% of the available heap that's bad and you will have 
 cluster instability.  I suggest increasing your JVM heap size if you 
 can, I 
 can't find it right now but I remember a blog post that used twitter as 
 a 
 benchmark and they also could get to ~50M documents with the default 1G 
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter 
 river to index tweets. It has been going fine with almost 50M tweets 
 indexed so far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered 
 between 10-20%, then started remaining around 30-40% but for the past 
 3-4 
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon 
 as I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is 
 the default 990.75MB)

 Thanks
 Yogesh

  -- 
 You received this message because you are subscribed to a topic in 
 the Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/to
 pic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to 
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40goo
 glegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are subscribed to a topic in the 
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/
 topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to 
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%
 40googlegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are subscribed to a topic in the 
 Google Groups elasticsearch group.
 To 

Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread Yogesh Kansal
Thanks a lot mjdude! It does seem like it attached to the wrong data
directory.
In elasticsearch/data/tool/nodes there are two 0 and 1. My data is in 0 but
node stats shows the data directory as elasticsearch/data/tool/nodes/1.
Now, how do I change this?

On Tue, Mar 24, 2015 at 11:02 PM, mjdu...@gmail.com wrote:

 When it restarted did it attach to the wrong data directory?  Take a look
 at _nodes/_local/stats?pretty and check the 'data' directory location.  Has
 the cluster recovered after the restart?  Check _cluster/health?pretty as
 well.

 On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap
 memory (I think since Marvel showed memory as 1 GB which corresponds to the
 heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has appeared
 and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the
 Marvel has stopped showing me my data (it still shows that disk memory is
 lower so that means the data is still on disk) and upon searching Sense
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the
 heap?  If it's 99% of the available heap that's bad and you will have
 cluster instability.  I suggest increasing your JVM heap size if you can, I
 can't find it right now but I remember a blog post that used twitter as a
 benchmark and they also could get to ~50M documents with the default 1G
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter river
 to index tweets. It has been going fine with almost 50M tweets indexed so
 far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered
 between 10-20%, then started remaining around 30-40% but for the past 3-4
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon as
 I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is the
 default 990.75MB)

 Thanks
 Yogesh

  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/
 topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%
 40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CADM0w%3DhB3Qxs2a01yLtcLPaKUodOh2jimniow76fPLZ2oz12aQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: elasticsearch shield realm problem

2015-03-24 Thread Jay Modi
Phani,

We just released Shield 1.1 and 1.2 
(https://www.elastic.co/blog/shield-1-1-and-1-2-released). LDAP user search 
is included and may be worth trying out. If you were to use it, I think 
your configuration would look something like:

shield:
  authc:
realms:
  ldap1:
type: ldap
order: 0
url: ldap://ldapserver:389;
bind_dn: cn=Manager,dc=test,dc=org
bind_password: changeme
user_search:
  base_dn: ou=People,dc=test,dc=org
group_search:
  base_dn: dc=test,dc=org

This assumes the cn=Manager,dc=test,dc=org is a user with search 
credentials on the ldap. The earlier questions I had about groups would 
still apply

On Monday, March 23, 2015 at 6:08:48 PM UTC-4, Jay Modi wrote:

 Since you are using uid, your setup would look something like this

 shield:
   authc:
 realms:
   ldap1:
 type: ldap
 order: 0
 url: ldap://ldapserver:389;
 user_dn_templates:
   - uid={0}, ou=People,dc=test,dc=org

 This assumes all users are directly in the People OU. If that is not the 
 case, you'll have to update the template or add additional templates. Can 
 you tell me a little more about how the groups are setup in your ldap? What 
 is their objectClass and do they have the member, unqiueMember, or 
 memberUid attribute? You will probably need to configure the group search 
 and that additional information will be necessary to ensure it works.

 Also to help with debugging, it is helpful to set shield.authc: DEBUG in 
 the logging.yml file

 On Monday, March 23, 2015 at 2:43:29 AM UTC-4, phani.n...@goktree.com 
 wrote:

 Hi Jay,

   sorry for late reply . I am using openldap server .i followed the 
 configurations given by es people i did like in example but i am not able 
 to login with ldap credentials.is ldap in elastic search is mount ldap 
 or it will import users in to the file?
   i have tried following link 
   
 http://www.elastic.co/guide/en/shield/current/ldap.html . but i 
 didn't get proper result i have the following configurations to my LDAP 
 server.please find the following.

Principal : cn=Manager,dc=test,dc=org
 Base DN : ou=People,dc=test,dc=org

 filter : uid=%s
  
 the above are my ldap configuration details please suggest me 
 how can we achieve with above credentials my using above link (
 http://www.elastic.co/guide/en/shield/current/ldap.html ) 

 Thanks,
 phani


 On Wednesday, March 18, 2015 at 8:05:37 PM UTC+5:30, Jay Modi wrote:

 What type of LDAP server are you integrating with? We have some 
 documentation for LDAP setup, 
 http://www.elastic.co/guide/en/shield/current/ldap.html.

 If you are using Active Directory, there is a specific realm for it that 
 abstracts some of the LDAP setup to make it simpler: 
 http://www.elastic.co/guide/en/shield/current/active_directory.html

 On Wednesday, March 18, 2015 at 9:12:27 AM UTC-4, phani.n...@goktree.com 
 wrote:

 Thank you Jay for quick reply yes it got worked I changed the path to 
 es_home config.now authentication is performing fine next I am looking in 
 to LDAP integration with elastic search can you suggest me steps how can 
 we 
 integrate ldap to elasticsearch.


 Thanks
 phani.

 On Wednesday, March 18, 2015 at 6:20:29 PM UTC+5:30, Jay Modi wrote:

 Hi Phani,

 I think the correct thing to do is:

 export ES_JAVA_OPTS=-Des.path.conf=/etc/elasticsearch
 bin/shield/esusers useradd es_admin -r admin

 Verify that /etc/elasticsearch/shield/users exists and contains an 
 entry for the admin user. Once you have confirmed that, then try to 
 authenticate. 

 The issue with steps you have taken is that your elasticsearch 
 instance is looking for configuration in /etc/elasticsearch and the 
 configuration for Shield is in ES_HOME by default. The packaged versions 
 of 
 elasticsearch expect all configuration (including that for plugins) to be 
 in /etc/elasticsearch. We're looking at how we can make this easier.

 On Wednesday, March 18, 2015 at 5:33:36 AM UTC-4, 
 phani.n...@goktree.com wrote:

 HI Jay,

   Thank you for the reply i tried the following steps.

i did .rpm installation in linux servers my configuration file 
 located at /etc/elasticsearch (main es coniguration file)

   But when i install shied i see there is a configurations directory 
 created inside ES_HOME(/usr/share/elasticsearch/config) 

   I issued following command to add path :export 
 ES_JAVA_OPTS=-Des.path.conf=/usr/share/elasticsearch/config

 i am able to create user but when i try to authenticate it is 
 not validating even though we added the path. please suggest me if i am 
 doing wrong here?

  
  

 On Monday, March 16, 2015 at 10:12:00 PM UTC+5:30, Jay Modi wrote:

 Hi Phani,

 How did you install elasticsearch and where is your elasticsearch 
 configuration located? If you have used a RPM or DEB package, you will 
 need 
 to add an environment variable before running the 

Cluster state storage question

2015-03-24 Thread Robert Gardam
Hi,
I am starting to look at the size of my cluster state, and I have noticed that 
the shard information is duplicated. 

One grouping seems to be from the view of the index, and the other shows 
which shards live on which host.

I'm sure there's a logical reason for this; I'm just interested to know why.
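
For reference, just a sketch of the filtered call I am using to pull the 
per-index view on its own (the per-node grouping shows up as routing_nodes in 
the full /_cluster/state response):

curl "localhost:9200/_cluster/state/routing_table?pretty"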

This probably also compresses pretty well.

Cheers,
Rob




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/43907b97-4ef9-4950-87a0-20edd6be663a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch Phonetic Analysis plugin 2.5.0 released

2015-03-24 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Phonetic Analysis 
plugin, version 2.5.0.

The Phonetic Analysis plugin integrates phonetic token filter analysis with 
Elasticsearch.

https://github.com/elastic/elasticsearch-analysis-phonetic/

Release Notes - elasticsearch-analysis-phonetic - Version 2.5.0



Update:
 * [39] - Update to elasticsearch 1.5.0 
(https://github.com/elastic/elasticsearch-analysis-phonetic/issues/39)
 * [34] - Tests: upgrade randomizedtesting-runner to 2.1.10 
(https://github.com/elastic/elasticsearch-analysis-phonetic/issues/34)
 * [33] - Tests: index.version.created must be set 
(https://github.com/elastic/elasticsearch-analysis-phonetic/issues/33)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-phonetic project repository: 
https://github.com/elastic/elasticsearch-analysis-phonetic/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5511a183.4ad3b40a.1a8a.0a88SMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Do I have to explicitly exclude the _all field in queries?

2015-03-24 Thread Brian Levine
Hi all,

I clearly haven't completely grokked something in how QueryString queries 
are interpreted.  Consider the following query:

{
    "query": {
        "query_string": {
            "analyze_wildcard": true,
            "query": "Node.author:Brian Levine"
        }
    },
    "fields": ["Node.author"],
    "explain": true
}

Note:  The Node.author field is not_analyzed.

The results from this query include documents for which the Node.author 
field contains neither Brian nor Levine.  In examining the 
explanation, I found that the documents were included because another field 
in the document contained Levine.  A snippet from the explanation shows 
that the _all field was considered:

{
    "value": 0.08775233,
    "description": "weight(_all:levine in 464) [PerFieldSimilarity], result of:",
...

Do I need to explicitly exclude the _all field in the query? 

Separate question: Because the Node.author field is not_analyzed, I had 
thought that the value Brian Levine would also not be analyzed and 
therefore only documents whose Node.author field contained Brian Levine 
exactly would be matched, yet the explanation shows that the brian and 
levine tokens were considered. I also noticed that if I change the query 
to:

query: Node.author:(Brian Levine)

then result set changes. Only the documents whose Node.author field 
contains either brian OR levine are included (which is what I would 
have expected). According to the explanation, the _all field is not 
considered in this query.

So I'm confused.  Clearly, I don't understand how my original query is 
interpreted.

Hopefully, someone can enlighten me.

Thanks.

-brian

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/120103bc-a60d-418a-b092-09f71732b682%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Spark] SchemaRdd saveToEs produces Bad JSON errors

2015-03-24 Thread Costin Leau

Hi,

It appears the problem lies with what type of document you are trying to save 
to ES - which is basically invalid.
For some reason the table/rdd schema gets lost and the content is serialized 
without it:

|[38,38]|

I'm not sure why this happens, hence I raised an issue [1].

Thanks,

[1] https://github.com/elastic/elasticsearch-hadoop/issues/403


On 3/24/15 5:11 PM, Yana Kadiyska wrote:

Hi,

I'm a very new user to ES so Im hoping someone can point me to what I'm doing 
wrong.

I did the following:

download a fresh copy of Spark1.2.1. Switch to bin folder and ran:

|wget 
https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/2.1.0.BUILD-SNAPSHOT/elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar
  ./spark-shell --jars elasticsearch-hadoop-2.1.0.BUILD-20150324.023417-341.jar

import org.apache.spark.sql.SQLContext

case class KeyValue(key: Int, value: String)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext._

sc.parallelize(1 to 50).map(i => KeyValue(i, i.toString))
 .saveAsParquetFile("large.parquet")
parquetFile("large.parquet").registerTempTable("large")

val schemaRDD = sql("SELECT * FROM large")
import org.elasticsearch.spark._

schemaRDD.saveToEs("test/spark")
|
​


At this point I get errors like this:

|15/03/24 11:02:35 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks 
have all completed, from pool
15/03/24 11:02:35 INFO DAGScheduler: Job 2 failed: runJob at EsSpark.scala:51, 
took 0.302236 s
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in 
stage 2.0 failed 1 times, most recent failure: Lost task 2.0 in stage 2.0 (TID 
10, localhost): org.apache.spark.util.TaskComplet
ionListenerException: Found unrecoverable error [Bad Request(400) - Invalid JSON fragment 
received[[38,38]][MapperParsingException[failed to parse]; nested: 
ElasticsearchParseException[Failed to derive x
content from (offset=13, length=9): [123, 34, 105, 110, 100, 101, 120, 34, 58, 
123, 125, 125, 10, 91, 51, 56, 44, 34, 51, 56, 34, 93, 10, 123, 34, 105, 110, 
100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 51
, 57, 44, 34, 51, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 
123, 125, 125, 10, 91, 52, 48, 44, 34, 52, 48, 34, 93, 10, 123, 34, 105, 110, 
100, 101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 4
9, 44, 34, 52, 49, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 
125, 125, 10, 91, 52, 50, 44, 34, 52, 50, 34, 93, 10, 123, 34, 105, 110, 100, 
101, 120, 34, 58, 123, 125, 125, 10, 91, 52, 51,
44, 34, 52, 51, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 
125, 10, 91, 52, 52, 44, 34, 52, 52, 34, 93, 10, 123, 34, 105, 110, 100, 101, 
120, 34, 58, 123, 125, 125, 10, 91, 52, 53, 44,
  34, 52, 53, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 
125, 10, 91, 52, 54, 44, 34, 52, 54, 34, 93, 10, 123, 34, 105, 110, 100, 101, 
120, 34, 58, 123, 125, 125, 10, 91, 52, 55, 44, 34
, 52, 55, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 
10, 91, 52, 56, 44, 34, 52, 56, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 
34, 58, 123, 125, 125, 10, 91, 52, 57, 44, 34, 5
2, 57, 34, 93, 10, 123, 34, 105, 110, 100, 101, 120, 34, 58, 123, 125, 125, 10, 
91, 53, 48, 44, 34, 53, 48, 34, 93, 10]]; ]]; Bailing out..
 at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:76)
|
​

Any idea what I'm doing wrong? (I started with the Beta3 jar before trying the nightly, but also with no luck.) I am 
running against Elasticsearch 1.4.4:


|/spark_1.2.1/bin$ curl localhost:9200
{
   status : 200,
   name : Wizard,
   cluster_name : elasticsearch,
   version : {
 number : 1.4.4,
 build_hash : c88f77ffc81301dfa9dfd81ca2232f09588bd512,
 build_timestamp : 2015-02-19T13:05:36Z,
 build_snapshot : false,
 lucene_version : 4.10.3
   },
   tagline : You Know, for Search
}
|
​

--
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email to 
elasticsearch+unsubscr...@googlegroups.com mailto:elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c9c63ee8-0683-46b3-b023-de348d34b560%40googlegroups.com 
https://groups.google.com/d/msgid/elasticsearch/c9c63ee8-0683-46b3-b023-de348d34b560%40googlegroups.com?utm_medium=emailutm_source=footer.

For more options, visit https://groups.google.com/d/optout.



--
Costin

--
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5511A5B1.4070907%40gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread mjdude5
Sometimes that happens when the new node starts up via monit or some other 
automated process before the old node has fully shut down.  I'd suggest 
shutting down the node and verifying it's done via ps before allowing the new 
node to start up.  In the case of monit, if the check is hitting the HTTP 
port, it'll think the node is down before it has actually fully quit.
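
A minimal sketch of that check sequence on a single local node (default port 
assumed; the yml setting at the end is an optional guard against a second 
nodes/1 directory ever being created):

curl -XPOST "localhost:9200/_cluster/nodes/_local/_shutdown"   # ask the node to stop (ES 1.x API)
ps aux | grep -i [e]lasticsearch                               # confirm the JVM has really exited
bin/elasticsearch -d                                           # start it again
curl "localhost:9200/_nodes/_local/stats/fs?pretty"            # fs.data[].path shows which nodes/N directory is live

# elasticsearch.yml
node.max_local_storage_nodes: 1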

On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote:

 Thanks a lot mjdude! It does seem like it attached to the wrong data 
 directory.
 In elasticsearch/data/tool/nodes there are two 0 and 1. My data is in 0 
 but node stats shows the data directory as elasticsearch/data/tool/nodes/1.
 Now, how do I change this?

 On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com javascript: wrote:

 When it restarted did it attach to the wrong data directory?  Take a look 
 at _nodes/_local/stats?pretty and check the 'data' directory location.  Has 
 the cluster recovered after the restart?  Check _cluster/health?pretty as 
 well.

 On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap 
 memory (I think since Marvel showed memory as 1 GB which corresponds to the 
 heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has appeared 
 and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the 
 Marvel has stopped showing me my data (it still shows that disk memory is 
 lower so that means the data is still on disk) and upon searching Sense 
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the 
 heap?  If it's 99% of the available heap that's bad and you will have 
 cluster instability.  I suggest increasing your JVM heap size if you can, 
 I 
 can't find it right now but I remember a blog post that used twitter as a 
 benchmark and they also could get to ~50M documents with the default 1G 
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter river 
 to index tweets. It has been going fine with almost 50M tweets indexed so 
 far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered 
 between 10-20%, then started remaining around 30-40% but for the past 3-4 
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon 
 as I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is the 
 default 990.75MB)

 Thanks
 Yogesh

  -- 
 You received this message because you are subscribed to a topic in the 
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/
 topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to 
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%
 40googlegroups.com 
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  -- 
 You received this message because you are subscribed to a topic in the 
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit 
 https://groups.google.com/d/topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to 
 elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a3ca6aa0-f3d9-4067-933c-1a2eb0a15075%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch Smart Chinese Analysis plugin 2.5.0 released

2015-03-24 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Smart Chinese 
Analysis plugin, version 2.5.0.

The Smart Chinese Analysis plugin integrates the Lucene Smart Chinese analysis 
module into Elasticsearch.

https://github.com/elastic/elasticsearch-analysis-smartcn/

Release Notes - elasticsearch-analysis-smartcn - Version 2.5.0



Update:
 * [36] - Update to elasticsearch 1.5.0 
(https://github.com/elastic/elasticsearch-analysis-smartcn/issues/36)
 * [32] - Tests: upgrade randomizedtesting-runner to 2.1.10 
(https://github.com/elastic/elasticsearch-analysis-smartcn/issues/32)
 * [31] - Tests: index.version.created must be set 
(https://github.com/elastic/elasticsearch-analysis-smartcn/issues/31)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-smartcn project repository: 
https://github.com/elastic/elasticsearch-analysis-smartcn/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/5511a68b.655cb40a.704e.0a7dSMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Re: ES JVM memory usage consistently above 90%

2015-03-24 Thread Yogesh Kansal
Thanks. Mine is a one-node cluster so I am simply using curl (the shutdown API)
to shut it down and then doing bin/elasticsearch -d to start it up.
To check if it has shut down, I try to hit it over HTTP. So, now how do I
start it with the node/0 data directory?
There is nothing in the node/1 data directory, but I don't suppose
deleting it would be the solution? (Sorry for the basic questions, I am new
to this!)

On Tue, Mar 24, 2015 at 11:32 PM, mjdu...@gmail.com wrote:

 Sometimes that happens when the new node starts up via monit or other
 automated thing before the old node is fully shutdown.  I'd suggest
 shutting down the node and verify it's done via ps before allowing the new
 node to start up.  In the case of monit if the check is hitting the http
 port then it'll think it's down before it actually fully quits.

 On Tuesday, March 24, 2015 at 1:56:54 PM UTC-4, Yogesh wrote:

 Thanks a lot mjdude! It does seem like it attached to the wrong data
 directory.
 In elasticsearch/data/tool/nodes there are two 0 and 1. My data is in 0
 but node stats shows the data directory as elasticsearch/data/tool/nodes/
 1.
 Now, how do I change this?

 On Tue, Mar 24, 2015 at 11:02 PM, mjd...@gmail.com wrote:

 When it restarted did it attach to the wrong data directory?  Take a
 look at _nodes/_local/stats?pretty and check the 'data' directory
 location.  Has the cluster recovered after the restart?  Check
 _cluster/health?pretty as well.

 On Tuesday, March 24, 2015 at 1:01:52 PM UTC-4, Yogesh wrote:

 Thanks Joel and mjdude. What I mean is that ES is using 99% of the heap
 memory (I think since Marvel showed memory as 1 GB which corresponds to the
 heap, my RAM is 50GB)
 I've increased the ES_HEAP_SIZE to 10g. But another problem has
 appeared and I'm freaked out because of it!

 So, after restarting my ES (curl shutdown) to increase the heap, the
 Marvel has stopped showing me my data (it still shows that disk memory is
 lower so that means the data is still on disk) and upon searching Sense
 shows IndexMissingException[[my_new_twitter_river] missing]

 Why is this happening?!?!

 On Mon, Mar 23, 2015 at 9:25 PM, mjd...@gmail.com wrote:

 Are you saying JVM is using 99% of the system memory or 99% of the
 heap?  If it's 99% of the available heap that's bad and you will have
 cluster instability.  I suggest increasing your JVM heap size if you can, 
 I
 can't find it right now but I remember a blog post that used twitter as a
 benchmark and they also could get to ~50M documents with the default 1G
 heap.

 On Sunday, March 22, 2015 at 3:30:57 AM UTC-4, Yogesh wrote:

 Hi,

 I have set up elasticsearch on one node and am using the Twitter
 river to index tweets. It has been going fine with almost 50M tweets
 indexed so far in 13 days.
 When I started indexing, the JVM usage (observed via Marvel) hovered
 between 10-20%, then started remaining around 30-40% but for the past 3-4
 days it has continuously been above 90%, reaching 99% at times!
 I restarted elasticsearch thinking it might get resolved but as soon
 as I switched it back on, the JVM usage went back to 90%.

 Why is this happening and how can I remedy it? (The JVM memory is the
 default 990.75MB)

 Thanks
 Yogesh

  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/to
 pic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40goo
 glegroups.com
 https://groups.google.com/d/msgid/elasticsearch/0f990291-fa20-4bba-881f-3f378985c8c9%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit https://groups.google.com/d/
 topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%
 40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/057e53f7-3207-4df9-bbc4-80cea03e189f%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/kMSwBqpe2N4/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 

Elasticsearch on Mesos / Marathon using Docker container

2015-03-24 Thread domen . pogacnik
Hi all, 

Is anybody here running Elasticsearch on Mesos / Marathon using Docker 
containers? I’m wondering how much memory I should allocate to an 
Elasticsearch Marathon task that is the only task running on the host, e.g. 
all available memory as reported by Mesos or some smaller amount to leave 
something for the system? Please note that I’m not asking about the heap 
size but about the memory allocated to the Marathon task.
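
For what it's worth, the usual Elasticsearch guidance is to keep the heap at 
no more than about half of whatever memory the process can actually use, and 
leave the rest for the OS page cache. A hedged sketch of how that might look 
in a Marathon app definition (all numbers, the image tag and the id are 
illustrative):

{
  "id": "/elasticsearch",
  "cpus": 2,
  "mem": 4096,
  "instances": 1,
  "env": { "ES_HEAP_SIZE": "2g" },
  "container": {
    "type": "DOCKER",
    "docker": { "image": "elasticsearch:1.5", "network": "HOST" }
  }
}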

Any suggestions?

Best,
Domen

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/dc96a97a-1724-4390-af4e-362529cecd5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


New to Elasticsearch and Kibana environment, how to link Elasticsearch to Kibana - Error

2015-03-24 Thread shmaim humaid
Dear All, 

I do not know how to link my Elasticsearch indices to the index pattern in Kibana.


I have Elasticsearch running with two indices.

This is the error message I am getting:

null is not an object (evaluating 'index.timeField.name')

http://109.123.216.106/index.js?_b=5888:127043:55
wrappedCallback@http://109.123.216.106/index.js?_b=5888:20888:81
http://109.123.216.106/index.js?_b=5888:20974:34
$eval@http://109.123.216.106/index.js?_b=5888:22017:28
$digest@http://109.123.216.106/index.js?_b=5888:21829:36
$apply@http://109.123.216.106/index.js?_b=5888:22121:31
done@http://109.123.216.106/index.js?_b=5888:17656:51
completeRequest@http://109.123.216.106/index.js?_b=5888:17870:15
onreadystatechange@http://109.123.216.106/index.js?_b=5888:17809:26



Any solution?
This is my first day to work with Kibana 
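
A minimal first check, assuming Kibana points at the default 
http://localhost:9200 (the index name below is a placeholder): list the 
indices and confirm the one behind the index pattern actually has a 
date-typed field to use as the time filter.

curl "localhost:9200/_cat/indices?v"
curl "localhost:9200/YOUR_INDEX/_mapping?pretty"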



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/dc4718c5-fc3b-40ce-b37d-cec99685304c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ElasticSearch data directory on Windows 7

2015-03-24 Thread Mark Walkom
Actually, as Windows users don't have an installer they need to use the zip,
so the paths in the link are valid.
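
If in doubt, the node itself reports where it is writing. A quick sketch, 
assuming the default port; with the zip install and default settings the 
index files end up under <ES home>\data\<cluster name>\nodes\0\indices\<index>:

curl "localhost:9200/_nodes/_local/stats/fs?pretty"    # fs.data[].path lists the live data directories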

On 24 March 2015 at 17:38, Ravi Gupta guptarav...@gmail.com wrote:

 @Mark the link you posted is for linux paths. I assumed windows path will
 be on similar lines and expected index data to be on below path:

 C:\ES home\data\elasticsearch\nodes\0\indices\idd

 I see the path looks like it contains all the index data but I am puzzled
 why the size of data directory is only 77 KB.

 There is a file of size 2 KB and it seems to contain some data that looks
 encoded (it appears my application data)
 C:\ES home\data\elasticsearch\nodes\0\indices\idd\_state

 My question is how 2.2 GB of data can be compressed into 2 KB, unless I am
 looking at the wrong directory.




 On Tuesday, 24 March 2015 17:28:52 UTC+11, Mark Walkom wrote:

 Take a look at http://www.elastic.co/guide/en/elasticsearch/reference/
 current/setup-dir-layout.html#_zip_and_tar_gz

 On 24 March 2015 at 16:34, Ravi Gupta gupta...@gmail.com wrote:

 Hi ,

 I have downloaded and installed ES on windows 7 and imported 2million+
 records into it. (from a csv of size 2.2GB). I can search records using
 fiddler and it works as expected.

 I was hoping to see size of below directory to increase few GB but the
 size is only 77 kb.

 C:\temp\elasticsearch-1.4.4\elasticsearch-1.4.4\data

 This means ES stores data in some other directory on disk.

 Can you please tell me the directory location where ES stores the data?

 Regards,
 Ravi

  --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%
 40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/ba4ca28a-fb16-48a2-a636-1702ae89ce80%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/62cc4c9e-2846-4c73-bb75-f683419b0439%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/62cc4c9e-2846-4c73-bb75-f683419b0439%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X-zDPEGDaW2x6dKtVPUkagG%2BvhwpEmn8aKc_KScasmuNg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Error: Configure an index pattern

2015-03-24 Thread shmaim humaid
Dear all, 

I do not know how to link my Elasticsearch indices to the index pattern in Kibana.


I have Elasticsearch running with two indices.

This is the error message I am getting:

null is not an object (evaluating 'index.timeField.name')

http://109.123.216.106/index.js?_b=5888:127043:55
wrappedCallback@http://109.123.216.106/index.js?_b=5888:20888:81
http://109.123.216.106/index.js?_b=5888:20974:34
$eval@http://109.123.216.106/index.js?_b=5888:22017:28
$digest@http://109.123.216.106/index.js?_b=5888:21829:36
$apply@http://109.123.216.106/index.js?_b=5888:22121:31
done@http://109.123.216.106/index.js?_b=5888:17656:51
completeRequest@http://109.123.216.106/index.js?_b=5888:17870:15
onreadystatechange@http://109.123.216.106/index.js?_b=5888:17809:26



Any solution?
This is my first day to work with Kibana 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c68ddcef-4b1a-45f7-8511-e6525701e9d5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [Kibana] Possibility to hide _source field from Discover detail view?

2015-03-24 Thread Görge Albrecht
Thanks for the fast reply.
I created https://github.com/elastic/kibana/issues/3429.

On Monday, March 23, 2015 at 8:47:21 PM UTC+1, Mark Walkom wrote:

 There is not, if you want that it'd definitely be worth raising as a 
 request on github though!

 On 23 March 2015 at 21:40, Görge Albrecht goerge@gmail.com 
 javascript: wrote:

 Hi All,

 Is there any way in Kibana 4 to hide the _source field from the detail 
 view on the Discover table?
 As all fields are already shown in the detail view, the additional 
 _source field seems to increase the visual noise without adding any value.

 Thanks in advance,
 Görge

 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/715cdb0b-a98b-427a-abcb-1231069be6f7%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/715cdb0b-a98b-427a-abcb-1231069be6f7%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a421efd7-1590-44f3-a041-60497db9c7a7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Load ES Nested Mapping from Hive

2015-03-24 Thread Ankur Wahi
Hi,

I'm trying to load an ES nested mapping from Hive.
My tables are listed below:

CREATE TABLE ORDER(
  order_id bigint,
  total_amount bigint,
  customer bigint)
STORED AS ORC

CREATE TABLE ORDER_DETAILS(
  order_id bigint,
  Order_Item_id bigint,
  Item_Size bigint,
  Item_amount bigint,
  Item_type string)
STORED AS ORC

Order and Order_Details have a 1-M relationship

I want to store the data above into Elasticsearch, such that Order_Details 
is a nested mapping of Order.

My ES mapping would look somewhat like this:

{
  "order" : {
    "properties" : {
      "orderid" : { "type" : "float" },
      "total_amount" : { "type" : "float" },
      "customer" : { "type" : "float" },
      "order_details" : {
        "type" : "nested",
        "properties" : {
          "Order_Item_id" : { "type" : "float" },
          "Item_Size" : { "type" : "float" },
          "Item_amount" : { "type" : "float" },
          "Item_type" : { "type" : "string" }
        }
      }
    }
  }
}


My question is: how do I convert the 1-M relation in Hive and load it into 
this ES mapping?

Possible solution:
I will create another Hive table which will contain the join of the 
two source Hive tables.
But what should the structure of this Hive table be, so that it maps to the 
ES mapping? (One possible shape is sketched below.)
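
A hedged sketch of that shape (elasticsearch-hadoop's Hive integration maps an 
ARRAY of STRUCTs onto an array of objects, which the mapping above then treats 
as nested; table and column names reuse the ones above, while 'orders/order' 
and 'localhost:9200' are placeholders, and COLLECT_LIST over structs needs a 
reasonably recent Hive):

-- ES-backed Hive table; the order_details column mirrors the nested mapping
CREATE EXTERNAL TABLE ORDER_ES (
  order_id      BIGINT,
  total_amount  BIGINT,
  customer      BIGINT,
  order_details ARRAY<STRUCT<order_item_id:BIGINT, item_size:BIGINT,
                             item_amount:BIGINT, item_type:STRING>>)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'orders/order', 'es.nodes' = 'localhost:9200');

-- collapse the 1-M join into one row per order with an array of detail structs
INSERT OVERWRITE TABLE ORDER_ES
SELECT o.order_id, o.total_amount, o.customer,
       COLLECT_LIST(NAMED_STRUCT('order_item_id', d.order_item_id,
                                 'item_size',     d.item_size,
                                 'item_amount',   d.item_amount,
                                 'item_type',     d.item_type))
FROM `ORDER` o JOIN ORDER_DETAILS d ON o.order_id = d.order_id
GROUP BY o.order_id, o.total_amount, o.customer;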

Any help/suggestions?

Thanks
Ankur

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cdfee0e9-e0fa-47e4-b847-55b010984574%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Unexpected behavior (to me :-) from function_score

2015-03-24 Thread jan58 . wandelaar
Hi All,

Starting to play with function_score and doc_values...
I have a set of 30 documents with integers, ranging from 1 to 30.

Now I want a query that returns all documents with value 5 on top, followed 
by the rest of the documents, scored by their distance to 5.

Query:
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost_mode": "replace",
      "functions": [
        {
          "gauss": {
            "int_field": {
              "origin": 5,
              "scale": 5,
              "offset": 2,
              "decay": 0.1
            }
          }
        }
      ]
    }
  }

This returns the document with value 5 with a score of 1 (expected), followed by 
the rest of the documents with a score of 0 (NOT expected).

Does someone see what I'm doing wrong?

Looking at the explain, I see weird values for the doc_values.
Following is the explain for the top document (with 5 as its value). Note 
the stated values for doc_value and origin!
"_explanation": {
   "value": 1,
   "description": "function score, product of:",
   "details": [
      {
         "value": 1,
         "description": "Math.min of",
         "details": [
            {
               "value": 1,
               "description": "function score, score mode [multiply]",
               "details": [
                  {
                     "value": 1,
                     "description": "function score, product of:",
                     "details": [
                        {
                           "value": 1,
                           "description": "match filter: *:*"
                        },
                        {
                           "value": 1,
                           "description": "Function for field int_field:",
                           "details": [
                              {
                                 "value": 1,
                                 "description": "exp(-0.5*pow(MIN of: [Math.max(Math.abs(-6.20093664E13(=doc value) - -6.20093664E13(=origin))) - 2.0(=offset), 0)],2.0)/5.428681023790649)"
                              }
                           ]
                        }
                     ]
                  }
               ]
            },
            {
               "value": 3.4028235e+38,
               "description": "maxBoost"
            }
         ]
      },
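
A hedged aside on that doc value: -6.20093664E13 looks like epoch milliseconds 
(roughly the year 5), which is what you would see if int_field were actually 
mapped as a date rather than an integer; on that scale an origin of 5 and a 
scale of 5 are vanishingly small, so everything but the exact match decays to 
0. Worth checking the mapping (index and type names below are placeholders):

curl "localhost:9200/myindex/_mapping/mytype?pretty"
# if int_field comes back as a date, reindex it as an integer/long before
# using it as the axis for the gauss function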

Best regards,
Jan

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e2d438dd-26af-43df-b50d-21fca69121f6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: PUT gzipped data into elasticsearch

2015-03-24 Thread joergpra...@gmail.com
Logstash has both Java and HTTP output, but I assume you want to use HTTP.

Set the http.compression parameter to true in the ES configuration; then you can
use gzip-compressed HTTP traffic via the Accept-Encoding header.

http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-http.html
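
A minimal sketch of the two ends, assuming defaults otherwise:

# elasticsearch.yml
http.compression: true

# client side: --compressed sends Accept-Encoding and decompresses the reply
curl -s --compressed -v "localhost:9200/_search?pretty" 2>&1 | grep -i "content-encoding"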

Jörg

On Tue, Mar 24, 2015 at 2:30 PM, Marcel Matus matusmar...@gmail.com wrote:

 Hi Joe,
 no, we post data into elasticsearch using Logstash. Actually, the complete
 way is: APP - Logstash - RMQ - Logstash - ES.
 And when I include F5 VIP's, there are 6 services on the way to process
 these data...

 Marcel


 On Tue, Mar 24, 2015 at 1:50 PM, joergpra...@gmail.com 
 joergpra...@gmail.com wrote:

 Are you using Java API?

 Jörg

 On Tue, Mar 24, 2015 at 11:59 AM, Marcel Matus matusmar...@gmail.com
 wrote:

 Hi,
some of our data are big ones (1 - 10 MB), and if there are millions of
 those, it causes us trouble in our internal network.
 We would like to compress these data in generation time, and send over
 the network to elasticsearch in compressed state. I tried to find some
 solution, but usually I found only retrieving gzipped data. Is there a way,
 how to put compressed data into elasticsearch?

 Thanks,
 Marcel

 --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/bce57798-0b51-4c9e-a0ca-06e41b2ca160%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/bce57798-0b51-4c9e-a0ca-06e41b2ca160%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/iWT19g3_Viw/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFagGZYQ1wCZe%2BCMrUSUJ9gE7erjA7fk1aSOLfsdZkXjg%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFagGZYQ1wCZe%2BCMrUSUJ9gE7erjA7fk1aSOLfsdZkXjg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




 --
 Marcel Matus

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAHUcNyj4cAi0oDGqgLmeSxLwVwtCssF72pyhufKV8ML911auHA%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAHUcNyj4cAi0oDGqgLmeSxLwVwtCssF72pyhufKV8ML911auHA%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoG%3DzuBnV4LWr7ySj6arirFzjxgCARzq%2Bvh05mFqyJ7%2Bpg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.