Tips for failing handshake with shield

2015-02-11 Thread Jettro Coenradie
Hi,
I am trying to get SSL/TLS to work on my Raspberry Pi cluster. Too bad I get 
the following message:

[2015-02-11 17:51:35,037][ERROR][shield.transport.netty   ] [jc-pi-red] 
SSL/TLS handshake failed, closing channel: null

So I guess something is wrong in my certificate chain. I have created my 
own CA and followed the steps from the getting started guide. Any tips on 
how to start debugging?

thanks

Jettro

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/cf83e1d4-10b8-47b2-a5a0-c5c2491dc899%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch as Logwatch

2015-02-11 Thread Greg Murnane
I believe you can tell logwatch to output its reports as a file, which
could then be ingested with logstash.
Alternatively, logstash has an imap input that you could use to get emails
into Elasticsearch.
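A minimal sketch of the first approach. The paths, the logwatch flags, and the logstash 1.x-era output syntax are assumptions, not taken from the thread:

```shell
# Have logwatch write its daily report to a file instead of mailing it
# (flag names per the logwatch man page; path is an example)
logwatch --output file --filename /var/log/logwatch/report.txt

# Point a logstash file input at that report and ship it to Elasticsearch
cat > logwatch.conf <<'EOF'
input {
  file {
    path => "/var/log/logwatch/report.txt"
    start_position => "beginning"
  }
}
output {
  # logstash 1.x-era elasticsearch output syntax
  elasticsearch { host => "localhost" }
}
EOF
```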




is there any data size limit for backup and restore?

2015-02-11 Thread Subbarao Kondragunta

I have one index with more than 10 GB of data, and the total size of all 
indices with all documents is more than 50 GB.
Does Elasticsearch allow backing up and restoring this much data?
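There is no documented size limit at this scale; snapshot and restore is incremental at the file level. A sketch of the 1.x snapshot workflow — repository path and names are examples, not from the thread:

```shell
# Register a shared-filesystem snapshot repository (path is an example)
curl -XPUT 'localhost:9200/_snapshot/my_backup' -d '{
  "type": "fs",
  "settings": { "location": "/mnt/backups/my_backup" }
}'

# Snapshot all indices; wait_for_completion blocks until the snapshot is done
curl -XPUT 'localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

# Restore the snapshot later
curl -XPOST 'localhost:9200/_snapshot/my_backup/snapshot_1/_restore'
```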



Elasticon Agenda

2015-02-11 Thread Glen Smith
The currently published agenda:
http://www.elasticon.com/apex/Elastic_ON_Agenda
for Monday, March 9 shows just Registration Opens until 5pm.

If registration will really be the only official activity, traveling in on 
Monday might be most convenient.
Are there likely to be any other events added during that time?



Re: Elasticsearch as Logwatch

2015-02-11 Thread Taylor Wood
Greg, 

  I started researching exactly what you suggested: pumping the daily 
logwatch results into Elasticsearch.   I already have an ELK stack in place 
and working.  Do you know what is needed to implement this?   There is no 
logwatch or email input.

Thanks in advance.



ES OOMing and not triggering cache circuit breakers, using LocalManualCache

2015-02-11 Thread Wilfred Hughes
Hi all

I have an ES 1.2.4 cluster which is occasionally running out of heap. I 
have ES_HEAP_SIZE=31G and according to the heap dump generated, my biggest 
memory users were:

org.elasticsearch.common.cache.LocalCache$LocalManualCache 55%
org.elasticsearch.indices.cache.filter.IndicesFilterCache 11%

and nothing else used more than 1%.

It's not clear to me what this cache is. I can't find any references to 
ManualCache in the elasticsearch source code, and the docs: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/index-modules-fielddata.html
 
suggest to me that the circuit breakers should stop requests or reduce 
cache usage rather than OOMing.

At the moment my cache filled up, the node was actually trying to index 
some data:

[2015-02-11 08:14:29,775][WARN ][index.translog   ] [data-node-2] 
[logstash-2015.02.11][0] failed to flush shard on translog threshold
org.elasticsearch.index.engine.FlushFailedEngineException: 
[logstash-2015.02.11][0] Flush failed
at 
org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
at 
org.elasticsearch.index.shard.service.InternalIndexShard.flush(InternalIndexShard.java:604)
at 
org.elasticsearch.index.translog.TranslogService$TranslogBasedFlush$1.run(TranslogService.java:202)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: this writer hit an 
OutOfMemoryError; cannot commit
at 
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4416)
at 
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2989)
at 
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3096)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3063)
at 
org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:797)
... 5 more
[2015-02-11 08:14:29,812][DEBUG][action.bulk  ] [data-node-2] 
[logstash-2015.02.11][0] failed to execute bulk item (index) index 
{[logstash-2015.02.11][syslog_slurm][1
org.elasticsearch.index.engine.CreateFailedEngineException: 
[logstash-2015.02.11][0] Create failed for 
[syslog_slurm#12UUWk5mR_2A1FGP5W3_1g]
at 
org.elasticsearch.index.engine.internal.InternalEngine.create(InternalEngine.java:393)
at 
org.elasticsearch.index.shard.service.InternalIndexShard.create(InternalIndexShard.java:384)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:430)
at 
org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:158)
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction
at 
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:433)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
at 
org.apache.lucene.util.fst.BytesStore.writeByte(BytesStore.java:83)
at org.apache.lucene.util.fst.FST.init(FST.java:286)
at org.apache.lucene.util.fst.Builder.init(Builder.java:163)
at 
org.apache.lucene.codecs.BlockTreeTermsWriter$PendingBlock.compileIndex(BlockTreeTermsWriter.java:422)
at 
org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:572)
at 
org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter$FindBlocks.freeze(BlockTreeTermsWriter.java:547)
at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:214)
at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
at 
org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter.finishTerm(BlockTreeTermsWriter.java:1039)
at 
org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:548)
at 
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
at 
org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
at 
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:465)
at 
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:518)
at 

Upgrade to 1.4.2 from 1.1.1 causes spike in Disk IO

2015-02-11 Thread George Stathis
Hi there,

This past Monday we completed an ES upgrade from 1.1.1 to 1.4.2 and a 
Java upgrade from 7 to 8. No other changes: same boxes, same specs, no new 
queries or filters, no increased requests. What I see in Marvel since the 
upgrade is a large spike in disk IO:

https://lh6.googleusercontent.com/-i6OYBiqputk/VNuJC-47eII/A_g/TWxTcres-Uk/s1600/Screen%2BShot%2B2015-02-11%2Bat%2B11.37.58%2BAM.png

and a large dip in field data, filter cache and overall index memory 
consumption:

https://lh5.googleusercontent.com/-2DRu7af1npc/VNuJNqsvOQI/A_o/56trKYYMiQw/s1600/Screen%2BShot%2B2015-02-11%2Bat%2B11.37.47%2BAM.png

Should I be looking at the new circuit breaker 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html#circuit-breaker
 settings?

Any pointers are welcome!

-GS
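For reference, a sketch of inspecting and adjusting the 1.4.x fielddata circuit breaker. The 60% value is only an example, not a recommendation:

```shell
# Inspect per-node circuit breaker state (tripped counts, estimated sizes)
curl 'localhost:9200/_nodes/stats/breaker?pretty'

# Tighten the fielddata breaker dynamically (1.4.x setting name)
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": { "indices.breaker.fielddata.limit": "60%" }
}'
```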



Re: ES OOMing and not triggering cache circuit breakers, using LocalManualCache

2015-02-11 Thread Zachary Tong
LocalManualCache is a component of Guava's LRU cache 
https://code.google.com/p/guava-libraries/source/browse/guava-gwt/src-super/com/google/common/cache/super/com/google/common/cache/CacheBuilder.java,
 
which is used by Elasticsearch for both the filter and field data cache. 
 Based on your node stats, I'd agree it is the field data usage which is 
causing your OOMs.  CircuitBreaker helps prevent OOM, but it works on a 
per-request basis.  It's possible for individual requests to pass the CB 
because they use small subsets of fields, but over time the set of fields 
loaded into Field Data continues to grow and you'll OOM anyway.

I would prefer to set a field data limit, rather than an expiration.  A 
hard limit prevents OOM because you don't allow the cache to grow anymore. 
 An expiration does not guarantee that, since you could get a burst of 
activity that still fills up the heap and OOMs before the expiration can 
work.
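A sketch of the hard-limit approach Zachary describes. `indices.fielddata.cache.size` is a static node setting in 1.x (it requires a restart), and the 40% figure is an example, not a recommendation:

```shell
# Cap the field data cache with a hard size limit instead of an expiration.
# Node-level setting in elasticsearch.yml; takes effect after restart.
echo 'indices.fielddata.cache.size: 40%' >> /etc/elasticsearch/elasticsearch.yml
```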

-Z

On Wednesday, February 11, 2015 at 12:50:45 PM UTC-5, Wilfred Hughes wrote:

 After examining some other nodes that were using a lot of their heap, I 
 think this is actually field data cache:


 $ curl "http://localhost:9200/_cluster/stats?human&pretty"
 ...
 "fielddata": {
   "memory_size": "21.3gb",
   "memory_size_in_bytes": 22888612852,
   "evictions": 0
 },
 "filter_cache": {
   "memory_size": "6.1gb",
   "memory_size_in_bytes": 6650700423,
   "evictions": 12214551
 },

 Since this is storing logstash data, I'm going to add the following lines 
 to my elasticsearch.yml and see if I observe a difference once deployed to 
 production.

 # Don't hold field data caches for more than a day, since data is
 # grouped by day and we quickly lose interest in historical data.
 indices.fielddata.cache.expire: 1d


 On Wednesday, 11 February 2015 16:29:22 UTC, Wilfred Hughes wrote:

 Hi all

 I have an ES 1.2.4 cluster which is occasionally running out of heap. I 
 have ES_HEAP_SIZE=31G and according to the heap dump generated, my biggest 
 memory users were:

 org.elasticsearch.common.cache.LocalCache$LocalManualCache 55%
 org.elasticsearch.indices.cache.filter.IndicesFilterCache 11%

 and nothing else used more than 1%.

 It's not clear to me what this cache is. I can't find any references to 
 ManualCache in the elasticsearch source code, and the docs: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/index-modules-fielddata.html
  
 suggest to me that the circuit breakers should stop requests or reduce 
 cache usage rather than OOMing.

 At the moment my cache was filled up, the node was actually trying to 
 index some data:

 [stack trace snipped]

Groovy scripting vulnerability

2015-02-11 Thread elasticsearch
Hi all

Elasticsearch versions 1.3.0-1.3.7 and 1.4.0-1.4.2 have a vulnerability in 
the Groovy scripting engine. The vulnerability allows an attacker to 
construct Groovy scripts that escape the sandbox and execute shell commands 
as the user running the Elasticsearch Java VM.

We have released Elasticsearch 1.3.8 and 1.4.3 to address this issue. 
Please read the blogpost and either upgrade or update your config to 
disable dynamic Groovy scripting:

https://www.elasticsearch.org/blog/elasticsearch-1-4-3-and-1-3-8-released/

thanks

Clint



Re: php api and self signed certs

2015-02-11 Thread Zachary Tong
Glad you got it working!  Just a note, I don't monitor the ES mailing list 
very closely... but if you have any more problems/questions with the PHP 
client in the future, feel free to open a ticket at the repo and I would 
be happy to help :)

-Z

On Wednesday, February 11, 2015 at 7:09:57 AM UTC-5, Paul Halliday wrote:

 That is the hint I needed.

 I had it w/o quotes at first and the page was failing. I just assumed this 
 was because of the lack of quotes, which once added allowed the page to 
 load. I should have looked at the error console, which was showing the 
 missing include :)

 All is good and working as expected.

 On Tuesday, February 10, 2015 at 11:17:20 PM UTC-4, Ian MacLennan wrote:

 I suspect that it is because you put your constant names in quotes, and 
 as a result the keys are going to be wrong, i.e. you 
 have 'CURLOPT_SSL_VERIFYPEER' => true, while it should probably be 
 CURLOPT_SSL_VERIFYPEER => true as shown in the link you provided.

 Ian

 On Tuesday, February 10, 2015 at 10:42:05 AM UTC-5, Paul Halliday wrote:

 Well, I got it working but had to make the declarations in CurlHandle.php

 I added:

 $curlOptions[CURLOPT_SSL_VERIFYPEER] = true;
 $curlOptions[CURLOPT_CAINFO] = '/full/path/to/cacert.pem';

 directly above:

 curl_setopt_array($handle, $curlOptions);
 return new static($handle, $curlOptions);
 ...

 What's strange is that I also tried placing them in the default 
 $curlOptions but those seemed to get dumped somewhere prior to 
 curl_setopt_array()

 Yucky, but working.


 On Monday, February 9, 2015 at 7:05:34 PM UTC-4, Paul Halliday wrote:

 Hi,

 I am trying to get the configuration example from here to work:  


 http://www.elasticsearch.org/guide/en/elasticsearch/client/php-api/current/_security.html#_guzzleconnection_self_signed_certificate

 My settings look like this:

 // Elasticsearch
 $clientparams = array();
 $clientparams['hosts'] = array(
     'https://host01:443'
 );

 $clientparams['guzzleOptions'] = array(
     '\Guzzle\Http\Client::SSL_CERT_AUTHORITY' => 'system',
     '\Guzzle\Http\Client::CURL_OPTIONS' => [
         'CURLOPT_SSL_VERIFYPEER' => true,
         'CURLOPT_SSL_VERIFYHOST' => 2,
         'CURLOPT_CAINFO' => '.inc/cacert.pem',
         'CURLOPT_SSLCERTTYPE' => 'PEM',
     ]
 );

 The error:

 PHP Fatal error:  Uncaught exception 
 'Elasticsearch\\Common\\Exceptions\\TransportException' with message 'SSL 
 certificate problem: self signed certificate'

 What did I miss?

 Thanks!






Re: Failed Shard Recovery

2015-02-11 Thread Mario Rodriguez
Bumping up.
Does anyone have insight into this?

Mario

On Tuesday, February 10, 2015 at 3:02:12 PM UTC-6, Mario Rodriguez wrote:



 On Tuesday, February 10, 2015 at 2:55:59 PM UTC-6, Mario Rodriguez wrote:

 I have 7 indices in our ES cluster, and one of the indices has an issue 
 that is preventing recovery.

 Here is what I am seeing in the log:

 [2015-02-10 00:00:02,483][WARN ][indices.recovery ] [ES_Server1] 
 [prodcustomer][1] recovery from 
 [[ES_Server2][QDf3ZP3tQ3Kgund8YX2BBQ][ES_Server2][inet[/5.5.5.5:9300]]] 
 failed
 org.elasticsearch.transport.RemoteTransportException: 
 [ES_Server2][inet[/5.5.5.5:9300]][index/shard/recovery/startRecovery]
 Caused by: org.elasticsearch.index.engine.RecoveryEngineException: 
 [prodcustomer][1] Phase[1] Execution failed
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1072)
 at 
 org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:636)
 at 
 org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:135)
 at 
 org.elasticsearch.indices.recovery.RecoverySource.access$2500(RecoverySource.java:72)
 at 
 org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:440)
 at 
 org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:426)
 at 
 org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: 
 org.elasticsearch.indices.recovery.RecoverFilesRecoveryException: 
 [prodcustomer][1] Failed to transfer [0] files with total size of [0b]
 at 
 org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:280)
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1068)
 ... 9 more
 Caused by: java.io.EOFException: read past EOF: 
 MMapIndexInput(path=D:\ElasticSearchData\new_es_cluster\nodes\0\indices\prodcustomer\1\index\_checksums-1418875637019)
 at 
 org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:81)
 at org.apache.lucene.store.DataInput.readInt(DataInput.java:96)
 at 
 org.apache.lucene.store.ByteBufferIndexInput.readInt(ByteBufferIndexInput.java:132)
 at 
 org.elasticsearch.index.store.Store$MetadataSnapshot.readLegacyChecksums(Store.java:523)
 at 
 org.elasticsearch.index.store.Store$MetadataSnapshot.buildMetadata(Store.java:438)
 at 
 org.elasticsearch.index.store.Store$MetadataSnapshot.init(Store.java:433)
 at org.elasticsearch.index.store.Store.getMetadata(Store.java:144)
 at 
 org.elasticsearch.indices.recovery.RecoverySource$1.phase1(RecoverySource.java:145)
 ... 10 more
 [2015-02-10 00:00:02,483][WARN ][indices.cluster  ] [ES_Server1] 
 [prodcustomer][1] failed to start shard
 org.elasticsearch.indices.recovery.RecoveryFailedException: 
 [prodcustomer][1]: Recovery failed from 
 [ES_Server2][QDf3ZP3tQ3Kgund8YX2BBQ][ES_Server2][inet[/5.5.5.5:9300]] 
 into 
 [ES_Server1][R3ArnIZsSVSsd-VLEaU_Ug][ES_Server1][inet[ES_Server1.verify.local/
 4.4.4.4:9300]]
 at 
 org.elasticsearch.indices.recovery.RecoveryTarget.doRecovery(RecoveryTarget.java:306)
 at 
 org.elasticsearch.indices.recovery.RecoveryTarget.access$200(RecoveryTarget.java:65)
 at 
 org.elasticsearch.indices.recovery.RecoveryTarget$3.run(RecoveryTarget.java:184)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: org.elasticsearch.transport.RemoteTransportException: 
 [ES_Server2][inet[/5.5.5.5:9300]][index/shard/recovery/startRecovery]
 Caused by: org.elasticsearch.index.engine.RecoveryEngineException: 
 [prodcustomer][1] Phase[1] Execution failed
 at 
 org.elasticsearch.index.engine.internal.InternalEngine.recover(InternalEngine.java:1072)
 at 
 org.elasticsearch.index.shard.service.InternalIndexShard.recover(InternalIndexShard.java:636)
 at 
 org.elasticsearch.indices.recovery.RecoverySource.recover(RecoverySource.java:135)
 at 
 org.elasticsearch.indices.recovery.RecoverySource.access$2500(RecoverySource.java:72)
 at 
 org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:440)
 at 
 org.elasticsearch.indices.recovery.RecoverySource$StartRecoveryTransportRequestHandler.messageReceived(RecoverySource.java:426)
 at 
 org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:275)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 

Re: ES OOMing and not triggering cache circuit breakers, using LocalManualCache

2015-02-11 Thread Wilfred Hughes
After examining some other nodes that were using a lot of their heap, I 
think this is actually field data cache:


$ curl "http://localhost:9200/_cluster/stats?human&pretty"
...
"fielddata": {
  "memory_size": "21.3gb",
  "memory_size_in_bytes": 22888612852,
  "evictions": 0
},
"filter_cache": {
  "memory_size": "6.1gb",
  "memory_size_in_bytes": 6650700423,
  "evictions": 12214551
},

Since this is storing logstash data, I'm going to add the following lines 
to my elasticsearch.yml and see if I observe a difference once deployed to 
production.

# Don't hold field data caches for more than a day, since data is
# grouped by day and we quickly lose interest in historical data.
indices.fielddata.cache.expire: 1d


On Wednesday, 11 February 2015 16:29:22 UTC, Wilfred Hughes wrote:

 Hi all

 I have an ES 1.2.4 cluster which is occasionally running out of heap. I 
 have ES_HEAP_SIZE=31G and according to the heap dump generated, my biggest 
 memory users were:

 org.elasticsearch.common.cache.LocalCache$LocalManualCache 55%
 org.elasticsearch.indices.cache.filter.IndicesFilterCache 11%

 and nothing else used more than 1%.

 It's not clear to me what this cache is. I can't find any references to 
 ManualCache in the elasticsearch source code, and the docs: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/index-modules-fielddata.html
  
 suggest to me that the circuit breakers should stop requests or reduce 
 cache usage rather than OOMing.

 At the moment my cache was filled up, the node was actually trying to 
 index some data:

 [stack trace snipped]

Re: Tips for failing handshake with shield

2015-02-11 Thread Jay Modi
There should be another message on the other end of the connection that has 
more details of the actual failure. That message can be seen on the client 
side of the connection when the connection was closed by the server side.

Also, please 
see 
http://www.elasticsearch.org/guide/en/shield/current/trouble-shooting.html#_sslhandshakeexception_causing_connections_to_fail
 
for common problems and some tips on how to resolve them.
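In addition to the server-side log, the certificate chain can be inspected directly with openssl. Hostname, port, and file names below are examples (9300 assumes the default transport port):

```shell
# Show the certificate chain the node presents on the transport port
openssl s_client -connect jc-pi-red:9300 -showcerts

# Check that the node certificate actually chains to your CA
openssl verify -CAfile ca.pem node.crt
```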

On Wednesday, February 11, 2015 at 8:54:27 AM UTC-8, Jettro Coenradie wrote:

 Hi,
 I am trying to get SSL/TLS to work on my Raspberry Pi cluster. Too bad I 
 get the following message:

 [2015-02-11 17:51:35,037][ERROR][shield.transport.netty   ] [jc-pi-red] 
 SSL/TLS handshake failed, closing channel: null

 So I guess something is wrong in my certificate chain. I have created my 
 own CA and followed the steps from the getting started guide. Any tips on 
 how to start debugging?

 thanks

 Jettro




SCAN type search and filtered query

2015-02-11 Thread Nicolas
Hi everyone,

I would like to use a scan search with a filtered query but it doesn't seem 
to be possible.

With a match_all query I get (size * number of shards) hits.
With a filtered query I get only (size) hits in total, which is why I think 
the scan search type is being disabled.

Here are the queries I'm using:

curl -XGET 'http://mydomain/mytype/_search?search_type=scan&scroll=1m' -d '
{
   "size": 1,
   "query": {
      "match_all": {}
   }
}'

and

curl -XGET 'http://mydomain/mytype/_search?search_type=scan&scroll=1m' -d '
{
   "size": 1,
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "bool": {
               "must": [
                  {
                     "range": {
                        "createdAt": {
                           "from": 2,
                           "to": 1423673779284,
                           "include_lower": true,
                           "include_upper": false
                        }
                     }
                  },
                  {
                     "terms": {
                        "id": [
                           94839384
                        ]
                     }
                  }
               ]
            }
         }
      }
   }
}'
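One thing worth noting about scan in 1.x: the first response carries only a `_scroll_id` and no hits; the documents come from the scroll endpoint. A sketch (SCROLL_ID_HERE stands for the id returned by the first request):

```shell
# Fetch the next batch of scan results by posting the scroll id
# to the root _search/scroll endpoint (1.x API: id goes in the body)
curl -XGET 'http://mydomain/_search/scroll?scroll=1m' -d 'SCROLL_ID_HERE'
```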




Re: Phantom Snapshot Process

2015-02-11 Thread Matt Hughes
Found the answer.  The /_cluster/state endpoint has a list of all snapshots 
in progress.  For whatever reason, mine was stuck in INIT for a very long 
time.  I deleted via the /_snapshot/repo/snapshot_name endpoint.
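
To illustrate the answer, here is a sketch of pulling the in-progress entries out of a `/_cluster/state` response once it has been parsed into a dict; the exact location and field names of the snapshots section are an assumption (based on the 1.x cluster-state custom), so adjust to what your cluster actually returns:

```python
# Extract (snapshot name, state) pairs from a parsed /_cluster/state body.
def snapshots_in_progress(cluster_state):
    entries = (cluster_state.get("metadata", {})
                            .get("snapshots", {})
                            .get("snapshots", []))
    return [(e.get("snapshot"), e.get("state")) for e in entries]

# Minimal fake state document showing a snapshot stuck in INIT:
state = {"metadata": {"snapshots": {"snapshots": [
    {"repository": "repo", "snapshot": "hourly-2015.02.11", "state": "INIT"},
]}}}
print(snapshots_in_progress(state))  # [('hourly-2015.02.11', 'INIT')]
```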

On Wednesday, February 11, 2015 at 9:31:10 AM UTC-5, Matt Hughes wrote:

 My ES cluster seems to think it is in the middle of creating a snapshot, yet 
 I don't see any IN_PROGRESS snapshots in my repository.  It's supposed to 
 be an hourly snapshot, but I don't see anything that has started within a 
 few days.

 Yet every time I try and do anything with the repository, it says there is 
 a snapshot in progress.  

 Is there any way to query ES that reveals *what* snapshot is currently in 
 progress?  I only perform snapshots on my master, yet 'jstack' shows 
 nothing related to snapshotting going on.  


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/423e436f-1f78-493b-8483-9f60188d989c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Elasticsearch search result problem (Urgently)

2015-02-11 Thread Veli Can TEFON
Hello,

I add data to Elasticsearch with this code:

$data = array("name" => HelperMC::strToUtf8(trim($name)),
    "urlname" => HelperURL::strToURL(trim($name)),
    "price" => number_format(HelperParser::clearTextFromPrice($price), 2, ',', '.'),
    "price_for_filter" => HelperParser::clearTextFromPrice($price),
    "image" => $imgname,
    "description" => HelperMC::strToUtf8($description),
    "keywords" => HelperMC::strToUtf8($keywords),
    "url" => $k['url'],
    "sitemap_id" => intval($sitemapId),
    "shop_id" => $shop['shop_id'],
    "logo" => $shop['logo'],
    "key" => $hashKey,
    "type" => intval(1),
    "shop_rating_count" => intval($shop['rating_count']),
    "shop_rating_point" => intval($shop['rating_point']),
    "created_at" => date("Y-m-d H:i:s"),
    "updated_at" => date("Y-m-d H:i:s"));
   
//create elasticsearch index
HelperES::insertEntry(json_encode($data),$hashKey); 



My insertEntry Function:

private static $elasticBase = "http://localhost:9200/shopping/items";

public static function insertEntry($data_string, $id = false){
    if($id)
        $url = self::$elasticBase."/".$id;
    else
        $url = self::$elasticBase;

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    @$result = curl_exec($ch);
    curl_close($ch);

    return $result;
}



and my search function:

public static function search($keyword, $defaultOperator = "OR", $start = 0,
        $size = 25, $from = 0, $to = 1, $shopsArr = array(),
        $orderBy = "price", $order = "asc"){
    $searchUrl = self::$elasticBase."/_search";

    if(count($shopsArr) > 0){
        // "must" is written as an array here: repeating the "must" key in
        // one JSON object would make the second entry overwrite the first.
        $query = '{
            "from": '.$start.', "size": '.$size.',
            "query": {
                "filtered": {
                    "query": {
                        "query_string": {
                            "query": "'.$keyword.'",
                            "fields": ["name", "urlname"],
                            "default_operator": "'.$defaultOperator.'"
                        }
                    },
                    "filter": {
                        "bool": {
                            "must": [
                                { "terms": { "sitemap_id": '.json_encode($shopsArr).' } },
                                { "range": { "price_for_filter": {
                                    "from": '.$from.',
                                    "to": '.$to.'
                                } } }
                            ]
                        }
                    }
                }
            },
            "sort": [
                { "'.$orderBy.'": { "order": "'.$order.'", "mode": "avg" } }
            ]
        }';
    }else{
        $query = '{
            "from": '.$start.', "size": '.$size.',
            "query": {
                "filtered": {
                    "query": {
                        "query_string": {
                            "query": "'.$keyword.'",
                            "fields": ["name", "urlname"],
                            "default_operator": "'.$defaultOperator.'"
                        }
                    },
                    "filter": {
                        "range": {
                            "price_for_filter": {
                                "from": '.$from.',
                                "to": '.$to.'
                            }
                        }
                    }
                }
            },
            "sort": [
                { "'.$orderBy.'": { "order": "'.$order.'", "mode": "avg" } }
            ]
        }';
    }


echo $query;


$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $searchUrl);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, $query);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 

Re: Large results sets and paging for Aggregations

2015-02-11 Thread Mark Harwood


 1.) Would I assume that as my document count would increase, the time for 
 aggregation calculation would as well increase? 


Yes - documents are processed serially through the tree, but this happens 
very quickly, and each shard does it in parallel to produce its own summary, 
which is the key to scaling if things start slowing down.

 

 2.) How would I relate this analogy with sub aggregations.


Each row of pins in the bean machine represents another decision point for 
the direction of travel - this is the equivalent of a sub-aggregation. The 
difference is that in the bean machine each layer offers only a choice of 2 
directions - the ball goes left or right around a pin. In your first layer 
of aggregation there are not 2 but 5k choices of direction - one bucket for 
each member. Each member bucket is further broken down by 12 months. 

 

 My observation says that as you increase the number of child aggregations, 
 so it increases the execution time along with memory utilization. What 
 happens in case of sub aggregations?


Hopefully the previous comment should have made that clear. Each sub agg 
represents an additional decision point that has to be negotiated by each 
document - consider direction based on member ID, then next direction based 
on month.

 

 3.) I didn't get your last statement:
  There is however a fixed overhead for all queries which *is* a 
 function of number of docs and that is the Field Data cache required to 
 hold the dates/member IDs in RAM - if this becomes a problem then you may 
 want to look at on-disk alternative structure in the form of DocValues.


We are fast at routing docs through the agg tree because we can look up 
things like memberID for each matching doc really quickly and then 
determine the appropriate direction of travel. We rely on these look-up 
stores being pre-warmed (i.e. held in Field Data arrays in the JVM, or 
cached by the file system when using the DocValues disk-based equivalents) to 
keep our queries fast.
They are a fixed cost shared by all queries that make use of them.
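
The bean-machine point above can be sketched in a few lines: working memory for the agg tree scales with the number of buckets (pins), not with the number of documents (balls) dropped through them. A toy model, not Elasticsearch internals:

```python
import random
from collections import defaultdict

# One counter per (member_id, month) bucket, incremented as each doc
# "falls through" the tree.
def aggregate(docs):
    buckets = defaultdict(int)
    for member_id, month in docs:
        buckets[(member_id, month)] += 1
    return buckets

members, months = 5000, 12
docs = [(random.randrange(members), random.randrange(months))
        for _ in range(100000)]
buckets = aggregate(docs)

# However many docs we pour in, at most members * months = 60k counters exist:
assert len(buckets) <= members * months
```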

 

  
  4.) Off the topic, but I guess best to ask it here since we are talking 
 about it. :) - DocValues - Since it was introduced in 1.0.0 and most of our 
 mapping was defined in ES 0.9, can I change the mapping of existing fields 
 now? Might be I can take this conversation in another thread but would love 
 to hear about 1-3 points. You made this thread very interesting for me.


I recommend you shift that off to another topic.
 


 Thanks
 Piyush 



 On Wednesday, 11 February 2015 15:12:37 UTC+5:30, Mark Harwood wrote:

 5k doesn't sound  too scary.

 Think of the aggs tree like a Bean Machine [1] - one of those wooden 
 boards with pins arranged on it like a christmas tree and you drop balls at 
 the top of the board and they rattle down a choice of path to the bottom.
 In the case of aggs, your buckets are the pins and documents are the balls

 The memory requirement for processing the agg tree is typically the 
 number of pins, not the number of balls you drop into the tree as these 
 just fall out of the bottom of the tree.
 So in your case it is 5k members multiplied by 12 months each = 60k 
 unique buckets, each of which will maintain a counter of how many docs pass 
 through that point. So you could pass millions or billions of docs through 
 and the working memory requirement for the query would be the same.
 There is however a fixed overhead for all queries which *is* a function 
 of number of docs and that is the Field Data cache required to hold the 
 dates/member IDs in RAM - if this becomes a problem then you may want to 
 look at on-disk alternative structure in the form of DocValues.

 Hope that helps.

 [1] http://en.wikipedia.org/wiki/Bean_machine

 On Wednesday, February 11, 2015 at 7:04:04 AM UTC, piyush goyal wrote:

 Hi Mark,

 Before getting into queries, here is a little bit info about the project:

 1.) A community where members keep on increasing, decreasing and 
 changing. Maintained in a different type.
 2.) Approximately 3K to 4K documents of data of each user inserted into 
 ES per month in a different type maintained by member ID.
 3.) Mapping is flat, there are no nested and array type of data.

 Requirement:

 Here is a sample requirement:

 1.) Getting a report against each member ID against the count of data 
 for last three month.
 2.) Query used to get the data is:

 {
   "query": {
     "constant_score": {
       "filter": {
         "bool": {
           "must": [
             { "term": { "datatype": "XYZ" } },
             {
               "range": {
                 "response_timestamp": {
                   "from": "2014-11-01",
                   "to": "2015-01-31"
                 }
               }
             }
           ]
         }
       }
     }
   },
   "aggs": {
     "memberIDAggs": {
       "terms": {
         "field": "member_id",
         "size": 0
       },
       "aggs": {
         "dateHistAggs": {
           "date_histogram": {
 

Delete index after backup

2015-02-11 Thread Yarden Bar
Hi all,

Is there any procedure to *archive* an index to HDFS (or another repo) and 
then delete it?

I've looked into the 'Snapshot/Restore' docs and understood that if I 
execute snapshot - delete index - snapshot, the index deletion will be 
reflected in the second snapshot.
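
For context, an HDFS snapshot repository is registered with a body along these lines (this assumes the elasticsearch-repository-hdfs plugin is installed; the repository name, URI, and path shown are illustrative, not from the post):

```
PUT /_snapshot/my_hdfs_repo
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020",
    "path": "elasticsearch/snapshots"
  }
}
```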

Thanks,
Yarden


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2fca7f58-4fc3-475d-92dd-3d06b486b81d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: jdbc river won't start

2015-02-11 Thread Abid Hussain
That was it. I deleted the river using
DELETE /_river/my_river/_meta

but that was wrong, I should have used:
DELETE /_river/my_river/

Thanks!


Am Mittwoch, 11. Februar 2015 15:13:47 UTC+1 schrieb Jörg Prante:

 You are sure you executed

 curl -XDELETE 'localhost:9200/_river/my_river/'

 before creating the new river with same name? I see there is '_version: 
 12', this means, you did not stop the old river instance with DELETE.

 Jörg

 On Wed, Feb 11, 2015 at 2:38 PM, Abid Hussain hus...@novacom.mygbiz.com 
 javascript: wrote:

 Hi,

 we are running ES on a single node cluster.

 I'm facing the issue that a river is not being started after being 
 re-created. In detail, we first delete the index the river refers to, then 
 delete the river and then create it again.

 *GET /_river/my_river/_meta*
 {
_index: _river,
_type: my_river,
_id: _meta,
_version: 12,
found: true,
_source: {
   type: jdbc,
   jdbc: {
  driver: com.mysql.jdbc.Driver,
  url: jdbc:mysql://localhost:3306/my_db,
  user: user,
  password: ***,
  index: my_index,
  type: my_customer,
  sql: SELECT * FROM my_customer
   }
}
 }

 Unfortunately, the river is not being started after creating it. When I 
 request the river state it tells me that it ran last time yesterday:
 {
state: [
   {
  name: my_river,
  type: jdbc,
  started: 2015-02-10T01:00:06.997Z,
  last_active_begin: 2015-02-10T01:00:07.030Z,
  last_active_end: 2015-02-10T01:53:34.364Z,
  map: {
 lastExecutionStartDate: 1423530007825,
 aborted: false,
 lastEndDate: 1423533216703,
 counter: 1,
 lastExecutionEndDate: 1423533214363,
 lastStartDate: 1423530007608,
 suspended: false
  }
   }
]
 }

 The connection properties didn't change since the last successful run.

 The logs say nothing, except for the message after deleting the index.

 Any idea what could be the problem?

 Regards,

 Abid

 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/5e8ffbc7-c33f-4157-b8fe-a170d38fc1fc%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/5e8ffbc7-c33f-4157-b8fe-a170d38fc1fc%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2b5be310-90cb-474a-82fd-cd233573d78c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Delete index after backup

2015-02-11 Thread Yarden Bar
but will the index deletion get reflected in the next snapshot invocation?

On Wednesday, February 11, 2015 at 5:11:23 PM UTC+2, David Pilato wrote:

 Removing an index does not remove a snapshot.

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 Le 11 févr. 2015 à 16:07, Yarden Bar ayash@gmail.com javascript: 
 a écrit :

 HI all,

 Is there any procedure to *archive* index to HDFS (or other repo) and 
 delete it?

 I've looked into the 'Snapshot/Restore' docs and understood that if I 
 execute snapshot - delete index - snapshot the index deletion will get 
 reflected in the snapshot process.

 Thanks,
 Yarden



 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com javascript:.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/2fca7f58-4fc3-475d-92dd-3d06b486b81d%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/2fca7f58-4fc3-475d-92dd-3d06b486b81d%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/58a39a47-6dc2-4f4e-9515-9895dc38fbcf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Phantom Snapshot Process

2015-02-11 Thread Matt Hughes
My ES cluster seems to think it is in the middle of creating a snapshot, yet 
I don't see any IN_PROGRESS snapshots in my repository.  It's supposed to 
be an hourly snapshot, but I don't see anything that has started within a 
few days.

Yet every time I try and do anything with the repository, it says there is 
a snapshot in progress.  

Is there any way to query ES that reveals *what* snapshot is currently in 
progress?  I only perform snapshots on my master, yet 'jstack' shows 
nothing related to snapshotting going on.  

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/fe880960-c717-4cb4-8fa7-963602fdb58e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Elasticsearch search result problem

2015-02-11 Thread Veli Can TEFON
I think the problem is price_for_filter, but this works - http://goo.gl/AuxCrR


11 Şubat 2015 Çarşamba 06:09:13 UTC+2 tarihinde Veli Can TEFON yazdı:

 Hello,

 I add data add Elasticsearch this code :

 $data = array(name=HelperMC::strToUtf8(trim($name)),
 urlname=HelperURL::strToURL(trim($name)),
 price=number_format(HelperParser::clearTextFromPrice($price
 ), 2, ',', '.'),
 price_for_filter=HelperParser::clearTextFromPrice($price),
 image=$imgname,
 description=HelperMC::strToUtf8($description),
 keywords=HelperMC::strToUtf8($keywords),
 url=$k['url'],
 sitemap_id=intval($sitemapId),
 shop_id=$shop['shop_id'],
 logo=$shop['logo'],
 key=$hashKey,
 type=intval(1),
 shop_rating_count=intval($shop['rating_count']),
 shop_rating_point=intval($shop['rating_point']),
 created_at=date(Y-m-d H:i:s),
 updated_at=date(Y-m-d H:i:s));

 //create elasticsearch index
 HelperES::insertEntry(json_encode($data),$hashKey); 



 My insertEntry Function:

 private static $elasticBase = http://localhost:9200/shopping/items;;

 public static function insertEntry($data_string,$id = false){
 if($id)
 $url = self::$elasticBase./.$id;
 else
 $url = self::$elasticBase;

 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL , $url);
 curl_setopt($ch, CURLOPT_CUSTOMREQUEST, POST);  
 curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string); 
 curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
 curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
 curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
 @$result = curl_exec($ch);
 curl_close($ch);

 return $result;
 }



 and my search Function :

 public static function search($keyword, $defaultOperator = OR ,$start 
 = 0, $size = 25, $from = 0, $to = 1, $shopsArr = array(),$orderBy = 
 price, $order = asc){
 $searchUrl = self::$elasticBase./_search;

 if(count($shopsArr)  0){
 $query = '{
 from : '.$start.', size : '.$size.',
 query : {
 filtered : {
  query: {
 query_string: { query: '.$keyword
 .',
 fields: [name,urlname],
 default_operator: '.
 $defaultOperator.'
 }
   },
 filter : {
  bool : {
 must : {
 terms : { sitemap_id : 
 '.json_encode($shopsArr).' }
 },
 must : {
 range : {
 
 price_for_filter : {
 from : '
 .$from.',
 to  : '
 .$to.'
 }
   }
 }
 }
 }
 }
 },
 sort : [
   {'.$orderBy.' : {order : '.
 $order.', mode : avg}}
  ]
 }';
 }else{
 $query = '{
 from : '.$start.', size : '.$size.',
 query : {
 filtered : {
  query: {
 query_string: { 
 query: '.$keyword.',
 fields: [name,urlname],
 default_operator: '.
 $defaultOperator.'
 }
   },
 filter : {
 range : {
 price_for_filter : {
 from : '.$from.',
 to  : '.$to.'
 }
 }
 }
 }
 },
 sort : [
  {'.$orderBy.' : {order : '.$order.', 
 mode : avg}}
  ]
 }';
 }
 

 

Re: jdbc river won't start

2015-02-11 Thread joergpra...@gmail.com
Are you sure you executed

curl -XDELETE 'localhost:9200/_river/my_river/'

before creating the new river with the same name? I see '_version:
12', which means you did not stop the old river instance with DELETE.

Jörg

On Wed, Feb 11, 2015 at 2:38 PM, Abid Hussain huss...@novacom.mygbiz.com
wrote:

 Hi,

 we are running ES on a single node cluster.

 I'm facing the issue that a river is not being started after being
 re-created. In detail, we first delete the index the river refers to, then
 delete the river and then create it again.

 *GET /_river/my_river/_meta*
 {
_index: _river,
_type: my_river,
_id: _meta,
_version: 12,
found: true,
_source: {
   type: jdbc,
   jdbc: {
  driver: com.mysql.jdbc.Driver,
  url: jdbc:mysql://localhost:3306/my_db,
  user: user,
  password: ***,
  index: my_index,
  type: my_customer,
  sql: SELECT * FROM my_customer
   }
}
 }

 Unfortunately, the river is not being started after creating it. When I
 request the river state it tells me that it ran last time yesterday:
 {
state: [
   {
  name: my_river,
  type: jdbc,
  started: 2015-02-10T01:00:06.997Z,
  last_active_begin: 2015-02-10T01:00:07.030Z,
  last_active_end: 2015-02-10T01:53:34.364Z,
  map: {
 lastExecutionStartDate: 1423530007825,
 aborted: false,
 lastEndDate: 1423533216703,
 counter: 1,
 lastExecutionEndDate: 1423533214363,
 lastStartDate: 1423530007608,
 suspended: false
  }
   }
]
 }

 The connection properties didn't change since the last successful run.

 The logs say nothing, except for the message after deleting the index.

 Any idea what could be the problem?

 Regards,

 Abid

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/5e8ffbc7-c33f-4157-b8fe-a170d38fc1fc%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/5e8ffbc7-c33f-4157-b8fe-a170d38fc1fc%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFXKzzVP%3DP3Og4OpUCQDVa8MNg5%2BCPpCAo-S5upy_VdSg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Weird problem with routing performance

2015-02-11 Thread bizzorama
Hi all,

I have a weird (at least in my opinion) situation in my environment.

my hardware:
3 data nodes with 32GB of RAM (16GB of Java heap for ES)

using data from logstash, stored in a per-day manner
with 360 indexes of 5 shards + 1 replica - *NO ROUTING* - grouped by an alias 
named *norouting*
and the same indexes (different names, of course) reindexed with a new mapping:
360 indexes of 4 shards + 1 replica - *WITH ROUTING* - grouped by an alias 
named *routing*

about *2 billion* docs,
about *0.8 TB* of data

I have 4 customers that I want to split to their own indices by routing to 
increase performance when querying the whole year of data.
3 customers are similar size, and one is about 2x times larger than others.

When I performed tests (I cleared the cache with _cache/clear before each 
query)

Simple query that I was using for indexes with routing:

-
/*routing*/_search?routing=CUSTOMER1
{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "term": {
                    "Customer.raw": "CUSTOMER1"
                }
            }
        }
    }
}

result: *360 indices read, 18 sec response*

-

and the indexes without routing

/*norouting*/_search
{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "term": {
                    "Customer.raw": "CUSTOMER1"
                }
            }
        }
    }
}

result: *1800 indices read, 7 sec response*

*... no matter which customer I search for, the results are similar: 
searching with routing is always almost 2.5 times slower!*

This does not make any sense ... it should be the opposite, in my opinion.
Any help would be much appreciated! After reading about routing I was 
hoping I could speed up the queries without adding additional 
hardware ... now I'm very sceptical.
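
For what it's worth, the fan-out arithmetic can be sketched like this: a routing value hashes to exactly one shard per index, so a routed search over 360 daily indexes still touches 360 shards (one per index), while the unrouted search over the 5-shard indexes touches all 1800 shards but can parallelize across them. A toy model (CRC32 stands in for Elasticsearch's real murmur3-derived routing hash, purely for illustration):

```python
import zlib

# Simplified shard routing: shard = hash(routing_value) % number_of_shards.
def shard_for(routing_value, num_shards):
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards

routed_shards = 360 * 1    # one shard per daily index when routing is used
unrouted_shards = 360 * 5  # every shard of every 5-shard index otherwise

# Routing is deterministic: the same customer always lands on the same shard.
assert shard_for("CUSTOMER1", 4) == shard_for("CUSTOMER1", 4)
```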



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/eb1ab06d-f738-4b5e-a333-710f3b26b30a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Delete index after backup

2015-02-11 Thread David Pilato
Removing an index does not remove a snapshot.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs



 Le 11 févr. 2015 à 16:07, Yarden Bar ayash.jor...@gmail.com a écrit :
 
 HI all,
 
 Is there any procedure to archive index to HDFS (or other repo) and delete it?
 
 I've looked into the 'Snapshot/Restore' docs and understood that if I execute 
 snapshot - delete index - snapshot the index deletion will get reflected 
 in the snapshot process.
 
 Thanks,
 Yarden
 
 
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearch+unsubscr...@googlegroups.com 
 mailto:elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/2fca7f58-4fc3-475d-92dd-3d06b486b81d%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/2fca7f58-4fc3-475d-92dd-3d06b486b81d%40googlegroups.com?utm_medium=emailutm_source=footer.
 For more options, visit https://groups.google.com/d/optout 
 https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/8F990642-B4CE-430D-80FE-72A218F944ED%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.


Google Cloud click-to-deploy

2015-02-11 Thread Avi Gabay
Hi,
So we are a small startup and we use Google Cloud as our cloud provider.
Just last month Google announced how they've made it easy to deploy 
elasticsearch on their servers with just a few clicks.
I thought that this would be a nice thing to look at, as it could save us the 
hassle of deploying machines by ourselves and doing all the configuration.
So I literally clicked to deploy and got the cluster running, and now I 
want to add more instances to the cluster as I'm expanding, but I cannot 
seem to find anywhere how to do so. Anyone got a clue? I've looked 
everywhere and there's no documentation on this issue.

Thanks! 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/855ff71d-0186-43a3-907e-e20dd15a7935%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Is Unsecured OPTIONS Method A Vulnerability?

2015-02-11 Thread stv . bell07
I've been working lately on a project utilizing ElasticSearch and Kibana. 
To secure the ElasticSearch API I've hidden it behind a reverse proxy.

The proxy uses a cookie to authenticate the request and forward it to the 
ElasticSearch server, but if no cookie is present or if the cookie does not 
validate, then 401 is returned.

Here's the catch. Kibana uses CORS to communicate with ElasticSearch, so 
while I can enable the Kibana HTTP client to use the withCredentials option 
which will include cookies, it only does so for the four CRUD HTTP verbs. 
Glaringly, any OPTIONS requests from Kibana will not include the cookie.

This makes sense on a certain level due to the description of the intended 
purpose for the OPTIONS verb in the HTTP spec.

As such, in order to get my front-end functioning through this reverse 
proxy I've had to white-list all OPTIONS requests. I'm concerned with 
whether or not this could be abused to get commands through to the ES 
server that I otherwise wouldn't want. I trust that Kibana is using the 
verb properly, but if an attacker crafted an OPTIONS request at the server 
with the path /_shutdown, would the Elasticsearch server know that, since this 
is an OPTIONS request, it should ignore anything else in the request? 

Admittedly I'm a bit in the dark about how the ES server receives and 
handles commands over HTTP beyond the typical RESTful functionality. Can anyone 
shed some light?
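
One way to reason about this risk: a CORS preflight only needs an answer from the proxy, not from the backend, so the proxy can respond to OPTIONS itself and never forward it, making the path (`/_shutdown` or anything else) irrelevant. A sketch of that decision logic (a hypothetical helper, not tied to any particular proxy's API):

```python
# Decide what the reverse proxy should do with an incoming request.
def route_request(method, has_valid_cookie):
    if method == "OPTIONS":
        # Answer the preflight locally; never forward OPTIONS to
        # Elasticsearch, so a crafted OPTIONS /_shutdown cannot reach it.
        return "answer_preflight"
    if has_valid_cookie:
        return "forward"       # pass through to Elasticsearch
    return "reject_401"        # unauthenticated CRUD request

assert route_request("OPTIONS", False) == "answer_preflight"  # even for /_shutdown
assert route_request("POST", True) == "forward"
assert route_request("GET", False) == "reject_401"
```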

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3be24cfc-c247-4ab7-9733-e494f527529b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: ClassCast when sorting on a long field (AtomicFieldData$WithOrdinals$1 cannot be cast to AtomicNumericFieldData)

2015-02-11 Thread Eric Wittmann
Also note:  I'm using the java TransportClient to index the documents.  I 
get the ClassCast problem when querying using the TransportClient, but I 
*also* get the error when querying 
using http://localhost:9200/_plugin/marvel/sense/index.html.

fwiw

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/94305ece-3e91-403e-9fd6-44979be74ac5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch CouchDB River plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch CouchDB River 
plugin, version 2.4.2.

The CouchDB River plugin allows you to hook into the CouchDB _changes feed and 
automatically index it into Elasticsearch.

https://github.com/elasticsearch/elasticsearch-river-couchdb/

Release Notes - elasticsearch-river-couchdb - Version 2.4.2


Fix:
 * [80] - Potential NPE when closing a river 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/80)

Update:
 * [90] - [Test] Remove couchdb test database after tests 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/90)
 * [81] - Don't create index when initializing the river 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/pull/81)


Doc:
 * [85] - [Doc] Add test instructions 
(https://github.com/elasticsearch/elasticsearch-river-couchdb/issues/85)


Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-river-couchdb project repository: 
https://github.com/elasticsearch/elasticsearch-river-couchdb/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/54dbc625.2864b40a.1ec6.d5edSMTPIN_ADDED_MISSING%40gmr-mx.google.com.
For more options, visit https://groups.google.com/d/optout.


Using filtered query with a wildcard

2015-02-11 Thread Sagar Shah
Hello everyone,
I came across a few articles that say filtered queries are generally faster.
I need to perform a wildcard search on one of the fields. 
So my question is...
Can I perform a wildcard search on a field combined with a filtered query? If 
so, can anyone please provide an example?

I tried a few combinations of JSON for the request body, but they did not work. 
Example:

Method: POST

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {
                            "wildcard": {
                                "wildcardField": "*Text*"
                            }
                        },
                        {
                            "term": {
                                "messageId": 14406
                            }
                        }
                    ]
                }
            }
        }
    }
}

Appreciate your inputs!

Regards,
Sagar Shah
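
For reference, one common arrangement in the 1.x "filtered" form (a sketch, not a confirmed answer to the post) is to put the wildcard clause in the query section and keep the exact match in the filter section, since filters are cached while a wildcard filter may not be available depending on the version. Field names below are taken from the question:

```python
import json

# Wildcard in the (scored) query part, exact term in the (cacheable) filter.
def wildcard_with_filter(pattern, message_id):
    return {
        "query": {
            "filtered": {
                "query": {"wildcard": {"wildcardField": pattern}},
                "filter": {"term": {"messageId": message_id}},
            }
        }
    }

body = json.dumps(wildcard_with_filter("*Text*", 14406))
```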

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/67eb86d0-1f10-47a0-94e0-b5f6ed166d4d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[ANN] Elasticsearch ICU Analysis plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch ICU Analysis 
plugin, version 2.4.2.

The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, 
adding ICU-related analysis components.

https://github.com/elasticsearch/elasticsearch-analysis-icu/

Release Notes - elasticsearch-analysis-icu - Version 2.4.2



Update:
 * [46] - Update to elasticsearch 1.4.3 / Lucene 4.10.3 
(https://github.com/elasticsearch/elasticsearch-analysis-icu/issues/46)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-icu project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-icu/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Re: Elasticon Agenda

2015-02-11 Thread Mark Walkom
Other than the Welcome Reception, no :)

On 12 February 2015 at 03:04, Glen Smith g...@smithsrock.com wrote:

 The currently published agenda:
 http://www.elasticon.com/apex/Elastic_ON_Agenda
 for Monday, March 9 shows just Registration Opens until 5pm.

 If registration will really be the only official activity, traveling in on
 Monday might be most convenient.
 Are there likely to be any other events added during that time?

 For more options, visit https://groups.google.com/d/optout.




Re: SparkSQL and ElasticSearch not inferring JSON Schema correctly, possible bugs?

2015-02-11 Thread Aris V
Costin Leau saw this on the Spark user mailing list, and I have filed it as
a bug on GitHub:

https://github.com/elasticsearch/elasticsearch-hadoop/issues/377



On Tuesday, February 10, 2015 at 5:18:57 PM UTC-8, Aris V wrote:

 I'm using ElasticSearch with elasticsearch-spark-BUILD-SNAPSHOT and 
 Spark/SparkSQL 1.2.0, from Costin Leau's advice.

 I want to query ElasticSearch for a bunch of JSON documents from within 
 SparkSQL, and then use a SQL query to simply query for a column, which is 
 actually a JSON key -- normal things that SparkSQL does using the 
 SQLContext.jsonFile(filePath) facility. The difference I am using the 
 ElasticSearch container.

 The big problem: when I do something like 

 SELECT jsonKeyA from tempTable;

 I actually get the WRONG KEY out of the JSON documents! I discovered that 
 if I have JSON keys physically in the order D, C, B, A in the json 
 documents, the elastic search connector discovers those keys BUT then sorts 
 them alphabetically as A,B,C,D - so when I SELECT A from tempTable, I 
 actually get column D (because the physical JSONs had key D in the first 
 position). This only happens when reading from elasticsearch and SparkSQL.

 It gets much worse: When a key is missing from one of the documents and 
 that key should be NULL, the whole application actually crashes and gives 
 me a java.lang.IndexOutOfBoundsException -- the schema that is inferred is 
 totally screwed up. 

 In the above example with physical JSONs containing keys in the order 
 D,C,B,A, if one of the JSON documents is missing the key/column I am 
 querying for, I get that java.lang.IndexOutOfBoundsException exception.

 I am using the BUILD-SNAPSHOT because otherwise I couldn't build the 
 elasticsearch-spark project, Costin said so.

 Any clues here? Any fixes?

 Aris




ClassCast when sorting on a long field (AtomicFieldData$WithOrdinals$1 cannot be cast to AtomicNumericFieldData)

2015-02-11 Thread Eric Wittmann
This problem is really strange because my data has two long fields, and my 
search/filter works for one of them but not the other.

Even weirder, I tried to create a curl reproducer for this with the exact 
same data and that works fine!

Details here:

https://gist.github.com/EricWittmann/86864fd897a6f7496fd4

That shows the mapping of the auditEntry document type.  It then shows two 
queries, the first using id as the sort.  This query fails with:

QueryPhaseExecutionException[[apiman_manager][0]: 
query[ConstantScore(cache(_type:auditEntry))],from[0],size[20],sort[custom:
\id\: 
org.elasticsearch.index.fielddata.fieldcomparator.LongValuesComparatorSource@730c1326!]:
 
Query Failed [Failed to execute main query]]; nested: 
ClassCastException[org.elasticsearch.index.fielddata.AtomicFieldData$WithOrdinals$1
 
cannot be cast to org.elasticsearch.index.fielddata.AtomicNumericFieldData];


The second is the same query but sorting on createdOn.  This query seems 
to work fine.

Any help appreciated!

-Eric

PS: Originally found this using ES 1.3.2, but also tested using 1.3.8.



How to structure posts with comments for a blog?

2015-02-11 Thread Andrew Axelrod


What's the best way to structure a post/comment system using elasticsearch?
I'm using elasticsearch as a secondary database.

There would be a post with a multithreaded commenting system, maybe two 
levels deep. 
Each post can have up to 500-1000 comments. 
There will be incremental counters for both likes and comments for each 
comment and post.  This means a lot of indexing.

Is it better to go with nested objects OR parent/child OR something else?
Any advice on structure and how often to update Elasticsearch would be
most appreciated.
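
For comparison, a sketch of the parent/child option, with hypothetical post
and comment types: each comment is its own document, so frequent like-counter
updates reindex only that comment, never the whole post.

```json
{
  "mappings": {
    "post": {
      "properties": {
        "title": { "type": "string" },
        "likes": { "type": "long" }
      }
    },
    "comment": {
      "_parent": { "type": "post" },
      "properties": {
        "body":  { "type": "string" },
        "likes": { "type": "long" }
      }
    }
  }
}
```

Comments are then indexed with ?parent=<postId> and joined at query time via
has_parent/has_child; nested objects would instead reindex the entire post
document on every counter update.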

Thanks,

Andrew



[ANN] Elasticsearch Stempel (Polish) Analysis plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Stempel (Polish) 
Analysis plugin, version 2.4.2.

The Stempel (Polish) Analysis plugin integrates the Lucene Stempel (Polish)
analysis module into elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-stempel/

Release Notes - elasticsearch-analysis-stempel - Version 2.4.2



Update:
 * [37] - Update to elasticsearch 1.4.3 / Lucene 4.10.3 
(https://github.com/elasticsearch/elasticsearch-analysis-stempel/issues/37)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-stempel project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-stempel/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Re: is there any data size limit for backup and restore?

2015-02-11 Thread Mark Walkom
Yes, that is not a problem.
There is no limit to the size of the backup.
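
For reference, a minimal snapshot sketch, with a hypothetical repository name
and path. First register a filesystem repository (Method: PUT
/_snapshot/my_backup):

```json
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups/my_backup",
    "compress": true
  }
}
```

Then PUT /_snapshot/my_backup/snapshot_1 takes the snapshot (add
wait_for_completion=true to block until it finishes), and POST
/_snapshot/my_backup/snapshot_1/_restore restores it. Snapshots are
incremental, so later snapshots copy only new segments.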

On 12 February 2015 at 03:17, Subbarao Kondragunta subbu2perso...@gmail.com
 wrote:


 I have one index which is more than 10 GB in size.
 But the total size of all indices, with all documents, is more than 50 GB.
 Does Elasticsearch allow taking a backup and restoring this much data?

 For more options, visit https://groups.google.com/d/optout.




Re: Using filtered query with a wildcard

2015-02-11 Thread David Pilato
Not answering your question, but...

If you care about performance, don't use wildcard queries; prefer ngram
analyzers with match queries instead.

That will be far more efficient.

In particular here, a wildcard beginning with a star is the worst-case
scenario performance-wise.
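
A sketch of that approach, with hypothetical analyzer and field names: the
index settings define a trigram analyzer, the field uses it, and a plain
match query then replaces the wildcard:

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "trigram_tokenizer": {
          "type": "nGram",
          "min_gram": 3,
          "max_gram": 3
        }
      },
      "analyzer": {
        "trigram_analyzer": {
          "type": "custom",
          "tokenizer": "trigram_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "message": {
      "properties": {
        "wildcardField": {
          "type": "string",
          "analyzer": "trigram_analyzer"
        }
      }
    }
  }
}
```

A match query for "text" then hits any document whose field contains those
trigrams, using the inverted index instead of scanning every term.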


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
https://twitter.com/elasticsearchfr | @scrutmydocs 
https://twitter.com/scrutmydocs



 On 11 Feb 2015 at 21:03, Sagar Shah sagarshah1...@gmail.com wrote:
 
 Hello everyone,
 I came across a few articles that say filtered queries are generally faster.
 I need to perform a wildcard search on one of the fields.
 So my question is...
 Can I perform a wildcard search on the field along with a filtered query? If so,
 can anyone please provide an example?
 
 I tried a few combinations of JSON for the request body, but they did not work.
 Example:
 
 Method: POST
 
 {
   "query": {
     "filtered": {
       "filter": {
         "bool": {
           "must": [
             {
               "wildcard": {
                 "wildcardField": "*Text*"
               }
             },
             {
               "term": {
                 "messageId": 14406
               }
             }
           ]
         }
       }
     }
   }
 }
 
 Appreciate your inputs!
 
 Regards,
 Sagar Shah
 



[ANN] Elasticsearch Twitter River plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Twitter River 
plugin, version 2.4.2.

The Twitter river indexes the public Twitter stream, aka the hose, and makes it
searchable.

https://github.com/elasticsearch/elasticsearch-river-twitter/

Release Notes - elasticsearch-river-twitter - Version 2.4.2


Fix:
 * [96] - Elasticsearch 1.4.0 hangs on running twitter river 
(https://github.com/elasticsearch/elasticsearch-river-twitter/pull/96)

Update:
 * [98] - [Test] Tweet for user stream sent too early 
(https://github.com/elasticsearch/elasticsearch-river-twitter/issues/98)
 * [90] - Remove explicit op_type=create 
(https://github.com/elasticsearch/elasticsearch-river-twitter/pull/90)
 * [89] - Put mapping only if river does not exist 
(https://github.com/elasticsearch/elasticsearch-river-twitter/pull/89)

New:
 * [92] - Add retry_after option 
(https://github.com/elasticsearch/elasticsearch-river-twitter/pull/92)

Doc:
 * [87] - Add restart after install line 
(https://github.com/elasticsearch/elasticsearch-river-twitter/pull/87)
 * [75] - Docs: add sample tweet in docs 
(https://github.com/elasticsearch/elasticsearch-river-twitter/issues/75)


Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-river-twitter project repository: 
https://github.com/elasticsearch/elasticsearch-river-twitter/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



[ANN] Elasticsearch Japanese (kuromoji) Analysis plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Japanese (kuromoji) 
Analysis plugin, version 2.4.2.

The Japanese (kuromoji) Analysis plugin integrates the Lucene kuromoji analysis
module into elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/

Release Notes - elasticsearch-analysis-kuromoji - Version 2.4.2



Update:
 * [53] - Update to elasticsearch 1.4.3 / Lucene 4.10.3 
(https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/issues/53)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-kuromoji project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-kuromoji/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



[ANN] Elasticsearch Phonetic Analysis plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Phonetic Analysis 
plugin, version 2.4.2.

The Phonetic Analysis plugin integrates phonetic token filter analysis with
elasticsearch.

https://github.com/elasticsearch/elasticsearch-analysis-phonetic/

Release Notes - elasticsearch-analysis-phonetic - Version 2.4.2



Update:
 * [37] - Update to elasticsearch 1.4.3 / Lucene 4.10.3 
(https://github.com/elasticsearch/elasticsearch-analysis-phonetic/issues/37)




Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-analysis-phonetic project repository: 
https://github.com/elasticsearch/elasticsearch-analysis-phonetic/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Searching IP Address Range Using CIDR Mask

2015-02-11 Thread Ed Brown
Hi, 
I have an ip type defined in my index mapping. I would like to search the
IP addresses using CIDR notation. I see the IP range aggregation supports
CIDR masks, but nothing else.
Is what I want to do currently possible?
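
One workaround, as a sketch with a hypothetical client_ip field: expand the
CIDR block into its first and last addresses and use a range filter, which
the ip type supports (e.g. 10.0.0.0/24 spans 10.0.0.0 through 10.0.0.255):

```json
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "client_ip": {
            "gte": "10.0.0.0",
            "lte": "10.0.0.255"
          }
        }
      }
    }
  }
}
```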

Thanks.
---
Ed Brown
Software Architect



Hibernate Search with Elasticsearch

2015-02-11 Thread William Yang
Hi everyone, 

I am new to the forum and I am also new to ElasticSearch! 
However for this project, I am trying to replace HibernateSearch with 
ElasticSearch. I am using Spring Framework, and I have also taken a look at 
both Spring-ElasticSearch and Spring-Data-ElasticSearch. However, I am 
still new to using these open source programs and don't really know where I 
should start. I'd appreciate it if someone could give me a push in the 
right direction on how to incorporate ElasticSearch into my code. Thank 
you! 
 
(I posted this on the archives so if this pops up twice I'm sorry!! I 
didn't understand how this forum worked)



[ANN] Elasticsearch Mapper Attachment plugin 2.4.2 released

2015-02-11 Thread Elasticsearch Team
Heya,


We are pleased to announce the release of the Elasticsearch Mapper Attachment 
plugin, version 2.4.2.

The mapper attachments plugin adds the attachment type to Elasticsearch using
Apache Tika.

https://github.com/elasticsearch/elasticsearch-mapper-attachments/

Release Notes - elasticsearch-mapper-attachments - Version 2.4.2



Update:
 * [94] - Upgrade Tika to 1.7 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/94)

New:
 * [99] - [test] Add standalone runner 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/99)

Doc:
 * [97] - [Doc] copy_to using attachment field type 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/97)


Issues, Pull requests, Feature requests are warmly welcome on 
elasticsearch-mapper-attachments project repository: 
https://github.com/elasticsearch/elasticsearch-mapper-attachments/
For questions or comments around this plugin, feel free to use elasticsearch 
mailing list: https://groups.google.com/forum/#!forum/elasticsearch

Enjoy,

-The Elasticsearch team



Is there a security mailing list?

2015-02-11 Thread David Reagan
Is there a security announcement list for ELK somewhere? 

Something like what Ubuntu sends out when they patch security 
vulnerabilities. 

I don't see one listed on http://www.elasticsearch.org/community/

The list would have sent out emails about the issues listed 
on http://www.elasticsearch.org/community/security/

Hmm... Looking at the security page, I think I remember getting emailed 
about the latest CVE. But I don't remember where the email came from...



Re: Is there a security mailing list?

2015-02-11 Thread Mark Walkom
We do not have a security list. We do notify our direct customers, as well
as blogging on our site and tweeting, and we also register a CVE, which
goes out via the usual channels.

Moving forward we'll also post to the lists; apologies for not doing so for
this latest CVE.

On 12 February 2015 at 09:40, David Reagan jer...@gmail.com wrote:

 Is there a security announcement list for ELK somewhere?

 Something like what Ubuntu sends out when they patch security
 vulnerabilities.

 I don't see one listed on http://www.elasticsearch.org/community/

 The list would have sent out emails about the issues listed on
 http://www.elasticsearch.org/community/security/

 Hmm... Looking at the security page, I think I remember getting emailed
 about the latest cve. But I don't remember where the email came from...





Re: Async Replication and Multi DC

2015-02-11 Thread Parag Shah
Here is what I felt I could do with shard allocation filtering:

Use the following properties at the node level in each DC:

In DC 1:

Node 1:

node.group1 = xxx
node.group2 = yyy
node.group4 = zzz

Node 2:

node.group2 = yyy
node.group3 = zzz
node.group5 = bbb

Node 3:

node.group1 = xxx
node.group2 = zzz
node.group6 = ccc


In DC 2:

Node 4:

node.group1 = xxx
node.group4 = aaa
node.group6 = ccc

Node 5: 

node.group2 = yyy
node.group5 = bbb
node.group6 = ccc

Node 6:

node.group3 = zzz
node.group4 = aaa
node.group5 = bbb


All clients in DC1 would send requests with setting:

index.routing.allocation.include.group1 = xxx
OR 
index.routing.allocation.include.group2 = yyy
OR 
index.routing.allocation.include.group3 = zzz

in round robin fashion

Similarly clients in DC 2 would send requests with setting:

index.routing.allocation.include.group4 = aaa
OR
index.routing.allocation.include.group5 = bbb
OR 
index.routing.allocation.include.group6 = ccc

again in round robin fashion, so as to distribute the load evenly across 
the nodes.

Will this work or should I just do snapshot/restore across DCs?
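
For reference, the allocation filters above are applied as dynamic index
settings (Method: PUT /index_name/_settings), matching the node.group*
attributes set in each node's elasticsearch.yml:

```json
{
  "index.routing.allocation.include.group1": "xxx"
}
```

Note that this controls where shards are allocated, not where client requests
are routed; reads and writes still go to whichever nodes hold the shards.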

Regards
Parag

On Wednesday, February 11, 2015 at 4:56:08 PM UTC-8, Parag Shah wrote:

 Hi all,

  I am trying to setup async replication with a multi-dc setup. Here is 
 my scenario:

 1. I will have 3 nodes in each DC. For now, assume I have only 2 DCs. 
 2. I would like to use async replication, such that it writes 
 synchronously to the primary shard, but writes to only one node 
 asynchronously in the remote DC.
 3. I would like to have 2 replicas. So, 3 copies in all.

 I have looked at the cluster awareness settings here 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html.
  
 However, I am not sure how it will work in practice, because if I simply 
 set async replication on each indexing request, it may choose to 
 replicate all 3 copies locally (including the primary) or even worse, 1 
 copy locally and 2 copies in the remote DC.

 If I use the dc as the cluster awareness attribute, it will create only 1 
 copy in the local DC and another one in the remote DC. That won't satisfy 
 my 3-copies requirement.

 If I create an additional attribute which is zone like this:

 cluster.routing.allocation.awareness.attributes: dc, zone

 and divide my DC into zone 1 (1 node?) and zone 2 (2 nodes?) and 
 vice-versa in the remote DC,  I might be able to distribute the load 
 between zones in a DC and between DCs for the 3rd copy. However, there is 
 still the possibility that 2 of it's replica shards might be allocated in 
 the remote DC, which will prevent me from obtaining local (DC) quorums.

 Can someone shed light on how shard allocation works when you have more 
 than one cluster allocation attribute?

 Also, if there is some known best practice for doing Async Replication 
 with MultiDC such that local quorums are always possible for reads/writes, 
 it will be greatly appreciated. In summary, I am looking for 3 copies 
 including primary. 2 in the local DC and 1 in the remote DC and want to 
 write synchronously only to the primary shard.

 Thanks in advance,
 Parag




Async Replication and Multi DC

2015-02-11 Thread Parag Shah
Hi all,

 I am trying to set up async replication with a multi-DC setup. Here is 
my scenario:

1. I will have 3 nodes in each DC. For now, assume I have only 2 DCs. 
2. I would like to use async replication, such that it writes synchronously 
to the primary shard, but writes to only one node asynchronously in the 
remote DC.
3. I would like to have 2 replicas. So, 3 copies in all.

I have looked at the cluster awareness settings here 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html.
 
However, I am not sure how it will work in practice, because if I simply 
set async replication on each indexing request, it may choose to 
replicate all 3 copies locally (including the primary) or even worse, 1 
copy locally and 2 copies in the remote DC.

If I use the dc as the cluster awareness attribute, it will create only 1 
copy in the local DC and another one in the remote DC. That won't satisfy 
my 3-copies requirement.

If I create an additional attribute which is zone like this:

cluster.routing.allocation.awareness.attributes: dc, zone

and divide my DC into zone 1 (1 node?) and zone 2 (2 nodes?) and vice-versa 
in the remote DC,  I might be able to distribute the load between zones in 
a DC and between DCs for the 3rd copy. However, there is still the 
possibility that 2 of its replica shards might be allocated in the remote 
DC, which will prevent me from obtaining local (DC) quorums.

Can someone shed light on how shard allocation works when you have more 
than one cluster allocation attribute?

Also, if there is some known best practice for doing Async Replication with 
MultiDC such that local quorums are always possible for reads/writes, it 
will be greatly appreciated. In summary, I am looking for 3 copies 
including primary. 2 in the local DC and 1 in the remote DC and want to 
write synchronously only to the primary shard.

Thanks in advance,
Parag



Re: Async Replication and Multi DC

2015-02-11 Thread Mark Walkom
Do not set up cross-DC clusters; it's not recommended and ends up providing
more headaches than it is worth.

You are better off replicating your data using either snapshot+restore or
sending it to each individual cluster through your code.

On 12 February 2015 at 11:56, Parag Shah par...@gmail.com wrote:

 Hi all,

  I am trying to setup async replication with a multi-dc setup. Here is
 my scenario:

 1. I will have 3 nodes in each DC. For now, assume I have only 2 DCs.
 2. I would like to use async replication, such that it writes
 synchronously to the primary shard, but writes to only one node
 asynchronously in the remote DC.
 3. I would like to have 2 replicas. So, 3 copies in all.

 I have looked at the cluster awareness settings here
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html.
 However, I am not sure how it will work in practice, because if I simply
 set async replication on each indexing request, it may choose to
 replicate all 3 copies locally (including the primary) or even worse, 1
 copy locally and 2 copies in the remote DC.

 If I use the dc as the cluster awareness attribute, it will create only 1
 copy in the local DC and another one in the remote DC. That won't satisfy
 my 3-copies requirement.

 If I create an additional attribute which is zone like this:

 cluster.routing.allocation.awareness.attributes: dc, zone

 and divide my DC into zone 1 (1 node?) and zone 2 (2 nodes?) and
 vice-versa in the remote DC,  I might be able to distribute the load
 between zones in a DC and between DCs for the 3rd copy. However, there is
 still the possibility that 2 of its replica shards might be allocated in
 the remote DC, which will prevent me from obtaining local (DC) quorums.

 Can someone shed light on how shard allocation works when you have more
 cluster allocation attributes than 1?

 Also, if there is some known best practice for doing Async Replication
 with MultiDC such that local quorums are always possible for reads/writes,
 it will be greatly appreciated. In summary, I am looking for 3 copies
 including primary. 2 in the local DC and 1 in the remote DC and want to
 write synchronously only to the primary shard.

 Thanks in advance,
 Parag





Re: Using filtered query with a wildcard

2015-02-11 Thread Sagar Shah
Thank you David.
I will evaluate and look into ngram analyzer to see how it can help on the
specific scenario.
Appreciate your help!
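For anyone following along, the ngram approach David suggests looks roughly like this in 1.x-era syntax (a sketch; the index, field, analyzer names, and gram sizes are illustrative, not from the thread):

```
PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_ngrams": { "type": "nGram", "min_gram": 3, "max_gram": 4 }
      },
      "analyzer": {
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_ngrams"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "wildcardField": { "type": "string", "analyzer": "ngram_analyzer" }
      }
    }
  }
}

GET my_index/_search
{
  "query": { "match": { "wildcardField": "Text" } }
}
```

The match query then hits the indexed ngrams directly instead of scanning terms the way a leading-wildcard query does.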

On Thu, Feb 12, 2015 at 2:26 AM, David Pilato da...@pilato.fr wrote:

 Not answering to your question, but...

 If you care about performance, don't use wildcard queries; prefer ngram
 analyzers with match queries instead.

 That will be really more efficient.

 In particular here, a wildcard beginning with a star is the worst-case
 scenario performance-wise.


 --
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr
 https://twitter.com/elasticsearchfr | @scrutmydocs
 https://twitter.com/scrutmydocs



 Le 11 févr. 2015 à 21:03, Sagar Shah sagarshah1...@gmail.com a écrit :

 Hello everyone,
 I came across few articles that says filtered query are generally faster.
 I need to perform some wildcard search on one of the fields.
 So my questions is...
 Can I perform wildcard search on the field along with filtered query? If
 so, can anyone please provide an example?

 I tried few combinations of JSON for request body, but that did not work.
 Example:

 Method:POST

 {
   "query": {
     "filtered": {
       "filter": {
         "bool": {
           "must": [
             { "wildcard": { "wildcardField": "*Text*" } },
             { "term": { "messageId": 14406 } }
           ]
         }
       }
     }
   }
 }

 Appreciate your inputs!

 Regards,
 Sagar Shah







-- 

Regards,
Sagar Shah

Too many people think more of security instead of opportunity. They seem
more afraid of life than death!!!



Re: Hibernate Search with Elasticsearch

2015-02-11 Thread William Yang
Oh, I will be using MySQL in the future; right now I'm building it on top of
AppFuse (a website that lets you log in, add users, and search for users).
So I guess, yes, right now I am only using Hibernate Search. What should I
do about the PostgreSQL part until I get to using MySQL?

And yes, when I said "run in Spring", I meant how to build it in the Spring
Framework... there was an error (that I don't remember right now) that
wouldn't let me compile because a dependency was missing.

On Wed, Feb 11, 2015 at 10:25 PM, David Pilato da...@pilato.fr wrote:

 PostgreSQL is a database. I guess you are using Hibernate with a database,
 right?
 Unless you are using only Hibernate search?

 What do you mean by "run in Spring"?
 Spring is the framework, right?

 mvn jetty:run should work fine

 David

 Le 12 févr. 2015 à 07:11, William Yang williamyan...@gmail.com a écrit :

 Hey David,

 Thanks for responding. I found that link too and tried to follow it, but I
 got confused about what PostgreSQL is. If I don't know what it is, where do
 I find it?
 Also I think I had some problems with a common dependency, I don't have
 the exact error code on me right now, but there was something wrong with
 the dependency.

 What would I put as the goal when I try to run it in spring? clean
 install, or just jetty:run?

 On Wed, Feb 11, 2015 at 10:04 PM, David Pilato da...@pilato.fr wrote:

 Have a look here. https://github.com/dadoonet/legacy-search

 It's an Hibernate demo project which uses Spring as well.
 If you navigate through branches, it will show you how to add
 elasticsearch.

 I wouldn't try to connect Hibernate Search and replace its Lucene
 implementation with Elasticsearch.

 My 2 cents
 David

 Le 12 févr. 2015 à 00:18, William Yang williamyan...@gmail.com a
 écrit :

 Hi everyone,

 I am new to the forum and I am also new to ElasticSearch!
 However for this project, I am trying to replace HibernateSearch with
 ElasticSearch. I am using Spring Framework, and I have also taken a look
 at both Spring-ElasticSearch and Spring-Data-ElasticSearch. However, I
 am still new to using these open source programs and don't really know
 where I should start. I'd appreciate it if someone could give me a push in
 the right direction on how to incorporate ElasticSearch into my code.
 Thank you!

 (I posted this on the archives so if this pops up twice I'm sorry!! I
 didn't understand how this forum worked)








Re: Elastic Search - Sentiment analysis

2015-02-11 Thread Ap
Hi Jürgen,

Thanks for the reply. Can you tell me how the analysis can be done to
identify that a document is a relevant financial document, say based on
terms or phrases?

On Wednesday, February 11, 2015 at 2:03:20 AM UTC-8, Ap wrote:

 How do I identify relevant financial documents based on the 
 terms/phrases/sentiments present inside the Document ?  

 eg. Relevant document might contain-- Wall Street hits a new High.
    Irrelevant document might contain -- I was walking on Wall Street 
 and met an old friend.






Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?

2015-02-11 Thread Martin Widmer
Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?
How?

Thanks for your advice.

Martin



Re: Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?

2015-02-11 Thread Itamar Syn-Hershko
Try using assembly redirects; if that doesn't work, it means no...
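For reference, an assembly binding redirect lives in app.config/web.config and usually has this shape (a sketch; the publicKeyToken shown is the one Json.NET normally ships with, but verify it against your installed package):

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json"
                          publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
        <!-- route requests for the 6.0.0.0 assembly to the installed 4.5.0.0 one -->
        <bindingRedirect oldVersion="6.0.0.0" newVersion="4.5.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

Note that `assemblyIdentity` is self-closing and `bindingRedirect` sits beside it, not inside it.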

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer  Consultant
Lucene.NET committer and PMC member

On Wed, Feb 11, 2015 at 12:38 PM, Martin Widmer swissm...@gmail.com wrote:

 Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?
 How?

 Thanks for your advice.

 Martin





Re: ES JAVA API | QueryBuilders.java | FilterBuilders does not have nullable queryBuilder as arguement

2015-02-11 Thread piyush goyal

Ok, that helps a lot. Still, since ES internally treats the missing query as
a match_all query, I do feel the method definition should accept a @Nullable
query as well.
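For reference, calling filteredQuery(matchAllQuery(), filter) from the Java API boils down to the same REST body as writing the match_all out explicitly (a sketch, reusing the field from this thread's example):

```
{
  "query": {
    "filtered": {
      "query":  { "match_all": {} },
      "filter": { "term": { "response_timestamp": "2015-01-01" } }
    }
  },
  "size": 0
}
```

Since match_all gives every filtered document a constant score of 1, no meaningful scoring work is done.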

Thanks once again for helping out. :)

Regards
Piyush


On Wednesday, 11 February 2015 13:29:32 UTC+5:30, David Pilato wrote:

 Actually in a filtered query, filters are applied first.
 The match all query then only said that all filtered documents match.


 David

 Le 11 févr. 2015 à 08:30, piyush goyal coolpi...@gmail.com javascript: 
 a écrit :

 Ahh.. Now I got you.

 But does that mean that if my REST query through Sense has no query part
 and only a filter part, ES adds a match_all query to the query part by
 default? One more question; it might be the wrong question, but just to
 clear up my doubts.

 So now this query will do two separate things:

 1.) What a normal match all query does.
 2.) Whatever filter operations I have specified, the query will apply
 those as well.

 And then ES picks up the intersection of two. Am I right?

 Regards
 Piyush

 On Wednesday, 11 February 2015 12:25:43 UTC+5:30, David Pilato wrote:

 I think I answered.
 This is what is done by default in REST: 
 https://github.com/elasticsearch/elasticsearch/blob/1816951b6b0320e7a011436c7c7519ec2bfabc6e/src/main/java/org/elasticsearch/index/query/FilteredQueryParser.java#L54

 David

 Le 11 févr. 2015 à 07:46, piyush goyal coolpi...@gmail.com a écrit :

 Hi Folks,

 Any inputs?

 Regards
 Piyush

 On Tuesday, 10 February 2015 16:23:43 UTC+5:30, piyush goyal wrote:

 I don't need the query part. All I need is a filter.

 Not sure how matchAllQuery will help.

 On Tuesday, 10 February 2015 16:14:51 UTC+5:30, David Pilato wrote:

 Use a matchAllQuery for the query part.
 All scores will be set to 1.

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 Le 10 févr. 2015 à 11:24, piyush goyal coolpi...@gmail.com a écrit :

 Hi All,

 If I try to write a filtered query through sense, it allows me to just 
 add a filter and query is not a mandatory field. For example:
 {
   "query": {
     "filtered": {
       "filter": {
         "term": {
           "response_timestamp": "2015-01-01"
         }
       }
     }
   }
 }

 is a valid query through sense. However, if I try to implement the same 
 through JAVA API, I have to use the abstract class QueryBuilders.java and 
 its method:

  filteredQuery(QueryBuilder queryBuilder, @Nullable FilterBuilder 
 filterBuilder)



 Please note that here FilterBuilder argument is nullable and 
 QueryBuilder argument is not. Which means that eventually I have to write 
 a 
 query inside the Filtered part. If this correct, then how can I write a 
 complete query with aggregations such that I don't want any score to be 
 calculated and the response time is faster?

 Regards
 Piyush







Transport Error 400 - Failed to derive xcontent

2015-02-11 Thread James
Hi all,

I'm trying to run a Scrapy crawler to feed Elasticsearch, but I keep getting 
the error below and I can't work out how to begin debugging it.

File 
"/var/www/app/venv/local/lib/python2.7/site-packages/elasticsearch/connection/base.py",
line 97, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, 
error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'Failed to 
derive xcontent from 
org.elasticsearch.common.bytes.ChannelBufferBytesReference@a')

Can anyone help me on my way to understand what is going wrong and how I 
might start to debug it? 

Any help would be appreciated.

James




finding child documents without parents

2015-02-11 Thread Ron Sher
We have child documents and we suspect that some of them don't have their 
parents. Can someone suggest a query I can use to find child documents without 
parents? 
Thanks,
Ron
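One possible approach (a sketch, untested; "my_parent" is a placeholder for your actual parent type): run a search against the child type with a has_parent filter wrapped in not, so that only children whose parent document is missing match. Be aware that has_parent loads parent/child ID maps into memory, so this can be expensive on large indices.

```
GET my_index/my_child/_search
{
  "query": {
    "filtered": {
      "filter": {
        "not": {
          "has_parent": {
            "parent_type": "my_parent",
            "query": { "match_all": {} }
          }
        }
      }
    }
  }
}
```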



Re: Large results sets and paging for Aggregations

2015-02-11 Thread Mark Harwood
5k doesn't sound too scary.

Think of the aggs tree like a bean machine [1]: one of those wooden boards 
with pins arranged on it like a Christmas tree, where you drop balls at 
the top of the board and they rattle down a choice of paths to the bottom.
In the case of aggs, your buckets are the pins and the documents are the balls.

The memory requirement for processing the agg tree is typically the number 
of pins, not the number of balls you drop into the tree as these just fall 
out of the bottom of the tree.
So in your case it is 5k members multiplied by 12 months each = 60k unique 
buckets, each of which will maintain a counter of how many docs pass 
through that point. So you could pass millions or billions of docs through 
and the working memory requirement for the query would be the same.
There is, however, a fixed overhead for all queries which *is* a function of 
the number of docs: the field data cache required to hold the 
dates/member IDs in RAM. If this becomes a problem, you may want to 
look at the on-disk alternative structure in the form of doc values.
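For example, doc values can be enabled per field at mapping time (1.x syntax; a sketch using the field names from this thread, and note that string fields must be not_analyzed for doc values):

```
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "member_id": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        },
        "response_timestamp": {
          "type": "date",
          "doc_values": true
        }
      }
    }
  }
}
```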

Hope that helps.

[1] http://en.wikipedia.org/wiki/Bean_machine

On Wednesday, February 11, 2015 at 7:04:04 AM UTC, piyush goyal wrote:

 Hi Mark,

 Before getting into queries, here is a little bit of info about the project:

 1.) A community where members keep on increasing, decreasing and changing. 
 Maintained in a different type.
 2.) Approximately 3K to 4K documents of data of each user inserted into ES 
 per month in a different type maintained by member ID.
 3.) Mapping is flat, there are no nested and array type of data.

 Requirement:

 Here is a sample requirement:

 1.) Getting a report, for each member ID, of the count of data for the
 last three months.
 2.) Query used to get the data is:

 {
   "query": {
     "constant_score": {
       "filter": {
         "bool": {
           "must": [
             { "term": { "datatype": "XYZ" } },
             {
               "range": {
                 "response_timestamp": {
                   "from": "2014-11-01",
                   "to": "2015-01-31"
                 }
               }
             }
           ]
         }
       }
     }
   },
   "aggs": {
     "memberIDAggs": {
       "terms": { "field": "member_id", "size": 0 },
       "aggs": {
         "dateHistAggs": {
           "date_histogram": {
             "field": "response_timestamp",
             "interval": "month"
           }
         }
       }
     }
   },
   "size": 0
 }

 The current member count is approximately 1K and will grow to 5K over the
 next 10 months, so roughly 5K * 4K * 3 documents would feed this
 aggregation, which I guess is a major hit on the system. And this is only
 two levels of aggregation; the next requirement from our analysts is to
 split each month's data into three different categories.

 What is the optimum solution to this problem?

 Regards
 Piyush

 On Tuesday, 10 February 2015 16:15:22 UTC+5:30, Mark Harwood wrote:

 these kind of queries are hit more for qualitative analysis.

 Do you have any example queries? The pay as you go summarisation need 
 not be about just maintaining quantities.  In the demo here [1] I derive 
 profile names for people, categorizing them as newbies, fanboys or 
 haters based on a history of their reviewing behaviours in a marketplace. 

 By the way, are there any other strategies suggested by ES for these 
 kind of scenarios?

 Igor hit on one which is to use some criteria eg. date to limit the 
 volume of what you analyze in any one query request.

 [1] 
 http://www.elasticsearch.org/videos/entity-centric-indexing-london-meetup-sep-2014/



 On Tuesday, February 10, 2015 at 10:05:24 AM UTC, piyush goyal wrote:

 Thanks Mark. Your suggestion of pay-as-you-go seems amazing. But
 considering the dynamics of the application, these kinds of queries are hit
 more for qualitative analysis. There are hundreds of such queries (I am not
 exaggerating) being hit daily by our analytics team. Keeping counts for all
 those qualitative checks daily and maintaining them as documents is a
 headache in itself, and additions, updates, and removals of those documents
 would cause us huge maintenance overhead. Hence I was thinking of getting
 pagination on aggregations, which would definitely help keep ES memory
 pressure under control.

 By the way, are there any other strategies suggested by ES for these
 kinds of scenarios?

 Thanks

 On Tuesday, 10 February 2015 15:20:40 UTC+5:30, Mark Harwood wrote:

  Why can't aggs be based on shard based calculations 

 They are. The shard_size setting will determine how many member 
 *summaries* will be returned from each shard - we won't stream each 
 member's thousands of related records back to a centralized point to 
 compute a final result. The final step is to summarise the summaries from 
 each shard.

  if the number of members keep on increasing, day by day ES has to 
 keep more and more data into memory to calculate the aggs

 This is a 

Elastic Search - Sentiment analysis

2015-02-11 Thread Ap
How do I identify relevant financial documents based on the 
terms/phrases/sentiments present inside the Document ?  

eg. Relevant document might contain-- Wall Street hits a new High.
  Irrelevant document might contain -- I was walking on Wall Street and 
met an old friend.




Re: Elastic Search - Sentiment analysis

2015-02-11 Thread Jürgen Wagner (DVT)
Well, you use a linguistic extraction tool in the processing pipeline to
yield the information you need. It will be indexed together with the
document and may be queried, aggregated, etc.

On 11.02.2015 11:03, Ap wrote:
 How do I identify relevant financial documents based on the
 terms/phrases/sentiments present inside the Document ?  

 eg. Relevant document might contain-- Wall Street hits a new High.
   Irrelevant document might contain -- I was walking on Wall
 Street and met an old friend.




-- 

Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
уважением
*i.A. Jürgen Wagner*
Head of Competence Center Intelligence
 Senior Cloud Consultant

Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
E-Mail: juergen.wag...@devoteam.com
mailto:juergen.wag...@devoteam.com, URL: www.devoteam.de
http://www.devoteam.de/


Managing Board: Jürgen Hatzipantelis (CEO)
Address of Record: 64331 Weiterstadt, Germany; Commercial Register:
Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071



Re: How to recover saved kibana dashboard

2015-02-11 Thread bharath singh
Hi,
I accidentally deleted kibana-int. Now I am not able to save my dashboard; 
it shows "Save failed. Dashboard cannot be saved to Elasticsearch". In 
the logs I found this:

[2015-02-11 192.168.xx.xx,322][DEBUG][action.admin.indices.create] [Tribe 
Chief] no known master node, scheduling a retry
[2015-02-11 192.168.xx.xx,323][DEBUG][action.admin.indices.create] [Tribe 
Chief] observer: timeout notification from cluster service. timeout setting 
[1m], time since start [1m] 

Please help

Regards,
Bharath Singh

On Friday, December 6, 2013 at 12:11:19 AM UTC+5:30, Alexander Gray II 
wrote:

 Quick Question:
 I saved a few Kibana3 dashboards via the kibana url, but I don't know 
 where they are saved.  I don't think they are saved on disk, so they must 
 be saved in Elasticsearch, right?

 So my root problem is that I saved a few kibana dashboards, and then I had 
 to restart my cluster and now my dashboards are all gone.

 How do I recover them?

 Thanks,
 Alex




Re: Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?

2015-02-11 Thread Martin Widmer
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json"
                          publicKeyToken="30ad4fe6b2a6aeed" culture="neutral">
          <publisherPolicy apply="no" />
          <!--
          <bindingRedirect oldVersion="6.0.0.0" newVersion="4.5.0.0" />
          -->
        </assemblyIdentity>
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>

I have tried this, and also the commented-out part... My project won't 
compile in either case. I think I am doing something wrong.

On Wednesday, February 11, 2015 at 11:38:19 AM UTC+1, Martin Widmer wrote:

 Can I use Newtonsoft.Json 4.5.0.0 instead of 6.0.0.0 with NEST?
 How?

 Thanks for your advice.

 Martin




Re: Paid help with ES/ELK?

2015-02-11 Thread K. C. Ramakrishna
Hi Steve,

We are the authors of elasticray - elasticsearch plugin for Liferay - 
https://www.liferay.com/marketplace/-/mp/application/41044606
https://github.com/R-Knowsys/elasticray
also elasticray.com

We have completed the official elasticsearch certification courses too. If 
you are still looking for help, do contact us at rknowsys.com

Thanks,
kc

On Saturday, February 7, 2015 at 12:52:27 AM UTC+5:30, Steve Johnson wrote:

 I hope a posting like this is not taboo in this forum...

 We are struggling to understand how to properly configure an ELK stack for 
 our production environment.  We think we have things set up pretty much 
 right, and then ES throws us a curve ball.  We've had a couple of things 
 happen over the last few days that are simply baffling to us. We've decided 
 we need the help of someone who really knows ES.

 Support companies all seem to want to sell only long-term contracts. We 
 need short-term help.  We are therefore thinking that we need to find an 
 individual ES expert who we can pay on an hourly basis to help us set up 
 our ES cluster and learn how it works and how to maintain it.

 If anyone reading this fits this description, or knows of some other 
 person or organization that does, please contact me at elastic at 
 filethis dot c0m.  If you're offering your services directly, please 
 let me know as much as you can about your experience with ES, including the 
 number of years you've worked with it and the sizes of the clusters you've 
 worked with.

 TIA for all help!

 Steve





Re: How to recover saved kibana dashboard

2015-02-11 Thread bharath singh
okies

On Wednesday, February 11, 2015 at 4:43:07 PM UTC+5:30, Mark Walkom wrote:

 You should really start your own thread for this :)

 On 11 February 2015 at 21:32, bharath singh bharath...@gmail.com wrote:

 Hi,
 Accidentally I deleted kibana-int. Now I am not able to save my dashboard 
 and it is showing "Save failed. Dashboard cannot be saved to Elasticsearch." 
 In the logs I found this

 [2015-02-11 192.168.xx.xx,322][DEBUG][action.admin.indices.create] [Tribe 
 Chief] no known master node, scheduling a retry
 [2015-02-11 192.168.xx.xx,323][DEBUG][action.admin.indices.create] [Tribe 
 Chief] observer: timeout notification from cluster service. timeout setting 
 [1m], time since start [1m] 

 Please help

 Regards,
 Bharath Singh

 On Friday, December 6, 2013 at 12:11:19 AM UTC+5:30, Alexander Gray II 
 wrote:

 Quick Question:
 I saved a few Kibana3 dashboards via the kibana url, but I don't know 
 where they are saved.  I don't think they are saved on disk, so they must 
 be saved in Elasticsearch, right?

 So my root problem is that I saved a few kibana dashboards, and then I 
 had to restart my cluster and now my dashboards are all gone.

 How do I recover them?

 Thanks,
 Alex







Availability of logstash-forwarder debian packages

2015-02-11 Thread Dennis Plöger
Hi!

Until recently I used the elasticsearch package repositories 
(packages.elasticsearch.org) to install logstash-forwarder. However, it now 
seems that logstash-forwarder is no longer available via deb 
http://packages.elasticsearch.org/logstashforwarder/debian stable main. 
Did the structure change on the package server, or is the package no longer 
available? I found no blog post or announcement about it.

Kind regards

Dennis



Re: php api and self signed certs

2015-02-11 Thread Paul Halliday
That is the hint I needed.

I had it w/o quotes at first and the page was failing. I just assumed this 
was because of the lack of quotes, which once added allowed the page to 
load. I should have looked at the error console, which was showing the 
missing include :)

All is good and working as expected.
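The fix described here boils down to key types: quoting the constant name makes it a string key, which curl_setopt_array() silently ignores. A rough Python analogy of the PHP situation (the value 64 is libcurl's option id for CURLOPT_SSL_VERIFYPEER, included only for illustration):

```python
# In PHP, CURLOPT_SSL_VERIFYPEER is a predefined integer constant;
# putting quotes around it produces an unrelated string key instead.
CURLOPT_SSL_VERIFYPEER = 64  # libcurl option id (illustrative)

opts_wrong = {"CURLOPT_SSL_VERIFYPEER": True}  # string key: option never applied
opts_right = {CURLOPT_SSL_VERIFYPEER: True}    # integer option id, as curl expects

print(type(next(iter(opts_wrong))).__name__)  # str
print(next(iter(opts_right)))                 # 64
```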

On Tuesday, February 10, 2015 at 11:17:20 PM UTC-4, Ian MacLennan wrote:

 I suspect that it is because you put your constant names in quotes, and as 
 a result the keys are going to be wrong, i.e. you 
 have 'CURLOPT_SSL_VERIFYPEER' => true, while it should probably be 
 CURLOPT_SSL_VERIFYPEER => true as shown in the link you provided.

 Ian

 On Tuesday, February 10, 2015 at 10:42:05 AM UTC-5, Paul Halliday wrote:

 Well, I got it working but had to make the declarations in CurlHandle.php

 I added:

 $curlOptions[CURLOPT_SSL_VERIFYPEER] = true;
 $curlOptions[CURLOPT_CAINFO] = '/full/path/to/cacert.pem';

 directly above:

 curl_setopt_array($handle, $curlOptions);
 return new static($handle, $curlOptions);
 ...

 What's strange is that I also tried placing them in the default 
 $curlOptions but those seemed to get dumped somewhere prior to 
 curl_setopt_array()

 Yucky, but working.


 On Monday, February 9, 2015 at 7:05:34 PM UTC-4, Paul Halliday wrote:

 Hi,

 I am trying to get the configuration example from here to work:  


 http://www.elasticsearch.org/guide/en/elasticsearch/client/php-api/current/_security.html#_guzzleconnection_self_signed_certificate

 My settings look like this:
  
  // Elasticsearch
  $clientparams = array();
  $clientparams['hosts'] = array(
      'https://host01:443'
  );

  $clientparams['guzzleOptions'] = array(
      '\Guzzle\Http\Client::SSL_CERT_AUTHORITY' => 'system',
      '\Guzzle\Http\Client::CURL_OPTIONS' => [
          'CURLOPT_SSL_VERIFYPEER' => true,
          'CURLOPT_SSL_VERIFYHOST' => 2,
          'CURLOPT_CAINFO' => '.inc/cacert.pem',
          'CURLOPT_SSLCERTTYPE' => 'PEM',
      ]
  );

 The error:

 PHP Fatal error:  Uncaught exception 
 'Elasticsearch\\Common\\Exceptions\\TransportException' with message 'SSL 
 certificate problem: self signed certificate'

 What did I miss?

 Thanks!






Re: How to recover saved kibana dashboard

2015-02-11 Thread Mark Walkom
You should really start your own thread for this :)

On 11 February 2015 at 21:32, bharath singh bharathsing...@gmail.com
wrote:

 Hi,
 Accidentally I deleted kibana-int. Now I am not able to save my dashboard
 and it is showing "Save failed. Dashboard could not be saved to Elasticsearch."
 In the logs I found this

 [2015-02-11 192.168.xx.xx,322][DEBUG][action.admin.indices.create] [Tribe
 Chief] no known master node, scheduling a retry
 [2015-02-11 192.168.xx.xx,323][DEBUG][action.admin.indices.create] [Tribe
 Chief] observer: timeout notification from cluster service. timeout setting
 [1m], time since start [1m]

 Please help

 Regards,
 Bharath Singh

 On Friday, December 6, 2013 at 12:11:19 AM UTC+5:30, Alexander Gray II
 wrote:

 Quick Question:
 I saved a few Kibana3 dashboards via the kibana url, but I don't know
 where they are saved.  I don't think they are saved on disk, so they must
 be saved in Elasticsearch, right?

 So my root problem is that I saved a few kibana dashboards, and then I
 had to restart my cluster and now my dashboards are all gone.

 How do I recover them?

 Thanks,
 Alex





Not able to save kibana dashboard

2015-02-11 Thread bharath singh
Hi,

I am not able to save the kibana dashboard. When I click save it displays 
"Save failed. Dashboard could not be saved to Elasticsearch." I also found 
the following in the elasticsearch log

[2015-02-11 11:41:12,650][DEBUG][action.admin.indices.create] [Tribe Chief] 
no known master node, scheduling a retry
[2015-02-11 11:42:12,651][DEBUG][action.admin.indices.create] [Tribe Chief] 
observer: timeout notification from cluster service. timeout setting [1m], 
time since start [1m]

Please help.

Regards,
Bharath Singh



Re: Not able to save kibana dashboard

2015-02-11 Thread bharath singh
This worked. We need to run 
curl -X PUT http://elasticsearch_cluster_server:9200/kibana-int/

It started working like a charm.

Regards,
Bharath Singh



Re: Availability of logstash-forwarder debian packages

2015-02-11 Thread Magnus Bäck
On Wednesday, February 11, 2015 at 12:53 CET,
 Dennis Plöger dploeger2...@gmail.com wrote:

 Until recently I used the elasticsearch-package repositories
 (packages.elasticsearch.org) to install logstash-forwarder. However,
 it now seems as if logstash-forwarder isn't available anymore via deb
 http://packages.elasticsearch.org/logstashforwarder/debian stable
 main. Did the structure change on the package server or is the
 package not available anymore? I found no blog post or something like
 that.

Please see the following GitHub issue:
https://github.com/elasticsearch/logstash-forwarder/issues/184

-- 
Magnus Bäck| Software Engineer, Development Tools
magnus.b...@sonymobile.com | Sony Mobile Communications



Re: Large results sets and paging for Aggregations

2015-02-11 Thread piyush goyal
Aah! This seems to be the best explanation of how aggregations work. 
Thanks a ton for that, Mark. :) A few other questions:

1.) Am I right to assume that as my document count increases, the time for 
aggregation calculation will increase as well? Reason: I am trying to figure 
out whether bucket creation happens at the individual shard level; if so, 
document counting would happen concurrently on each shard, decreasing the 
execution time significantly. Also, at the shard level, as my document count 
(matching the query criteria) increases, if this process is linear in time 
the execution time would increase.

2.) How would I relate this analogy to sub-aggregations? My observation is 
that increasing the number of child aggregations increases the execution 
time along with memory utilization. What happens in the case of 
sub-aggregations?

3.) I didn't get your last statement:
 There is however a fixed overhead for all queries which *is* a 
function of number of docs and that is the Field Data cache required to 
hold the dates/member IDs in RAM - if this becomes a problem then you may 
want to look at on-disk alternative structure in the form of DocValues.
 
 4.) Off topic, but I guess it is best to ask here since we are talking 
about it. :) DocValues - since it was introduced in 1.0.0 and most of our 
mapping was defined in ES 0.9, can I change the mapping of existing fields 
now? Maybe I should take this to another thread, but I would love to hear 
about points 1-3. You made this thread very interesting for me.

Thanks
Piyush 



On Wednesday, 11 February 2015 15:12:37 UTC+5:30, Mark Harwood wrote:

 5k doesn't sound too scary.

 Think of the aggs tree like a Bean Machine [1] - one of those wooden 
 boards with pins arranged on it like a Christmas tree; you drop balls at 
 the top of the board and they rattle down a choice of paths to the bottom.
 In the case of aggs, your buckets are the pins and documents are the balls.

 The memory requirement for processing the agg tree is typically the number 
 of pins, not the number of balls you drop into the tree as these just fall 
 out of the bottom of the tree.
 So in your case it is 5k members multiplied by 12 months each = 60k unique 
 buckets, each of which will maintain a counter of how many docs pass 
 through that point. So you could pass millions or billions of docs through 
 and the working memory requirement for the query would be the same.
 There is however a fixed overhead for all queries which *is* a function 
 of number of docs and that is the Field Data cache required to hold the 
 dates/member IDs in RAM - if this becomes a problem then you may want to 
 look at on-disk alternative structure in the form of DocValues.

 Hope that helps.

 [1] http://en.wikipedia.org/wiki/Bean_machine
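The working-memory estimate above is simple arithmetic on the numbers from this thread; a quick sketch (the per-bucket byte figure is a placeholder, not a measured Elasticsearch value):

```python
members = 5000   # terms buckets, one per member_id
months = 12      # date_histogram buckets under each member
buckets = members * months
print(buckets)   # 60000 unique buckets, regardless of document count

# Only the bucket tree scales with cardinality; passing more documents
# through it just increments counters in existing buckets.
bytes_per_bucket = 100  # placeholder overhead, not a measured figure
print("~%.1f MB" % (buckets * bytes_per_bucket / 1024.0 / 1024.0))
```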

 On Wednesday, February 11, 2015 at 7:04:04 AM UTC, piyush goyal wrote:

 Hi Mark,

 Before getting into queries, here is a little bit info about the project:

 1.) A community where members keep on increasing, decreasing and 
 changing. Maintained in a different type.
 2.) Approximately 3K to 4K documents of data of each user inserted into 
 ES per month in a different type maintained by member ID.
 3.) Mapping is flat, there are no nested and array type of data.

 Requirement:

 Here is a sample requirement:

 1.) Getting a report against each member ID against the count of data for 
 last three month.
 2.) Query used to get the data is:

  {
    "query": {
      "constant_score": {
        "filter": {
          "bool": {
            "must": [
              { "term": { "datatype": "XYZ" } },
              {
                "range": {
                  "response_timestamp": {
                    "from": "2014-11-01",
                    "to": "2015-01-31"
                  }
                }
              }
            ]
          }
        }
      }
    },
    "aggs": {
      "memberIDAggs": {
        "terms": {
          "field": "member_id",
          "size": 0
        },
        "aggs": {
          "dateHistAggs": {
            "date_histogram": {
              "field": "response_timestamp",
              "interval": "month"
            }
          }
        }
      }
    },
    "size": 0
  }

  The current member count is approximately 1K, and it will increase to 5K in 
  the next 10 months. That is 5K * 4K * 3 documents to be used for this 
  aggregation - I guess a major hit on the system. And this is only two 
  levels of aggregation. The next requirement from our analyst is to split 
  each month's data into three different categories. 

 What is the optimum solution to this problem?

 Regards
 Piyush

 On Tuesday, 10 February 2015 16:15:22 UTC+5:30, Mark Harwood wrote:

 these kind of queries are hit more for qualitative analysis.

 Do you have any example queries? The pay as you go summarisation need 
 not be about just maintaining quantities.  In the demo here [1] I derive 
 profile names for people, categorizing them as newbies, fanboys or 
 haters based on a history of their reviewing behaviours in a 

jdbc river won't start

2015-02-11 Thread Abid Hussain
Hi,

we are running ES on a single node cluster.

I'm facing the issue that a river is not being started after being 
re-created. In detail, we first delete the index the river refers to, then 
delete the river and then create it again.

*GET /_river/my_river/_meta*
{
   "_index": "_river",
   "_type": "my_river",
   "_id": "_meta",
   "_version": 12,
   "found": true,
   "_source": {
      "type": "jdbc",
      "jdbc": {
         "driver": "com.mysql.jdbc.Driver",
         "url": "jdbc:mysql://localhost:3306/my_db",
         "user": "user",
         "password": "***",
         "index": "my_index",
         "type": "my_customer",
         "sql": "SELECT * FROM my_customer"
      }
   }
}

Unfortunately, the river is not started after creating it. When I 
request the river state, it tells me that it last ran yesterday:
{
   "state": [
      {
         "name": "my_river",
         "type": "jdbc",
         "started": "2015-02-10T01:00:06.997Z",
         "last_active_begin": "2015-02-10T01:00:07.030Z",
         "last_active_end": "2015-02-10T01:53:34.364Z",
         "map": {
            "lastExecutionStartDate": 1423530007825,
            "aborted": false,
            "lastEndDate": 1423533216703,
            "counter": 1,
            "lastExecutionEndDate": 1423533214363,
            "lastStartDate": 1423530007608,
            "suspended": false
         }
      }
   ]
}

The connection properties haven't changed since the last successful run.

The logs say nothing, except for the message after deleting the index.

Any idea what could be the problem?

Regards,

Abid



Re: Availability of logstash-forwarder debian packages

2015-02-11 Thread Dennis Plöger
Thanks, Magnus.



Re: Google Cloud click-to-deploy

2015-02-11 Thread David Pilato
Answer I got from Google team:

As of right now they'd have 2 options, but neither of them is simple and 
point-and-click.
They could create a new click-to-deploy ES cluster (in a new project) and move 
the data over manually, OR manually bring up a new instance and edit the config 
files with the cluster IP addresses, allowing it to join the existing cluster.

We definitely recognize the need for altering deployments -- as the underlying 
technology to C2D (Deployment Manager) matures, it will pave the way for these 
more advanced features.

HTH
David

 Le 11 févr. 2015 à 22:36, Avi Gabay a...@coralogix.com a écrit :
 
 Hi,
 So we are a small startup and we use Google cloud as our cloud provider.
  Just last month Google announced how they've made it easy to deploy 
  elasticsearch on their servers with just a few clicks.
  I thought this would be a nice thing to look at, as it could save us the 
  hassle of deploying machines ourselves and doing all this configuration.
  So I literally clicked to deploy and got the cluster running, and now I want 
  to add more instances to the cluster as I'm expanding, and I cannot seem to 
  find anywhere how to do so. Anyone got a clue? I've looked everywhere and 
  there's no documentation on this issue.
 
 Thanks! 



Re: Hibernate Search with Elasticsearch

2015-02-11 Thread David Pilato
Have a look here. https://github.com/dadoonet/legacy-search

It's a Hibernate demo project which uses Spring as well.
If you navigate through the branches, it will show you how to add elasticsearch.

I wouldn't try to connect Hibernate Search and replace the Lucene 
implementation with elasticsearch.

My 2 cents
David

 Le 12 févr. 2015 à 00:18, William Yang williamyan...@gmail.com a écrit :
 
 Hi everyone, 
 
 I am new to the forum and I am also new to ElasticSearch! 
 However for this project, I am trying to replace HibernateSearch with 
 ElasticSearch. I am using Spring Framework, and I have also taken a look at 
 both Spring-ElasticSearch and Spring-Data-ElasticSearch. However, I am still 
 new to using these open source programs and don't really know where I should 
 start. I'd appreciate it if someone could give me a push in the right 
 direction on how to incorporate ElasticSearch into my code. Thank you! 
  
 (I posted this on the archives so if this pops up twice I'm sorry!! I didn't 
 understand how this forum worked)



Re: Hibernate Search with Elasticsearch

2015-02-11 Thread William Yang
Hey David,

Thanks for responding - I found that link too and tried to follow it, but I
got confused about what postgresql is. If I don't know what it is, where do I
find it?
Also, I think I had some problems with a common dependency. I don't have the
exact error code on me right now, but there was something wrong with the
dependency.

What would I put as the goal when I try to run it in Spring? clean install,
or just jetty:run?

On Wed, Feb 11, 2015 at 10:04 PM, David Pilato da...@pilato.fr wrote:

 Have a look here. https://github.com/dadoonet/legacy-search

 It's an Hibernate demo project which uses Spring as well.
 If you navigate through branches, it will show you how to add
 elasticsearch.

 I won't try to connect Hibernate Search and replace Lucene implementation
 by elasticsearch.

 My 2 cents
 David

 Le 12 févr. 2015 à 00:18, William Yang williamyan...@gmail.com a écrit :

 Hi everyone,

 I am new to the forum and I am also new to ElasticSearch!
 However for this project, I am trying to replace HibernateSearch with
 ElasticSearch. I am using Spring Framework, and I have also taken a look
 at both Spring-ElasticSearch and Spring-Data-ElasticSearch. However, I am
 still new to using these open source programs and don't really know where I
 should start. I'd appreciate it if someone could give me a push in the
 right direction on how to incorporate ElasticSearch into my code. Thank
 you!

 (I posted this on the archives so if this pops up twice I'm sorry!! I
 didn't understand how this forum worked)





Re: Hibernate Search with Elasticsearch

2015-02-11 Thread David Pilato
postgresql is a database. I guess you are using Hibernate with a database, 
right? Unless you are using only Hibernate Search?

What do you mean by "run in Spring"?
Spring is the framework, right?

mvn jetty:run should work fine.

David

 Le 12 févr. 2015 à 07:11, William Yang williamyan...@gmail.com a écrit :
 
 Hey David,
 
 Thanks for responding - I found that link too and tried to follow it but I 
 got confused what postgresql is. If I don't know what it is, where do I find 
 it?
 Also I think I had some problems with a common dependency, I don't have the 
 exact error code on me right now, but there was something wrong with the 
 dependency.
 
 What would I put as the goal when I try to run it in spring? clean install, 
 or just jetty:run?
 
 On Wed, Feb 11, 2015 at 10:04 PM, David Pilato da...@pilato.fr wrote:
 Have a look here. https://github.com/dadoonet/legacy-search
 
 It's an Hibernate demo project which uses Spring as well.
 If you navigate through branches, it will show you how to add elasticsearch.
 
 I won't try to connect Hibernate Search and replace Lucene implementation by 
 elasticsearch.
 
 My 2 cents
 David
 
 Le 12 févr. 2015 à 00:18, William Yang williamyan...@gmail.com a écrit :
 
 Hi everyone, 
 
 I am new to the forum and I am also new to ElasticSearch! 
 However for this project, I am trying to replace HibernateSearch with 
 ElasticSearch. I am using Spring Framework, and I have also taken a look at 
 both Spring-ElasticSearch and Spring-Data-ElasticSearch. However, I am 
 still new to using these open source programs and don't really know where I 
 should start. I'd appreciate it if someone could give me a push in the 
 right direction on how to incorporate ElasticSearch into my code. Thank 
 you! 
  
 (I posted this on the archives so if this pops up twice I'm sorry!! I 
 didn't understand how this forum worked)



[Hadoop][Spark] Exclude metadata fields from _source

2015-02-11 Thread Itai Yaffe
Hey,
I've recently started using Elasticsearch for Spark (Scala application).
I've added elasticsearch-spark_2.10 version 2.1.0.BUILD-SNAPSHOT to my 
Spark application pom file, and used 
org.apache.spark.rdd.RDD[String].saveJsonToEs() to send documents to 
Elasticsearch.
When the documents are loaded to Elasticsearch, my metadata fields (e.g id, 
index, etc.) are being loaded as part of the _source field.
Is there a way to exclude them from the _source?
I've tried using the new es.mapping.exclude configuration property (added 
in this commit 
https://github.com/elasticsearch/elasticsearch-hadoop/commit/aae4f0460a23bac9567ea2ad335c74245a1ba069
 
- that's why I needed to take the latest build rather than using version 
2.1.0.Beta3), but it doesn't seem to have any effect (although I'm not sure 
it's even possible to exclude fields I'm using for mapping, e.g. 
es.mapping.id).

A code snippet (I'm using a single-node Elasticsearch cluster for testing 
purposes and running the Spark app from my desktop) :
val conf = new SparkConf()...
conf.set("es.index.auto.create", "false")
conf.set("es.nodes.discovery", "false")
conf.set("es.nodes", "XXX:9200")
conf.set("es.update.script", "XXX")
conf.set("es.update.script.params", "param1:events")
conf.set("es.update.retry.on.conflict", "2")
conf.set("es.write.operation", "upsert")
conf.set("es.input.json", "true")
val documentsRdd = ...
documentsRdd.saveJsonToEs("test/user",
  scala.collection.Map("es.mapping.id" -> "_id", "es.mapping.exclude" -> "_id"))

The JSON looks like this:
{
  "_id": "",
  "_type": "user",
  "_index": "test",
  "params": {
    "events": [
      {
        ...
      }
    ]
  }
}

Thanks!
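If es.mapping.exclude doesn't cooperate in that snapshot build, another option is stripping the metadata keys out of each JSON document before handing it to saveJsonToEs() (in the Scala job this would be a map over documentsRdd). A minimal sketch of the idea in Python, with a hypothetical document shaped like the JSON above:

```python
import json

METADATA_KEYS = {"_id", "_type", "_index"}  # used for routing, not wanted in _source

def strip_metadata(doc_json):
    """Return the document JSON minus the Elasticsearch metadata fields."""
    doc = json.loads(doc_json)
    return json.dumps({k: v for k, v in doc.items() if k not in METADATA_KEYS})

raw = '{"_id": "42", "_type": "user", "_index": "test", "params": {"events": []}}'
print(strip_metadata(raw))  # {"params": {"events": []}}
```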
