Re: stuck thread problem?

2014-08-29 Thread Martin Forssen
FYI, this turned out to be a real bug. A fix has been committed and will be 
included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:

 I did report it https://github.com/elasticsearch/elasticsearch/issues/7478



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/04d9c094-112d-4d7d-bd48-e4fa2ff3a774%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Explicitly Copying Replica Shards That Fail to Start

2014-08-29 Thread David Kleiner
Thank you Mark! 

Setting 

{
  "index" : {
    "number_of_replicas" : 0
  }
}

and then back to 1 cleared the bad replicas and rebuilt them from primaries.
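In curl form, what I ran was roughly the following (index name is a placeholder for my actual indices):

```shell
# drop all replicas; the bad replica copies are deleted
curl -XPUT 'localhost:9200/myindex/_settings' -d '
{ "index": { "number_of_replicas": 0 } }'

# add them back; new copies are rebuilt from the primaries
curl -XPUT 'localhost:9200/myindex/_settings' -d '
{ "index": { "number_of_replicas": 1 } }'
```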

Much appreciated,

David

On Thursday, August 28, 2014 3:53:32 PM UTC-7, Mark Walkom wrote:

 Yep, the easiest way is to drop the replica and then add it back and see 
 how you go.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 29 August 2014 08:40, David Kleiner david@gmail.com wrote:

 Greetings,

 I am still having a problem with the recovery of 5 replica shards in 2 of 
 my indices, on a 3-node cluster.  The replica shards fail to initialize and 
 keep jumping between the two secondary nodes.  The primary shards are fine.  

 What is my path to recovery?  Is copying the primary shard to the secondary 
 nodes a correct approach?  I tried issuing routing commands to cancel 
 recovery/allocation; it helped with some secondary shards but not with the 
 5 in question. 

 I also tried dumping the index with the failing secondary shards, but two 
 nodes crashed (well, lost their connection to the cluster), so the dump failed. 

 Would setting the replica count to 0, copying the primaries to the 2 
 secondary nodes, and then setting the replica count back to 1 be a viable 
 alternative?

 Thank you,

 David

 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/8e7c4f11-2790-49d6-8c65-87e9aa05aa3b%40googlegroups.com
  
 https://groups.google.com/d/msgid/elasticsearch/8e7c4f11-2790-49d6-8c65-87e9aa05aa3b%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a2482a81-5be8-4ed2-ad43-e37330446376%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: stuck thread problem?

2014-08-29 Thread Patrick Proniewski
Thank you!

On 29 août 2014, at 08:49, Martin Forssen m...@recordedfuture.com wrote:

 FYI, this turned out to be a real bug. A fix has been committed and will be 
 included in the next release.
 
 On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
 
 I did report it https://github.com/elasticsearch/elasticsearch/issues/7478
 

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/DFFDD1A6-9F76-4AC0-8211-95C47CC5CAC7%40patpro.net.
For more options, visit https://groups.google.com/d/optout.


Re: I search same thing, but once can get and once can not get???

2014-08-29 Thread fiefdx yang
I am sure there is only one master node and that all 16 nodes work as one 
cluster.
I think I know what happened: with the default configuration, successive 
identical requests are executed alternately against the primary shards and 
the replica shards.
But what I cannot understand is why the primary shards and the replica 
shards give different results at the same point in time.
This happens when I index some new documents but do not refresh; if I 
refresh the index, then the primary shards and the replica shards give the 
same result.
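I can make the difference visible like this (index name and query are placeholders for my real ones):

```shell
# after an explicit refresh, primaries and replicas agree
curl -XPOST 'localhost:9200/myindex/_refresh'

# pin the search to primary shards only
curl 'localhost:9200/myindex/_search?preference=_primary' -d '
{ "query": { "match_all": {} } }'

# default (no preference): the search round-robins between primary and replica copies
curl 'localhost:9200/myindex/_search' -d '
{ "query": { "match_all": {} } }'
```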

On Thursday, August 28, 2014 6:41:22 PM UTC+8, Greg Murnane wrote:

 This is a symptom that could happen with bad GC events, or with split 
 brain. Can you look at the GC logging output to see how long the stop the 
 world pauses you're seeing are? You can also run a query like  curl -XGET '
 http://localhost:9200/_cluster/state/master_node?local=true'  on each of 
 the nodes to make sure that they agree on which one is the master node.

 Look also at wait CPU and disk utilization when you run a query. Unless 
 you have a physical disk for each node on this system, it's likely that 
 there can be IO contention with 16 nodes querying the disks.

 If all that looks ok, if you are running replicas, then you can try 
 pulling out a replica and an original, and loading them into an isolated ES 
 node on another system, and query there. It's possible that some of the 
 replicas could be corrupted, and this would allow you to detect that.

 -

 Out of curiosity, though, I wonder what the purpose of running so many 
 nodes on a single machine is. ES is very effective at using the entire CPU 
 with only one node, and replicating your heap size 16 times, adding IO 
 contention, and splitting the cache 16 ways all seem like they would hurt 
 performance immensely.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d86c4ac0-910f-4d00-8276-bec8aed220ad%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


help needed scripting update to list element (bulk request)

2014-08-29 Thread eunever32
Hi
Say I have a list of elements like this:

PUT twitter/twit/1
{
   "list": [
      {
         "a": "b",
         "c": "d",
         "e": "f"
      },
      {
         "1": "2",
         "3": "4"
      }
   ]
}

And I want to change the value of "e" (currently "f") to, say, "new_f", so 
that the document looks like: 

{
   "list": [
      {
         "a": "b",
         "c": "d",
         "e": "new_f"
      },
      {
         "1": "2",
         "3": "4"
      }
   ]
}

Is there a way to do this ? Maybe in MVEL ?

Do I match on the document
{
   "a": "b",
   "c": "d",
   "e": "f"
}

i.e. if (list.contains(document)) { some kind of update; }  // is this 
possible?


I know MVEL is being deprecated in 1.4 however it will do for now. 

I want to use bulk request. 

I know it's possible to remove the element like this: 
bulkRequestBuilder.setScript("if (ctx._source.list.contains(document)) 
{ ctx._source.list.remove(document) }").setScriptParams etc.


but is it possible to also update a field in the document?
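What I imagine is something like the following, though I have not tested it (MVEL syntax; old_val and new_val are parameter names I made up):

```shell
curl -XPOST 'localhost:9200/twitter/twit/1/_update' -d '
{
  "script": "foreach (item : ctx._source.list) { if (item.e == old_val) { item.e = new_val } }",
  "params": { "old_val": "f", "new_val": "new_f" }
}'
```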

Thanks.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/da238495-7cf3-4215-a77e-2144499b8859%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Bulk UDP API

2014-08-29 Thread Bart Vandewoestyne
I'm trying to index data using the bulk UDP API on a single node 
Elasticsearch 1.3.2.  In my elasticsearch config I have

bulk.udp.enabled: true

My bulk file has 85000 documents and has the following characteristics:

bart@hp-g7-02:~/git/data$ ls -al mydata.json 
-rw-rw-r-- 1 bart bart 97818287 Aug 28 15:43 mydata.json

bart@hp-g7-02:~/git/data$ wc -l mydata.json 
170001 mydata.json

bart@hp-g7-02:~/git/data$ file mydata.json 
mydata.json: UTF-8 Unicode English text, with very long lines

Indexing the data using the bulk API described at 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html
 
works.  I see the documents in my elasticsearch store once the bulk upload 
is finished.

However, if I use the same bulk file and try to index it using the command

cat mydata.json | nc -w 0 -u localhost 9700

then only 1 document gets indexed, and I see lots of parsing errors like 
the following in my log files:

[2014-08-29 11:28:41,649][WARN ][bulk.udp ] [Mysterio] 
failed to execute bulk request
org.elasticsearch.common.jackson.core.JsonParseException: Unrecognized 
token '_index': was expecting ('true', 'false' or 'null')
 at [Source: [B@656f95ce; line: 1, column: 15]
at org.elasticsearch.common.jackson.core.JsonParser._constructError(
JsonParser.java:1419)
at org.elasticsearch.common.jackson.core.base.ParserMinimalBase.
_reportError(ParserMinimalBase.java:508)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser.
_reportInvalidToken(UTF8StreamJsonParser.java:3201)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser.
_handleUnexpectedValue(UTF8StreamJsonParser.java:2360)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser.
_nextTokenNotInObject(UTF8StreamJsonParser.java:794)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser.
nextToken(UTF8StreamJsonParser.java:690)
at org.elasticsearch.common.xcontent.json.JsonXContentParser.
nextToken(JsonXContentParser.java:50)
at org.elasticsearch.action.bulk.BulkRequest.add(BulkRequest.java:
266)
at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.
java:256)
at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.
java:252)
at org.elasticsearch.bulk.udp.BulkUdpService$Handler.messageReceived
(BulkUdpService.java:181)
at org.elasticsearch.common.netty.channel.
SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.
java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.
fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.socket.nio.
NioDatagramWorker.read(NioDatagramWorker.java:98)
at org.elasticsearch.common.netty.channel.socket.nio.
AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.
AbstractNioSelector.run(AbstractNioSelector.java:318)
at org.elasticsearch.common.netty.channel.socket.nio.
AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.
NioDatagramWorker.run(NioDatagramWorker.java:343)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(
ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.
DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(
ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

I find it strange that things work using the usual bulk API, but not with 
the bulk UDP API.

Am I overlooking something or doing something wrong?

Thanks,
Bart

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6a676c4f-afd1-48a1-ab40-8c258aa3c54e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Replica assignment on the same host

2014-08-29 Thread 'Nicolas Fraison' via elasticsearch
Hi,

I have an ES cluster with 12 data nodes spread across 6 servers (so 2 nodes 
per server), and I noticed that the replicas of a shard can be allocated on 
the same server (one on each of the nodes hosted by that server).

To avoid this I have set these parameters on the cluster:
node.host: server_name
cluster.routing.allocation.awareness.attributes: zone, host

But I'm wondering whether there is a specific parameter for this instead of 
using cluster allocation awareness?

Nicolas

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4d3924ab-77e4-49ef-9039-52df801ff46d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Replica assignment on the same host

2014-08-29 Thread Mark Walkom
That's the best method as per
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness
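There is also a dedicated setting for the several-nodes-per-server case which, if I remember the cluster module docs correctly, makes the allocator refuse to put copies of the same shard on nodes that share a host address (I haven't tested it myself):

```yaml
# elasticsearch.yml on every node (availability in your 1.x version not verified)
cluster.routing.allocation.same_shard.host: true
```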

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 29 August 2014 20:45, 'Nicolas Fraison' via elasticsearch 
elasticsearch@googlegroups.com wrote:

 Hi,

 I have an ES cluster with 12 data nodes spread across 6 servers (so 2 nodes
 per server), and I noticed that the replicas of a shard can be allocated on
 the same server (one on each of the nodes hosted by that server).

 To avoid this I have set these parameters on the cluster:
 node.host: server_name
 cluster.routing.allocation.awareness.attributes: zone, host

 But I'm wondering whether there is a specific parameter for this instead
 of using cluster allocation awareness?

 Nicolas

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/4d3924ab-77e4-49ef-9039-52df801ff46d%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/4d3924ab-77e4-49ef-9039-52df801ff46d%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624YaxHLT%3DsptzqcSQw3i9u9oozO_2DstFJ6vCs-VC_bzOw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: stuck thread problem?

2014-08-29 Thread Martijn v Groningen
Hi Patrick,

Did you see the same stuck thread via jstack or the hot threads API that
Martin reported? This can only happen if scan search was used (by
setting search_type=scan in a search request).
If that isn't the case, then perhaps something else is stuck.

Martijn


On 29 August 2014 09:58, Patrick Proniewski elasticsea...@patpro.net
wrote:

 Thank you!

 On 29 août 2014, at 08:49, Martin Forssen m...@recordedfuture.com wrote:

  FYI, this turned out to be a real bug. A fix has been committed and will
 be
  included in the next release.
 
  On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
 
  I did report it
 https://github.com/elasticsearch/elasticsearch/issues/7478
 

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/DFFDD1A6-9F76-4AC0-8211-95C47CC5CAC7%40patpro.net
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
Met vriendelijke groet,

Martijn van Groningen

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BA76Ty1RLHNButgkgYZ3pt_L0ygtonn7y8QpM%3D-0ttC%2BM84gQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


elasticsearch template to use standard analyzer but additional token_filter word_delimiter

2014-08-29 Thread Marc
Hi,

I am using Logstash and Elasticsearch for log analysis.
The standard analyzer does a pretty good job; however, it will not split 
things like word1.word2.
Therefore, I want to add the token filter word_delimiter.
What would such an additional template look like?
Also, how can I limit this addition to certain fields only?
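What I have in mind is something like the following, though I have not tested it (template, analyzer, and field names are just placeholders):

```shell
curl -XPUT 'localhost:9200/_template/logstash_delim' -d '
{
  "template": "logstash-*",
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_plus_delim": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "word_delimiter"]
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "properties": {
        "message": { "type": "string", "analyzer": "standard_plus_delim" }
      }
    }
  }
}'
```

Limiting it to certain fields would then just be a matter of which fields in the mapping get the custom analyzer.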

Thx
Marc

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b2009958-6ab5-49f0-8027-f3259289442c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: stuck thread problem?

2014-08-29 Thread Martijn v Groningen
Hi Patrick,

If this problem happens again, you should run the hot threads API:
curl localhost:9200/_nodes/hot_threads

Documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads

Just pick a node in your cluster and run that command. This is the
equivalent of running jstack on all the nodes in your cluster.
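If you ever do want a per-node thread dump anyway, the manual route is roughly this (it assumes a JDK, not just a JRE, is installed on the node; the pid is a placeholder):

```shell
# find the Elasticsearch process id
jps -l | grep -i elasticsearch

# dump all thread stacks of that process to a file
jstack <pid> > es-threads.txt
```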

Martijn


On 29 August 2014 13:34, Patrick Proniewski elasticsea...@patpro.net
wrote:

 Hi,

 I don't know how to debug a JAVA process. Haven't heard about jstack until
 it was mentioned in this thread.
 All I know is what I've posted in my first message.

 I've restarted ES, and currently I've no stuck thread to investigate. In
 the mean time, you can teach me how I should use jstack, so next time it
 happens I'll be ready.

 On 29 août 2014, at 13:19, Martijn v Groningen 
 martijn.v.gronin...@gmail.com wrote:

  Hi Patrick,
 
  Did you see the same stuck thread via jstack or the hot thread api that
  Martin reported? This can only happen if scan search was enabled (by
  setting search_type=scan in a search request)
  If that isn't the case then something else is maybe stuck.
 
  Martijn
 
 
  On 29 August 2014 09:58, Patrick Proniewski elasticsea...@patpro.net
  wrote:
 
  Thank you!
 
  On 29 août 2014, at 08:49, Martin Forssen m...@recordedfuture.com
 wrote:
 
  FYI, this turned out to be a real bug. A fix has been committed and
 will
  be
  included in the next release.
 
  On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
 
  I did report it
  https://github.com/elasticsearch/elasticsearch/issues/7478
 
 
  --
  You received this message because you are subscribed to the Google
 Groups
  elasticsearch group.
  To unsubscribe from this group and stop receiving emails from it, send
 an
  email to elasticsearch+unsubscr...@googlegroups.com.
  To view this discussion on the web visit
 
 https://groups.google.com/d/msgid/elasticsearch/DFFDD1A6-9F76-4AC0-8211-95C47CC5CAC7%40patpro.net
  .
  For more options, visit https://groups.google.com/d/optout.
 
 
 
 
  --
  Met vriendelijke groet,
 
  Martijn van Groningen
 
  --
  You received this message because you are subscribed to the Google
 Groups elasticsearch group.
  To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearch+unsubscr...@googlegroups.com.
  To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CA%2BA76Ty1RLHNButgkgYZ3pt_L0ygtonn7y8QpM%3D-0ttC%2BM84gQ%40mail.gmail.com
 .
  For more options, visit https://groups.google.com/d/optout.

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/34529AB7-AD03-404F-9787-60BD6B90E1A4%40patpro.net
 .
 For more options, visit https://groups.google.com/d/optout.




-- 
Met vriendelijke groet,

Martijn van Groningen

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CA%2BA76TxbSpXgVwRmfF5X3%2BDzsWgz8iNgaTjWXOP7iT1NdfHLow%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


which class file triggers writing of segments.gen / segments_1

2014-08-29 Thread Jason Wee
Hello people,

Does anybody know which class/component in Elasticsearch triggers the writing 
of segments.gen and segments_1? I'm currently using Elasticsearch version 
1.2.1. It would be great if you could provide a link pinpointing which line in 
the class does that.

Thank you.

/Jason

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/bdfc3860-cb26-4f21-9597-4f500eb950e2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: which class file triggers writing of segments.gen / segments_1

2014-08-29 Thread joergpra...@gmail.com
This is Lucene, when indexing starts. Look at the SegmentInfos class:
https://lucene.apache.org/core/4_9_0/core/org/apache/lucene/index/SegmentInfos.html

Jörg


On Fri, Aug 29, 2014 at 2:38 PM, Jason Wee peich...@gmail.com wrote:

 Hello people,

 Does anybody know which class/component in Elasticsearch triggers the writing
 of segments.gen and segments_1? I'm currently using Elasticsearch version
 1.2.1. It would be great if you could provide a link pinpointing which line in
 the class does that.

 Thank you.

 /Jason

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/bdfc3860-cb26-4f21-9597-4f500eb950e2%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/bdfc3860-cb26-4f21-9597-4f500eb950e2%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoG%3Ddauci86-7Y-RaN%2BJW94kqXU%3DwTA3kgxLO5Mj%3DLL0aQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Bulk UDP API

2014-08-29 Thread joergpra...@gmail.com
Maybe it is the line feeds in mydata.json; perhaps you are not using UNIX
LFs (a single \n)?
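A quick way to check and fix that, sketched here on a small stand-in file rather than the real mydata.json:

```shell
# stand-in bulk file written with Windows CRLF line endings
printf '{"index":{"_index":"test","_type":"doc"}}\r\n{"field":"value"}\r\n' > demo_bulk.json

# count carriage returns; a well-formed bulk file should contain none
tr -dc '\r' < demo_bulk.json | wc -c

# strip them, leaving plain UNIX \n line feeds
tr -d '\r' < demo_bulk.json > demo_bulk_unix.json
```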

Jörg


On Fri, Aug 29, 2014 at 11:36 AM, Bart Vandewoestyne 
bart.vandewoest...@gmail.com wrote:

 I'm trying to index data using the bulk UDP API on a single node
 Elasticsearch 1.3.2.  In my elasticsearch config I have

 bulk.udp.enabled: true

 My bulk file has 85000 documents and has the following characteristics:

 bart@hp-g7-02:~/git/data$ ls -al mydata.json
 -rw-rw-r-- 1 bart bart 97818287 Aug 28 15:43 mydata.json

 bart@hp-g7-02:~/git/data$ wc -l mydata.json
 170001 mydata.json

 bart@hp-g7-02:~/git/data$ file mydata.json
 mydata.json: UTF-8 Unicode English text, with very long lines

 Indexing the data using the bulk API described at
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html
 works.  I see the documents in my elasticsearch store once the bulk upload
 is finished.

 However, if I use the same bulk file and try to index it using the command

 cat mydata.json | nc -w 0 -u localhost 9700

 then only 1 document gets indexed, and I see lots of parsing errors like
 the following in my log files:

 [2014-08-29 11:28:41,649][WARN ][bulk.udp ] [Mysterio]
 failed to execute bulk request
 org.elasticsearch.common.jackson.core.JsonParseException: Unrecognized
 token '_index': was expecting ('true', 'false' or 'null')
  at [Source: [B@656f95ce; line: 1, column: 15]
 at org.elasticsearch.common.jackson.core.JsonParser.
 _constructError(JsonParser.java:1419)
 at org.elasticsearch.common.jackson.core.base.ParserMinimalBase.
 _reportError(ParserMinimalBase.java:508)
 at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser
 ._reportInvalidToken(UTF8StreamJsonParser.java:3201)
 at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser
 ._handleUnexpectedValue(UTF8StreamJsonParser.java:2360)
 at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser
 ._nextTokenNotInObject(UTF8StreamJsonParser.java:794)
 at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser
 .nextToken(UTF8StreamJsonParser.java:690)
 at org.elasticsearch.common.xcontent.json.JsonXContentParser.
 nextToken(JsonXContentParser.java:50)
 at org.elasticsearch.action.bulk.BulkRequest.add(BulkRequest.java:
 266)
 at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.
 java:256)
 at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.
 java:252)
 at org.elasticsearch.bulk.udp.BulkUdpService$Handler.
 messageReceived(BulkUdpService.java:181)
 at org.elasticsearch.common.netty.channel.
 SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.
 java:70)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 sendUpstream(DefaultChannelPipeline.java:564)
 at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.
 sendUpstream(DefaultChannelPipeline.java:559)
 at org.elasticsearch.common.netty.channel.Channels.
 fireMessageReceived(Channels.java:268)
 at org.elasticsearch.common.netty.channel.socket.nio.
 NioDatagramWorker.read(NioDatagramWorker.java:98)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioWorker.process(AbstractNioWorker.java:108)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioSelector.run(AbstractNioSelector.java:318)
 at org.elasticsearch.common.netty.channel.socket.nio.
 AbstractNioWorker.run(AbstractNioWorker.java:89)
 at org.elasticsearch.common.netty.channel.socket.nio.
 NioDatagramWorker.run(NioDatagramWorker.java:343)
 at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(
 ThreadRenamingRunnable.java:108)
 at org.elasticsearch.common.netty.util.internal.
 DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(
 ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(
 ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)

 I find it strange that things work using the usual bulk API, but not with
 the bulk UDP API.

 Am I overlooking something or doing something wrong?

 Thanks,
 Bart

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/6a676c4f-afd1-48a1-ab40-8c258aa3c54e%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/6a676c4f-afd1-48a1-ab40-8c258aa3c54e%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.



Does transport client do scatter gather?

2014-08-29 Thread John Smith
Just as the subject asks or only the node client can do scatter gather?

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Refactoring idea for buildShardFailures()?

2014-08-29 Thread Na Meng
Hi Sir/Madam,

I'm doing some research in automatic refactoring suggestion. By observing 
the co-change pattern of some similar code, we would like to develop a tool 
to suggest possible refactorings to apply in order to extract out common 
code while parameterizing any difference between them.

I have examined the code snippets in class 
org.elasticsearch.action.search.type.TransportSearchScrollScanAction.AsyncAction,
org.elasticsearch.action.search.type.TransportSearchScrollQueryAndFetchAction.AsyncAction,
 
and
org.elasticsearch.action.search.type.TransportSearchScrollQueryThenFetchAction.AsyncAction.

I notice that all of the three classes have method buildShardFailures() 
defined. The method bodies are pretty similar and they experience similar 
or same changes at least once in the version history. Do you think it is a 
good idea or bad idea to extract a method out of the methods?

No matter whether you would like to extract a method or not, would you like 
to share the factors in your mind which affect your decision, such as 
complexity of refactoring, poor readability, poor maintainability, etc.? For 
each factor, how do you think it can affect your decision about using 
refactoring? If possible, any quantitative analysis will be great. For 
example, if the code size after refactoring is greater than that before 
refactoring, I won't do refactoring. Or if there are only two lines shared 
between two code snippets, I won't do refactoring, etc. 

Thanks a lot for your help! Your suggestion will be very valuable for our 
research.

Best regards,
Na Meng

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/24d9494b-c514-477b-8096-ae6dec8ca638%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread tony . aponte
Thanks again and sorry to bother you guys, but I'm new to GitHub and don't 
know what to do from here.  Can you point me to the right place where I can 
take the next step to put this patch on my server?  I only know how to 
untar the tarball I downloaded from the main ES page.

Thanks.
Tony

On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team who did not 
 hesitate to test the fix immediately and replaced it with a better working 
 solution, since the lzf-compress software is having weaknesses regarding 
 threadsafety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com wrote:

 Amazing job. Great work.

 -- 
 Ivan


 On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com joerg...@gmail.com
  wrote:

 I fixed the issue by setting the safe LZF encoder in LZFCompressor and 
 opened a pull request 

 https://github.com/elasticsearch/elasticsearch/pull/7466

 Jörg


 On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com joerg...@gmail.com
  wrote:

 Still broken with lzf-compress 1.0.3

 https://gist.github.com/jprante/d2d829b497db4963aea5

 Jörg


 On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Thanks for the logstash mapping command. I can reproduce it now.

 It's the LZF encoder that bails out at 
 org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

 which uses in turn sun.misc.Unsafe.getInt

 I have created a gist of the JVM crash file at 

 https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b
  
 There has been a fix in LZF lately 
 https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

 for version 1.0.3 which has been released recently.

 I will build a snapshot ES version with LZF 1.0.3 and see if this 
 works...

 Jörg



 On Mon, Aug 25, 2014 at 11:30 PM, tony@iqor.com wrote:

 I captured a WireShark trace of the interaction between ES and 
 Logstash 1.4.1.  The error occurs even before my data is sent.  Can you 
 try 
 to reproduce it on your testbed with this message I captured?

 curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

 Contents of file 'y':
 {
   "template" : "logstash-*",
   "settings" : {
     "index.refresh_interval" : "5s"
   },
   "mappings" : {
     "_default_" : {
       "_all" : { "enabled" : true },
       "dynamic_templates" : [ {
         "string_fields" : {
           "match" : "*",
           "match_mapping_type" : "string",
           "mapping" : {
             "type" : "string", "index" : "analyzed", "omit_norms" : true,
             "fields" : {
               "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
             }
           }
         }
       } ],
       "properties" : {
         "@version" : { "type" : "string", "index" : "not_analyzed" },
         "geoip" : {
           "type" : "object", "dynamic" : true,
           "path" : "full",
           "properties" : {
             "location" : { "type" : "geo_point" }
           }
         }
       }
     }
   }
 }



 On Monday, August 25, 2014 3:53:18 PM UTC-4, tony@iqor.com 
 wrote:

 I have no plugins installed (yet) and only changed 
 es.logger.level to DEBUG in logging.yml. 

 elasticsearch.yml:
 cluster.name: es-AMS1Cluster
 node.name: KYLIE1
 node.rack: amssc2client02
 path.data: /export/home/apontet/elasticsearch/data
 path.work: /export/home/apontet/elasticsearch/work
 path.logs: /export/home/apontet/elasticsearch/logs
 network.host:    = sanitized line; file contains 
 actual server IP 
 discovery.zen.ping.multicast.enabled: false
 discovery.zen.ping.unicast.hosts: [s1, s2, s3, s5 , s6, 
 s7]   = Also sanitized

 Thanks,
 Tony




 On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

 I tested a simple Hello World document on Elasticsearch 1.3.2 
 with Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default 
 settings.

 No issues.

 So I would like to know more about the settings in 
 elasticsearch.yml, the mappings, and the installed plugins.

 Jörg


 On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I have some Solaris 10 Sparc V440/V445 servers available and can 
 try to reproduce over the weekend.

 Jörg


 On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir 
 rober...@elasticsearch.com wrote:

 How big is it? Maybe i can have it anyway? I pulled two ancient 
 ultrasparcs out of my closet to try to debug your issue, but 
 unfortunately 
 they are a pita to work with (dead nvram battery on both, zeroed 
 mac 
 address, etc.) Id still love to get to the bottom of this.
  On Aug 22, 2014 3:59 PM, tony@iqor.com wrote:

 Hi Adrien,
 It's a bunch of garbled binary data, basically a dump of the 
 process image.
 Tony


 On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand 
 wrote:

 Hi Tony,

 Do you have more information in the core dump file? (cf. the 
 Core dump written line that 

Re: Stop words and Keyword tokenizer

2014-08-29 Thread Germán Carrillo
Thanks Ivan! I'll test which way fits better to my needs.
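
For anyone following along, the chain quoted below (lowercase → asciifolding → stop word removal on the single keyword token) can be simulated outside Elasticsearch. This is only a rough Python sketch, not the Lucene filters themselves: `unicodedata` stands in for the asciifolding filter, and the regex plays the role a pattern_replace token filter would play on the one token the keyword tokenizer emits.

```python
import re
import unicodedata

# Stop word list from the thread.
STOPWORDS = ["la", "el", "de", "del", "los", "las", "jurisdiccion"]

def lowercase(text):
    return text.lower()

def ascii_fold(text):
    # Rough stand-in for Lucene's asciifolding: drop combining marks.
    return "".join(c for c in unicodedata.normalize("NFD", text)
                   if not unicodedata.combining(c))

def remove_stopwords(text):
    # What a pattern_replace token filter would do to the single token.
    pattern = r"\b(?:" + "|".join(STOPWORDS) + r")\b\s*"
    return re.sub(pattern, "", text).strip()

text = "El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)"
token = remove_stopwords(ascii_fold(lowercase(text)))
print(token)  # corregimiento mulalo, municipio yumbo (valle cauca)
```

The output matches the "after removing stop words" line in the example, which is why a single pattern_replace filter built from the stop word list is a workable alternative to hand-written regexes per place name.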



2014-08-28 17:12 GMT-05:00 Ivan Brusic i...@brusic.com:

 Character filters are executed before the tokenizer, so only something in
 that family of filters would work if you plan to continue using the keyword
 tokenizer.


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-mapping-charfilter.html

 The mapping char filter might be a better match if your list is not in
 regex form. I use the mapping char filter to remove copyright, trademark,
 and a whole list of other characters from my content.

 Cheers,

 Ivan


 On Thu, Aug 28, 2014 at 2:33 PM, Germán Carrillo 
 carrillo.ger...@gmail.com wrote:

 Ivan, yes, I'm aware I would obtain another text, that's fine. Even more,
 my docs have a display field to be returned to users after a search. For
 the example given above, the display value would be something like:
 Mulaló, Yumbo, Valle del Cauca.

 Itamar, I've actually considered several options. I think a synonym file
 would be too big. I gave you 11 equivalent terms (you might've noticed I
 could have continued to give you around 30 equivalent ways), but I didn't
 mention place names (alone) have their corresponding synonyms, alternate
 names, abbreviations, and vernacular names. There could be 10k different
 places (docs) in the index. :D  Also, taking into account every single case
 into the synonym file seems to be sub-optimal. Really, I intend to
 normalize a large number of ways of expressing place hierarchy into a few
 ways. Otherwise I'd have to build very large lists for each place I add to
 the index, and nothing prevents I'm missing a weird case. BTW, handling
 hierarchy is a must, otherwise result disambiguation would be a nightmare
 for users.

 Thanks for all the discussion, it's certainly valuable to read an
 expert's opinion.

 Back to my very first question, is the pattern replace token filter the
 only way to remove stop words from tokens obtained from a keyword tokenizer?
 Are those regular expressions not very performant?


 2014-08-28 15:49 GMT-05:00 Ivan Brusic i...@brusic.com:

  You mentioned in your original post I'd like to obtain the original
 text without stop words

 The stopword-less phrase will indeed be present in the index after the
 analysis phrase, however, when you ask for this content back as a result of
 a query, the original text will be returned. What is indexed is not
 necessarily what is stored/returned.

 Cheers,

 Ivan


 On Thu, Aug 28, 2014 at 12:30 PM, Germán Carrillo 
 carrillo.ger...@gmail.com wrote:

 Thanks Ivan,

 do you mean what I obtain from a request such as

 curl -XGET
 'localhost:9200/_analyze?tokenizer=keywordfilters=lowercase,my_ascii_folding,my_stopwords'
 -d 'El corregimiento de Mulaló, jurisdicción del municipio de Yumbo
 (Valle del Cauca)'

 is not what will be present in the index after the analysis process? If
 so, how could I check whether the stop words filter is being (will be)
 applied to a sample phrase?


 2014-08-28 14:03 GMT-05:00 Ivan Brusic i...@brusic.com:

  Also note that the content returned will still contain the stop
 words. Only the inverted index will contain the stopword-less content.

 --
 Ivan


 On Thu, Aug 28, 2014 at 11:55 AM, Itamar Syn-Hershko 
 ita...@code972.com wrote:

 What would be the usecase for such a process (removing stop words
 without tokenization)?

 This may be a good read btw:
 http://www.elasticsearch.org/blog/stop-stopping-stop-words-a-look-at-common-terms-query/

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Aug 28, 2014 at 9:48 PM, German Carrillo 
 carrillo.ger...@gmail.com wrote:

 Hi all,


 I'm looking for a way to remove stop words from tokens returned by a
 keyword tokenizer, i.e., I'd like to obtain the original text without 
 stop
 words after the analysis process.

 Sample data looks like: El corregimiento de
 Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)
 After the lowercase token filter:   el corregimiento de
 mulaló, jurisdicción del municipio de yumbo (valle del cauca)
 After the ascii folding token filter:el corregimiento de
 mulalo, jurisdiccion del municipio de yumbo (valle del cauca)
 After removing stop words:   corregimiento mulalo,
 municipio yumbo (valle cauca)

 The stop words (currently) are:  [la, el, de, del,
 los, las, jurisdiccion]

 Is the pattern replace token filter the only (or best) way to go for
 such a task?

 I'd really like to avoid writing custom regular expressions rather
 than specifying a stop words list, which I know would work perfectly 
 fine
 for other tokenizers.


 Regards,

 Germán

 --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from 

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread joergpra...@gmail.com
Do you want to build from source? Or do you want to install a fresh binary?

At jenkins.elasticsearch.org I can not find any snapshot builds but it may
be just me.

It would be a nice add-on to provide snapshot builds for users that eagerly
await bug fixes or take a ride on the bleeding edge before the next release
arrives, without release notes etc.

Jörg


On Fri, Aug 29, 2014 at 4:29 PM, tony.apo...@iqor.com wrote:

 Thanks again and sorry to bother you guys but I'm new to Github and don't
 know what do do from here.  Can you point me to the right place where I can
 take the next step to put this patch on my server?  I only know how to
 untar the tarball I downloaded from the main ES page.

 Thanks.
 Tony


 On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team who did not
 hesitate to test the fix immediately and replaced it with a better working
 solution, since the lzf-compress software is having weaknesses regarding
 threadsafety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com wrote:

 Amazing job. Great work.

 --
 Ivan


 On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I fixed the issue by setting the safe LZF encoder in LZFCompressor and
 opened a pull request

 https://github.com/elasticsearch/elasticsearch/pull/7466

 Jörg


 On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Still broken with lzf-compress 1.0.3

 https://gist.github.com/jprante/d2d829b497db4963aea5

 Jörg


 On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Thanks for the logstash mapping command. I can reproduce it now.

 It's the LZF encoder that bails out at org.elasticsearch.common.
 compress.lzf.impl.UnsafeChunkEncoderBE._getInt

 which uses in turn sun.misc.Unsafe.getInt

 I have created a gist of the JVM crash file at

 https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

 There has been a fix in LZF lately https://github.com/
 ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

 for version 1.0.3 which has been released recently.

 I will build a snapshot ES version with LZF 1.0.3 and see if this
 works...

 Jörg



 On Mon, Aug 25, 2014 at 11:30 PM, tony@iqor.com wrote:

 I captured a WireShark trace of the interaction between ES and
 Logstash 1.4.1.  The error occurs even before my data is sent.  Can 
 you try
 to reproduce it on your testbed with this message I captured?

 curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

 Contests of file 'y:
 {  template : logstash-*,  settings : {
  index.refresh_interval : 5s  },  mappings : {_default_ : {
   _all : {enabled : true},   dynamic_templates : [ {
 string_fields : {   match : *,   
 match_mapping_type
 : string,   mapping : { type : string, 
 index
 : analyzed, omit_norms : true,   fields : {
   raw : {type: string, index : not_analyzed, ignore_above :
 256}   }   } }   } ],   
 properties :
 { @version: { type: string, index: not_analyzed },
   geoip  : {   type : object, dynamic: 
 true,
   path: full, properties : {
 location : { type : geo_point } } }   }  
   }
  }}



 On Monday, August 25, 2014 3:53:18 PM UTC-4, tony@iqor.com
 wrote:

 I have no plugins installed (yet) and only changed
 es.logger.level to DEBUG in logging.yml.

 elasticsearch.yml:
 cluster.name: es-AMS1Cluster
 node.name: KYLIE1
 node.rack: amssc2client02
 path.data: /export/home/apontet/elasticsearch/data
 path.work: /export/home/apontet/elasticsearch/work
 path.logs: /export/home/apontet/elasticsearch/logs
 network.host:    = sanitized line; file contains
 actual server IP
 discovery.zen.ping.multicast.enabled: false
 discovery.zen.ping.unicast.hosts: [s1, s2, s3, s5 , s6,
 s7]   = Also sanitized

 Thanks,
 Tony




 On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

 I tested a simple Hello World document on Elasticsearch 1.3.2
 with Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default
 settings.

 No issues.

 So I would like to know more about the settings in
 elasticsearch.yml, the mappings, and the installed plugins.

 Jörg


 On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I have some Solaris 10 Sparc V440/V445 servers available and can
 try to reproduce over the weekend.

 Jörg


 On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir 
 rober...@elasticsearch.com wrote:

 How big is it? Maybe i can have it anyway? I pulled two ancient
 ultrasparcs out of my closet to try to debug your issue, but 
 unfortunately
 they are a pita to work with (dead nvram battery on both, zeroed 
 mac
 address, etc.) Id still love to get to the bottom of 

Re: which class file trigger writing of segments.gen / segments_1

2014-08-29 Thread Jason Wee
Thanks Jörg,

read this link
https://lucene.apache.org/core/4_8_1/core/org/apache/lucene/index/SegmentInfos.html
, very informative.

Found a few spots that call the class SegmentInfos, below are them.

https://github.com/elasticsearch/elasticsearch/blob/v1.2.1/src/main/java/org/elasticsearch/index/gateway/local/LocalIndexShardGateway.java
https://github.com/elasticsearch/elasticsearch/blob/v1.2.1/src/main/java/org/elasticsearch/common/lucene/Lucene.java
https://github.com/elasticsearch/elasticsearch/blob/v1.2.1/src/main/java/org/elasticsearch/index/engine/internal/InternalEngine.java
https://github.com/elasticsearch/elasticsearch/blob/v1.2.1/src/main/java/org/elasticsearch/index/merge/policy/ElasticsearchMergePolicy.java
https://github.com/elasticsearch/elasticsearch/blob/v1.2.1/src/main/java/org/elasticsearch/index/snapshots/blobstore/BlobStoreIndexShardRepository.java

I understand that both segments files are written by Lucene, but do you
know which Elasticsearch class eventually triggers the writing of the
segments files during indexing?

/Jason


On Fri, Aug 29, 2014 at 8:49 PM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 This is Lucene, when indexing starts. Look at the SegmentInfos class
 https://lucene.apache.org/core/4_9_0/core/org/apache/lucene/index/SegmentInfos.html

 Jörg


 On Fri, Aug 29, 2014 at 2:38 PM, Jason Wee peich...@gmail.com wrote:

 Hello people,

 Anybody know which class/component in elastic search trigger writing
 of segments.gen and segments_1? I'm currently using elastic search version
 1.2.1. It would be great if you can provide link pin point which line in
 the class does that.

 Thank you.

 /Jason

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/bdfc3860-cb26-4f21-9597-4f500eb950e2%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/bdfc3860-cb26-4f21-9597-4f500eb950e2%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoG%3Ddauci86-7Y-RaN%2BJW94kqXU%3DwTA3kgxLO5Mj%3DLL0aQ%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoG%3Ddauci86-7Y-RaN%2BJW94kqXU%3DwTA3kgxLO5Mj%3DLL0aQ%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: Not able to fulltext index Microsoft Office documents - PDF works fine

2014-08-29 Thread feenz
Hi David, 
I am currently using elasticsearch-1.3.1. Will mapper-attachments-2.3.2
be compatible with my version of ES, or will I have to update?

Thanks,

- Kyle



--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Not-able-to-fulltext-index-Microsoft-Office-documents-PDF-works-fine-tp4062325p4062665.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.



Re: Not able to fulltext index Microsoft Office documents - PDF works fine

2014-08-29 Thread David Pilato
It will work with 1.3.1.
You should update to 1.3.2 though because we fixed some issues in this version.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 Le 29 août 2014 à 16:07, feenz kfeeney5...@gmail.com a écrit :
 
 Hi David, 
 I am currently using elasticsearch-1.3.1. Will the mapper-attchements-2.3.2
 be compatible with my version of ES or will have have to update?
 
 Thanks,
 
 - Kyle
 
 
 
 --
 View this message in context: 
 http://elasticsearch-users.115913.n3.nabble.com/Not-able-to-fulltext-index-Microsoft-Office-documents-PDF-works-fine-tp4062325p4062665.html
 Sent from the ElasticSearch Users mailing list archive at Nabble.com.
 
 -- 
 You received this message because you are subscribed to the Google Groups 
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an 
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/1409321264321-4062665.post%40n3.nabble.com.
 For more options, visit https://groups.google.com/d/optout.



Multi-field collapsing

2014-08-29 Thread Brian Hudson
I have a use case which requires collapsing on multiple fields.

As a simple example, assume I have some movie documents indexed with the 
fields Director, Actor, Title, and Release Date. I want to be able to 
collapse on Director and Actor, getting the most recent movie (as indicated 
by Release Date).

I think the new top hits aggregation gets me most of what I need. I can 
create a terms aggregation on Director, with a sub terms aggregation on 
Actor, and add a top hits aggregation to that (size 1). Would this be the 
proper approach? By traversing the aggregations I can get all of the hits 
that I want; however, I can't have Elasticsearch sort or page them.

It's almost like I'd need a hitCollector aggregation which would collect 
all search hits generated by its sub-aggregations and allow me to specify 
sort and paging information at that level. Thoughts?
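
The bucket structure described above can be sketched client-side. A rough Python simulation (field names are taken from the example; this is not an Elasticsearch API call): group hits by (director, actor) and keep the newest per group, which is what the terms/sub-terms/top_hits(size 1, sorted by release date) tree returns, then do the cross-bucket sort and paging that the aggregation framework itself cannot.

```python
from itertools import groupby

movies = [
    {"director": "D1", "actor": "A1", "title": "Old",   "release": "2001-01-01"},
    {"director": "D1", "actor": "A1", "title": "New",   "release": "2010-01-01"},
    {"director": "D1", "actor": "A2", "title": "Solo",  "release": "2005-06-01"},
    {"director": "D2", "actor": "A1", "title": "Other", "release": "2008-03-01"},
]

def collapse(hits, page=0, size=10):
    # One "bucket" per (director, actor); keep the most recent movie in
    # each, mirroring top_hits with size 1 sorted by release date desc.
    key = lambda h: (h["director"], h["actor"])
    collapsed = [max(g, key=lambda h: h["release"])
                 for _, g in groupby(sorted(hits, key=key), key=key)]
    # The cross-bucket sort and paging step done client-side.
    collapsed.sort(key=lambda h: h["release"], reverse=True)
    return collapsed[page * size:(page + 1) * size]

top = collapse(movies)
print([m["title"] for m in top])  # ['New', 'Other', 'Solo']
```

This makes the gap concrete: everything up to the `collapsed.sort(...)` line maps onto the aggregation tree, while the sort/page step has no server-side equivalent across buckets.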

Brian



Re: Does transport client do scatter gather?

2014-08-29 Thread joergpra...@gmail.com
I'm not exactly sure what you mean by scatter-gather, but yes, both clients
can execute requests on all nodes of the cluster.

Jörg


On Fri, Aug 29, 2014 at 3:43 PM, John Smith java.dev@gmail.com wrote:

 Just as the subject asks or only the node client can do scatter gather?

 Thanks

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: What the heck is this search?? :)

2014-08-29 Thread Chris Neal
Hi Boaz,

Thanks for the reply. :)  It's not a problem per se.  I'm working through
performance/memory issues and turned on the slow log file, and that one
popped up.  It's a problem because it's slow, but it's not causing cluster
stability issues!

It's interesting that you think it is Kibana though.  I removed the Head
plugin for 3 days and didn't see that query logged once, so I was pretty
sure it was the culprit!  Maybe it was just coincidence that whatever in
Kibana was doing it didn't happen then either.  Just my luck. ;)

Thanks again.
Chris


On Thu, Aug 28, 2014 at 3:48 PM, Boaz Leskes b.les...@gmail.com wrote:


 Hi Chris,

 This is actually Kibana. The reason it uses query_string is to allow
 people some kind of syntax in their query with no query parsing on the
 client side. Just a decision which I guess was made long ago to keep things
 simple.

 Is this a problem for you in any way?

 Cheers,
 Boaz

 On Thursday, August 21, 2014 6:37:02 PM UTC+2, Chris Neal wrote:

 Done.  Will report back.

 Thank you!



 On Thu, Aug 21, 2014 at 11:27 AM, Itamar Syn-Hershko ita...@code972.com
 wrote:

 I'm going to bet on Head. Disable it and see what happens.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Thu, Aug 21, 2014 at 7:22 PM, Chris Neal chris.n...@derbysoft.net
 wrote:

 Thanks guys for the thoughts.  Plugins didn't even occur to me, but
 they should have.

 We've got Marvel, Head, and ElasticHQ installed.

 Is there some way to tell where the search is coming from?  Something
 like an HTTP access log or something?

 Thanks again for your time!
 Chris


 On Wed, Aug 20, 2014 at 3:57 PM, Itamar Syn-Hershko ita...@code972.com
  wrote:

 I thought of Kibana because there's a faceting operation on the _type
 field. But I doubt neither Marvel nor Kibana would issue such an awful
 query (notice the fquery bit, too).

 Any part of your system (plugin or other) which might want to look at
 the types of documents added to an ES index?

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Wed, Aug 20, 2014 at 11:53 PM, Ivan Brusic i...@brusic.com wrote:

 Very strange query indeed. Wildcard search filtered by a match_all.
 What?!?

 It is not Elasticsearch, but perhaps some plugin. Itamar mentioned
 Kibana, although you did not mention it in your post. Any other plugins?
 Marvel?

 --
 Ivan


 On Wed, Aug 20, 2014 at 12:43 PM, Itamar Syn-Hershko 
 ita...@code972.com wrote:

 There is no such thing as query internal to ES, if you see this in
 the logs you have a client making it. I would point to a Kibana instance
 but I'm pretty sure Kibana won't use a query_string query like this.

 And yes this is quite an expensive query (and facets) to run on a
 decent sized installation.

 --

 Itamar Syn-Hershko
 http://code972.com | @synhershko https://twitter.com/synhershko
 Freelance Developer  Consultant
 Author of RavenDB in Action http://manning.com/synhershko/


 On Wed, Aug 20, 2014 at 10:14 PM, Chris Neal 
 chris.n...@derbysoft.net wrote:

 Hi guys,

 I'm working through some performance concerns in my cluster, and I
 turned on the slow log feature.  I'm seeing this in the
 index_search_slowlog.log log:

 [2014-08-20 06:37:52,734][INFO ][index.search.slowlog.query]
 [elasticsearch-ip-10-0-0-41] [index-20140731][0] took[6s],
 took_millis[6081], types[], stats[], search_type[QUERY_THEN_FETCH],
 total_shards[86], source[{"facets":{"terms":{"terms":{"field":"_type",
 "size":100,"order":"count","exclude":[]},"facet_filter":{"fquery":{
 "query":{"filtered":{"query":{"bool":{"should":[{"query_string":
 {"query":"*"}}]}},"filter":{"bool":{"must":[{"match_all":{}}]}}}}}}}},
 "size":0}], extra_source[],

 Is that a user generated search, or something internal to ES maybe?
  I can't even tell what it's trying to do.  It seems to hit every one 
 of my
 indexes though, as the same search query is logged 63 times in a one 
 minute
 period.

 Any ideas what this is?  Is it something to be concerned about?

 Thanks for the help!
 Chris

 --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it,
 send an email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/
 CAND3Dpj7BzbaNva9B7JNFOeeaC9SrYWCEnvzTJgx2-AQeT478w%40mail.
 gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAND3Dpj7BzbaNva9B7JNFOeeaC9SrYWCEnvzTJgx2-AQeT478w%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from 

Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread Ivan Brusic
The snapshot repo is still active, but it is a bit behind and does not
include this patch:

https://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch/

-- 
Ivan


On Fri, Aug 29, 2014 at 8:21 AM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Do you want to build from source? Or do you want to install a fresh binary?

 At jenkins.elasticsearch.org I can not find any snapshot builds but it
 may be just me.

 It would be a nice add-on to provide snapshot builds for users that
 eagerly await bug fixes or take a ride on the bleeding edge before the next
 release arrives, without release notes etc.

 Jörg


 On Fri, Aug 29, 2014 at 4:29 PM, tony.apo...@iqor.com wrote:

 Thanks again and sorry to bother you guys but I'm new to Github and don't
 know what do do from here.  Can you point me to the right place where I can
 take the next step to put this patch on my server?  I only know how to
 untar the tarball I downloaded from the main ES page.

 Thanks.
 Tony


 On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team who did not
 hesitate to test the fix immediately and replaced it with a better working
 solution, since the lzf-compress software is having weaknesses regarding
 threadsafety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com wrote:

 Amazing job. Great work.

 --
 Ivan


Re: EL setup for fulltext search

2014-08-29 Thread Ivan Brusic
That output does not look like something generated by the standard
analyzer, since it contains uppercase letters and various non-word
characters such as '='.

Your two analysis requests will differ since the second one contains the
default word_delimiter filter instead of your custom my_word_delimiter.
What you are trying to achieve is somewhat difficult, but you can get there
if you keep on tweaking. :) Try using a pattern tokenizer instead of the
whitespace tokenizer if you want more control over word boundaries.

-- 
Ivan

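One detail worth flagging (my observation, not something confirmed in the thread): Elasticsearch's analysis settings use the singular key `filter`, both for the token-filter list inside a custom analyzer and for the section that defines custom filters, and the analyzer in the template quoted below references `word_delimiter` rather than the custom `my_word_delimiter`. A corrected settings fragment might look like this (index name and filter order kept from the thread; treat it as a sketch, not a confirmed fix):

```shell
# Hypothetical corrected fragment -- note "filter" (singular) in both
# places, and the reference to the custom my_word_delimiter filter:
curl -XPUT 'localhost:9200/bogstash-1' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "msg_excp_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["my_word_delimiter", "lowercase",
                     "asciifolding", "shingle", "standard"]
        }
      },
      "filter": {
        "my_word_delimiter": {
          "type": "word_delimiter",
          "preserve_original": true
        }
      }
    }
  }
}'
```

This would also explain the symptom below: with the misspelled keys ignored, only the whitespace tokenizer runs, which matches the uppercase, '='-containing tokens in the output.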

On Fri, Aug 29, 2014 at 1:48 AM, Marc mn.off...@googlemail.com wrote:

 Hi Ivan,

 thanks again. I have tried so and found a reasonable combination.
 Nevertheless, when I now try to use the analyze api with an index that has
 the said analyzer defined via template it doesn't seem to apply:

 This is the complete template:
 {
 template: bogstash-*,
 settings: {
 index.number_of_replicas: 0,
 analysis: {
 analyzer: {
 msg_excp_analyzer: {
 type: custom,
 tokenizer: whitespace,
 filters: [word_delimiter,
 lowercase,
 asciifolding,
 shingle,
 standard]
 }
 },
 filters: {
 my_word_delimiter: {
 type: word_delimiter,
 preserve_original: true
 },
 my_asciifolding: {
 type: asciifolding,
 preserve_original: true
 }
 }
 }
 },
 mappings: {
 _default_: {
 properties: {
 @excp: {
 type: string,
 index: analyzed,
 analyzer: msg_excp_analyzer
 },
 @msg: {
 type: string,
 index: analyzed,
 analyzer: msg_excp_analyzer
 }
 }
 }
 }
 }
 I create the index bogstash-1.
 Now I test the following:
 curl -XGET
 'localhost:9200/bogstash-1/_analyze?analyzer=msg_excp_analyzer&pretty=1' -d
 'Service=MyMDB.onMessage appId=cs Times=Me:22/Total:22 (updated
 attributes=gps_lng: 183731222/ gps_lat: 289309222/ )'
 and it returns:
 {
   tokens : [ {
 token : Service=MyMDB.onMessage,
 start_offset : 0,
 end_offset : 23,
 type : word,
 position : 1
   }, {
 token : appId=cs,
 start_offset : 24,
 end_offset : 32,
 type : word,
 position : 2
   }, {
 token : Times=Me:22/Total:22,
 start_offset : 33,
 end_offset : 53,
 type : word,
 position : 3
   }, {
 token : (updated,
 start_offset : 54,
 end_offset : 62,
 type : word,
 position : 4
   }, {
 token : attributes=gps_lng:,
 start_offset : 63,
 end_offset : 82,
 type : word,
 position : 5
   }, {
 token : 183731222/,
 start_offset : 83,
 end_offset : 93,
 type : word,
 position : 6
   }, {
 token : gps_lat:,
 start_offset : 94,
 end_offset : 102,
 type : word,
 position : 7
   }, {
 token : 289309222/,
 start_offset : 103,
 end_offset : 113,
 type : word,
 position : 8
   }, {
 token : ),
 start_offset : 114,
 end_offset : 115,
 type : word,
 position : 9
   } ]
 }
 Which is the output of a standard analyzer.
 Giving the tokenizer and filters in the analyze API directly works fine:
 curl -XGET
 'localhost:9200/_analyze?tokenizer=whitespace&filters=lowercase,word_delimiter,shingle,asciifolding,standard&pretty=1'
 -d 'Service=MyMDB.onMessage appId=cs Times=Me:22/Total:22 (updated
 attributes=gps_lng: 183731222/ gps_lat: 289309222/ )'
 This results in:
 {
   tokens : [ {
 token : service,
 start_offset : 0,
 end_offset : 7,
 type : word,
 position : 1
   }, {
 token : service mymdb,
 start_offset : 0,
 end_offset : 13,
 type : shingle,
 position : 1
   }, {
 token : mymdb,
 start_offset : 8,
 end_offset : 13,
 type : word,
 position : 2
   }, {
 token : mymdb onmessage,
 start_offset : 8,
 end_offset : 23,
 type : shingle,
 position : 2
   }, {
 token : onmessage,
 start_offset : 14,
 end_offset : 23,
 type : word,
 position : 3
   }, {
 token : onmessage appid,
 start_offset : 14,
 end_offset : 29,
 type : shingle,
 position : 3
   }, {
 token : appid,
 start_offset : 24,
 end_offset : 29,
 type : word,
 position : 4
   }, {
 token : appid cs,
 start_offset : 24,
 end_offset : 32,
 type : shingle,
 position : 4
   }, {
 token : cs,
 start_offset : 30,
 end_offset : 32,
 type : word,
 position : 5
   }, {
 token : cs times,
 start_offset : 30,
 end_offset : 38,
 type : shingle,
 

How big can/should you scale Elasticsearch

2014-08-29 Thread Rob Blackin
We are trying to implement a 5 TB, 10 Billion item Elasticsearch cluster. 
The key is an integer and the item data is fairly small.

We seem to run into issues around loading: indexing seems to slow down as the
index gets bigger.

We are doing this on EC2 i2.xlarge nodes.

How many documents/TB do you think we can load per node max?

So if we can do 2 Billion each then we need 5 nodes. We are trying to size 
it.  

Any advice is welcome. Even if it is that this is not a good thing to do :)

thanks

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3faa4de9-0a27-49dc-8f68-ceebd5569da9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread tony . aponte
The easiest for me is to install fresh binaries but I'm not shy about 
learning about Maven while I build it from source.  

Thanks
Tony

On Friday, August 29, 2014 11:21:34 AM UTC-4, Jörg Prante wrote:

 Do you want to build from source? Or do you want to install a fresh binary?

 At jenkins.elasticsearch.org I can not find any snapshot builds but it 
 may be just me.

 It would be a nice add-on to provide snapshot builds for users that 
 eagerly await bug fixes or take a ride on the bleeding edge before the next 
 release arrives, without release notes etc.

 Jörg


 On Fri, Aug 29, 2014 at 4:29 PM, tony@iqor.com javascript: wrote:

 Thanks again and sorry to bother you guys but I'm new to Github and don't 
 know what do do from here.  Can you point me to the right place where I can 
 take the next step to put this patch on my server?  I only know how to 
 untar the tarball I downloaded from the main ES page.

 Thanks.
 Tony


 On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team who did not 
 hesitate to test the fix immediately and replaced it with a better working 
 solution, since the lzf-compress software has weaknesses regarding 
 thread safety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com wrote:

 Amazing job. Great work.

 -- 
 Ivan


 On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I fixed the issue by setting the safe LZF encoder in LZFCompressor 
 and opened a pull request 

 https://github.com/elasticsearch/elasticsearch/pull/7466

 Jörg


 On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Still broken with lzf-compress 1.0.3

 https://gist.github.com/jprante/d2d829b497db4963aea5

 Jörg


 On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Thanks for the logstash mapping command. I can reproduce it now.

 It's the LZF encoder that bails out at org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

 which uses in turn sun.misc.Unsafe.getInt

 I have created a gist of the JVM crash file at 

 https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b
  
 There has been a fix in LZF lately: https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

 for version 1.0.3 which has been released recently.

 I will build a snapshot ES version with LZF 1.0.3 and see if this 
 works...

 Jörg



 On Mon, Aug 25, 2014 at 11:30 PM, tony@iqor.com wrote:

 I captured a WireShark trace of the interaction between ES and 
 Logstash 1.4.1.  The error occurs even before my data is sent.  Can 
 you try 
 to reproduce it on your testbed with this message I captured?

 curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

 Contents of file 'y':
 {
   "template": "logstash-*",
   "settings": {
     "index.refresh_interval": "5s"
   },
   "mappings": {
     "_default_": {
       "_all": {"enabled": true},
       "dynamic_templates": [{
         "string_fields": {
           "match": "*",
           "match_mapping_type": "string",
           "mapping": {
             "type": "string",
             "index": "analyzed",
             "omit_norms": true,
             "fields": {
               "raw": {"type": "string", "index": "not_analyzed", "ignore_above": 256}
             }
           }
         }
       }],
       "properties": {
         "@version": {"type": "string", "index": "not_analyzed"},
         "geoip": {
           "type": "object",
           "dynamic": true,
           "path": "full",
           "properties": {
             "location": {"type": "geo_point"}
           }
         }
       }
     }
   }
 }



 On Monday, August 25, 2014 3:53:18 PM UTC-4, tony@iqor.com 
 wrote:

 I have no plugins installed (yet) and only changed 
 es.logger.level to DEBUG in logging.yml. 

 elasticsearch.yml:
 cluster.name: es-AMS1Cluster
 node.name: KYLIE1
 node.rack: amssc2client02
 path.data: /export/home/apontet/elasticsearch/data
 path.work: /export/home/apontet/elasticsearch/work
 path.logs: /export/home/apontet/elasticsearch/logs
 network.host:    = sanitized line; file contains 
 actual server IP 
 discovery.zen.ping.multicast.enabled: false
 discovery.zen.ping.unicast.hosts: [s1, s2, s3, s5 , 
 s6, s7]   = Also sanitized

 Thanks,
  Tony




 On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

 I tested a simple Hello World document on Elasticsearch 1.3.2 
 with Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, 
 default 
 settings.

 No issues.

 So I would like to know more about the settings in 
 elasticsearch.yml, the mappings, and the installed plugins.

 Jörg


 On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I have some Solaris 10 Sparc V440/V445 servers available and 
 can try to reproduce over the weekend.

 Jörg


 On Sat, Aug 23, 2014 at 

Re: Explicitly Copying Replica Shards That Fail to Start

2014-08-29 Thread Ivan Brusic
I used to apply that trick all the time with older versions of
Elasticsearch!  Thankfully I have not needed it in years.

-- 
Ivan

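The drop-and-re-add trick being referenced works by changing the replica count through the index settings API; a sketch (the index name is a placeholder):

```shell
# Drop replicas -- the bad replica shards are deleted cluster-wide:
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'

# ...then raise the count again; replicas are rebuilt from the primaries:
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'
```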

On Thu, Aug 28, 2014 at 3:53 PM, Mark Walkom ma...@campaignmonitor.com
wrote:

 Yep, the easiest way is to drop the replica and then add it back and see
 how you go.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 29 August 2014 08:40, David Kleiner david.klei...@gmail.com wrote:

 Greetings,

 I am still having a problem with recovery of 5 replica shards in 2
 indices of mine, 3-way cluster.  The replica shards fail to initialize and
 are jumping around two secondary nodes.  The primary shards are fine.

 What is my path to recovery?  Is copying master shard to secondary nodes
 a correct way?  I tried issuing routing commands to cancel
 recovery/allocation, it helped with some secondary shards but not with the
 5 in question.

 I also tried dumping index with failing secondary shards but two nodes
 crashed (well, lost connection to cluster) so dump failed.

 Would setting replica # to 0, copying masters to 2 nodes and setting
 replica #  to 1 a viable alternative?

 Thank you,

 David

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/8e7c4f11-2790-49d6-8c65-87e9aa05aa3b%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/8e7c4f11-2790-49d6-8c65-87e9aa05aa3b%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAEM624ZpLMWPg95joA023WT3hS7AsS1x4%3DN4E5UUWuyt_LAWtg%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAEM624ZpLMWPg95joA023WT3hS7AsS1x4%3DN4E5UUWuyt_LAWtg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




Re: Does transport client do scatter gather?

2014-08-29 Thread John Smith
According to this...

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

Non-data nodes (I assume a node client is equivalent to a non-data node) are
capable of scatter/gather searching. I was wondering whether the transport
client can do this as well?

2 - Does the transport client support routing if you specify a routing field? Or
does it always round-robin regardless?
On Aug 29, 2014 12:09 PM, joergpra...@gmail.com joergpra...@gmail.com
wrote:

 I'm not exactly sure what you mean by scatter-gather, but yes, both
 clients can execute requests on all nodes of the cluster.

 Jörg


 On Fri, Aug 29, 2014 at 3:43 PM, John Smith java.dev@gmail.com
 wrote:

 Just as the subject asks or only the node client can do scatter gather?

 Thanks

 --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/70zTmEuyWHE/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHdeW329GZgvOm3NG0NAuNvEUJftkSMzKTyCAzM1%2B8bFg%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHdeW329GZgvOm3NG0NAuNvEUJftkSMzKTyCAzM1%2B8bFg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: How big can/should you scale Elasticsearch

2014-08-29 Thread Arie
When you look at the guys @ found (https://www.found.no/pricing/), the
data on one ES server is about 8 times memory if it is to run smoothly,
though I do not know how reliable that figure is. If you have a lot of ES
nodes, then consider a dedicated master node without data; it's a best
practice I have read somewhere.

16 GB of memory equals 128 GB of data.
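Taken at face value, that 8x rule makes node-count estimation simple arithmetic. The sketch below applies it to the 5 TB figure from this thread; the RAM number is the i2.xlarge's roughly 30 GB, rounded down, and the rule itself is only a rough heuristic:

```shell
# Back-of-envelope node count from the "data = 8 x RAM" rule of thumb.
ram_gb=30                                  # approximate RAM of an EC2 i2.xlarge
data_per_node_gb=$(( ram_gb * 8 ))         # data a node can hold per the rule
total_data_gb=$(( 5 * 1024 ))              # 5 TB total, from this thread
nodes=$(( (total_data_gb + data_per_node_gb - 1) / data_per_node_gb ))  # ceiling
echo "$data_per_node_gb GB per node -> $nodes nodes"
```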

On Friday, August 29, 2014 7:27:28 PM UTC+2, Rob Blackin wrote:

 We are trying to implement a 5 TB, 10 Billion item Elasticsearch cluster. 
 The key is an integer and the item data is fairly small.

 We seem to run into issues around loading. Seems to slow down as the index 
 gets bigger.

 We are doing this on EC2 i2.xlarge nodes.

 How many documents/TB do you think we can load per node max?

 So if we can do 2 Billion each then we need 5 nodes. We are trying to size 
 it.  

 Any advice is welcome. Even if it is that this is not a good thing to do :)

 thanks




Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread joergpra...@gmail.com
Quick guide:

- install Java 7 (or Java 8), Apache Maven, and git, also ensure internet
connection to the Maven central repo

- clone 1.3 branch only (you could also clone the whole repo and switch to
the branch): git clone https://github.com/elasticsearch/elasticsearch.git
--branch 1.3 --single-branch es-1.3

- enter folder es-1.3

- start build: mvn -DskipTests clean install

- wait a few minutes while Maven loads all dependent artifacts and compiles
~3000 source files

The result will be a complete build of all binaries. In the 'target'
folder, after the Build complete message of Maven, you will see a file
elasticsearch-VERSION.jar

VERSION is something like 1.3.3-SNAPSHOT. You can copy this file into
your existing Elasticsearch 1.3.x installation lib folder. Do not forget
to adjust bin/elasticsearch.in.sh to point to the new
elasticsearch-VERSION.jar file in the classpath configuration (at the top
lines). This must be the first jar on the classpath so it can patch Lucene
jars.

If you already have data in the existing Elasticsearch installation, I
recommend backing up everything before starting the new snapshot build - no
guarantees, use at your own risk.

Jörg
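The classpath adjustment might look roughly like this; the exact variable names and paths in bin/elasticsearch.in.sh vary by release, so treat it as an illustrative sketch only:

```shell
# Illustrative only: make the snapshot jar the first classpath entry so
# its classes take precedence over the bundled jars (as noted above, it
# must come first so it can patch the Lucene jars).
ES_HOME=/opt/elasticsearch-1.3.2           # assumed install location
ES_CLASSPATH="$ES_HOME/lib/elasticsearch-1.3.3-SNAPSHOT.jar:$ES_HOME/lib/*:$ES_HOME/lib/sigar/*"
export ES_CLASSPATH
```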




On Fri, Aug 29, 2014 at 7:36 PM, tony.apo...@iqor.com wrote:

 The easiest for me is to install fresh binaries but I'm not shy about
 learning about Maven while I build it from source.

 Thanks
 Tony


 On Friday, August 29, 2014 11:21:34 AM UTC-4, Jörg Prante wrote:

 Do you want to build from source? Or do you want to install a fresh
 binary?

 At jenkins.elasticsearch.org I can not find any snapshot builds but it
 may be just me.

 It would be a nice add-on to provide snapshot builds for users that
 eagerly await bug fixes or take a ride on the bleeding edge before the next
 release arrives, without release notes etc.

 Jörg


 On Fri, Aug 29, 2014 at 4:29 PM, tony@iqor.com wrote:

 Thanks again and sorry to bother you guys but I'm new to Github and
 don't know what do do from here.  Can you point me to the right place where
 I can take the next step to put this patch on my server?  I only know how
 to untar the tarball I downloaded from the main ES page.

 Thanks.
 Tony


 On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team who did not
 hesitate to test the fix immediately and replaced it with a better working
 solution, since the lzf-compress software is having weaknesses regarding
 threadsafety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com wrote:

 Amazing job. Great work.

 --
 Ivan


 On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I fixed the issue by setting the safe LZF encoder in LZFCompressor
 and opened a pull request

 https://github.com/elasticsearch/elasticsearch/pull/7466

 Jörg


 On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Still broken with lzf-compress 1.0.3

 https://gist.github.com/jprante/d2d829b497db4963aea5

 Jörg


 On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Thanks for the logstash mapping command. I can reproduce it now.

 It's the LZF encoder that bails out at org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

 which uses in turn sun.misc.Unsafe.getInt

 I have created a gist of the JVM crash file at

 https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

 There has been a fix in LZF lately: https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

 for version 1.0.3 which has been released recently.

 I will build a snapshot ES version with LZF 1.0.3 and see if this
 works...

 Jörg



 On Mon, Aug 25, 2014 at 11:30 PM, tony@iqor.com wrote:

 I captured a WireShark trace of the interaction between ES and
 Logstash 1.4.1.  The error occurs even before my data is sent.  Can 
 you try
 to reproduce it on your testbed with this message I captured?

 curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d
 @y

 Contests of file 'y:
 {  template : logstash-*,  settings : {
  index.refresh_interval : 5s  },  mappings : {_default_ 
 : {
   _all : {enabled : true},   dynamic_templates : [ {
 string_fields : {   match : *,   
 match_mapping_type
 : string,   mapping : { type : string, 
 index
 : analyzed, omit_norms : true,   fields : {
   raw : {type: string, index : not_analyzed, 
 ignore_above :
 256}   }   } }   } ],   
 properties :
 { @version: { type: string, index: not_analyzed },
   geoip  : {   type : object, dynamic: 
 true,
   path: full, properties : {
 location : { type : geo_point } } }   
 }}
  }}



 On Monday, August 25, 2014 3:53:18 PM UTC-4, tony@iqor.com
 wrote:

 I have no plugins 

Re: How big can/should you scale Elasticsearch

2014-08-29 Thread Nikolas Everett
On Fri, Aug 29, 2014 at 1:27 PM, Rob Blackin robblac...@gmail.com wrote:

 We are trying to implement a 5 TB, 10 Billion item Elasticsearch cluster.
 The key is an integer and the item data is fairly small.


We're running around 5.5TB right now without a problem.  The biggest
annoyance is that rolling restarts take time proportional to how much data
you have.

We have much larger documents than you have, so we only store 181 million or
so.  Our documents are interactively maintained - a consistent portion of
them are updated daily with some creates and a few rare deletes.

You might want to think about how you do sharding - look into routing to
see if you can get away with oversubscribing on shards.  You might also
look into using multiple indexes as well.  Shay gave a talk on how you
could subdivide one large set of data into multiple indexes to help
things.  One 5TB index would be difficult to maintain.  As are any shards
that are more then, say, 20GB.  Just shuffling those shards from system to
system for rebalancing gets expensive.  Merges on those shards have a
higher upper bound on disk io and cache thrash.
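As a back-of-envelope illustration of keeping shards near the ~20 GB mark mentioned above (the 5 TB figure is from this thread; the arithmetic is mine, not Nik's):

```shell
# How many primary shards does 5 TB need if each should stay near 20 GB?
total_gb=$(( 5 * 1024 ))
shard_gb=20
shards=$(( total_gb / shard_gb ))
echo "$shards shards"   # replicas would add further copies on top of this
```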


 We seem to run into issues around loading. Seems to slow down as the index
 gets bigger.


Check on your merge rate.  This is old but it'll give you some idea of what
is going on:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
You can tune this a bit - especially if your data comes in spurts.



 We are doing this on EC2 i2.xlarge nodes.

 How many documents/TB do you think we can load per node max?

 So if we can do 2 Billion each then we need 5 nodes. We are trying to size
 it.


I can't speak to Amazon because we use physical machines.  We use 18
machines with two reasonably nice Intel SSDs per machine, 96GB of RAM, and
pretty sizeable CPUs, and it isn't really enough to handle the query load we
want to throw at it.  I imagine the shape of your load is going to be
different, though.

Nik



Re: Does transport client do scatter gather?

2014-08-29 Thread joergpra...@gmail.com
A node client is not just a non-data node, although it is very close. The ES page
describes a proxy node scenario. Example: you have many HTTP clients and
they search for large result sets. This is often a challenge because of the
high resource contention. One or more data-less proxy nodes can help in
gathering these result sets, leaving the data nodes alone, which just do
the scatter part of the search.

This is similar to how a TransportClient works for a JVM-only client.
TransportClient is also a proxy node that gathers result sets. But with
some subtle difference, you can not connect HTTP clients to a
TransportClient, and because the TransportClient is not a cluster member,
it uses the configured connected nodes as gather nodes within the cluster.
Because there are two gather nodes, this is called an extra hop in
comparison to a Java NodeClient. But, if you add the HTTP client request to
the request scenario mentioned before, there is no extra hop, only an extra
JVM. So the best place for TransportClient is on a remote host.

In Java, NodeClient and TransportClient share the full functionality of ES:
routing requests, round-robin load balancing, etc. For cluster-specific,
server-only services like listening to cluster state or snapshot/restore,
a TransportClient is not feasible; it either can't do it or must ask a node
in the cluster to pass the information along.

Jörg



On Fri, Aug 29, 2014 at 8:54 PM, John Smith java.dev@gmail.com wrote:

 According to this...


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

 Non data nodes (I assume Node client is equivalent of a non data node) is
 capable of scatter/gather searching. Was wondering if transport can do this
 also?

 2- Does transport support routing if you specify routing field? Or does it
 always round robin regardless?
 On Aug 29, 2014 12:09 PM, joergpra...@gmail.com joergpra...@gmail.com
 wrote:

 I'm not exactly sure what you mean by scatter-gather, but yes, both
 clients can execute requests on all nodes of the cluster.

 Jörg


 On Fri, Aug 29, 2014 at 3:43 PM, John Smith java.dev@gmail.com
 wrote:

 Just as the subject asks or only the node client can do scatter gather?

 Thanks

 --
 You received this message because you are subscribed to the Google
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/b5274032-c142-46df-91e2-f451ab9c069e%40googlegroups.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.


  --
 You received this message because you are subscribed to a topic in the
 Google Groups elasticsearch group.
 To unsubscribe from this topic, visit
 https://groups.google.com/d/topic/elasticsearch/70zTmEuyWHE/unsubscribe.
 To unsubscribe from this group and all its topics, send an email to
 elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHdeW329GZgvOm3NG0NAuNvEUJftkSMzKTyCAzM1%2B8bFg%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHdeW329GZgvOm3NG0NAuNvEUJftkSMzKTyCAzM1%2B8bFg%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAMiEuFSuCrwaF6qoVf3-rsA_NjQKrJjFue62kjVvoiUH8A2rJA%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAMiEuFSuCrwaF6qoVf3-rsA_NjQKrJjFue62kjVvoiUH8A2rJA%40mail.gmail.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.




Re: JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

2014-08-29 Thread tony . aponte
Thank you very much.
Tony

On Friday, August 29, 2014 3:27:33 PM UTC-4, Jörg Prante wrote:

 Quick guide:

 - install Java 7 (or Java 8), Apache Maven, and git, also ensure internet 
 connection to the Maven central repo 

 - clone 1.3 branch only (you could also clone the whole repo and switch to 
 the branch): git clone https://github.com/elasticsearch/elasticsearch.git 
 --branch 1.3 --single-branch es-1.3

 - enter folder es-1.3

 - start build: mvn -DskipTests clean install

 - wait a few minutes while Maven loads all dependent artifacts and 
 compiles ~3000 source files

 The result will be a complete build of all binaries. In the 'target' 
 folder, after the Build complete message of Maven, you will see a file 
 elasticsearch-VERSION.jar

 VERSION is something like 1.3.3-SNAPSHOT. You can copy this file into 
 your existing Elasticsearch 1.3.x installation lib folder. Do not forget 
 to adjust bin/elasticsearch.in.sh to point to the new 
 elasticsearch-VERSION.jar file in the classpath configuration (at the top 
 lines). This must be the first jar on the classpath so it can patch Lucene 
 jars.

 If you have already data in the existing Elasticsearch I recommend to 
 backup everything before starting the new snapshot build - no guarantees, 
 use at your own risk.

 Jörg




 On Fri, Aug 29, 2014 at 7:36 PM, tony@iqor.com javascript: wrote:

 The easiest for me is to install fresh binaries but I'm not shy about 
 learning about Maven while I build it from source.  

 Thanks
 Tony


 On Friday, August 29, 2014 11:21:34 AM UTC-4, Jörg Prante wrote:

 Do you want to build from source? Or do you want to install a fresh 
 binary?

 At jenkins.elasticsearch.org I cannot find any snapshot builds, but it 
 may be just me.

 It would be a nice add-on to provide snapshot builds for users that 
 eagerly await bug fixes or take a ride on the bleeding edge before the next 
 release arrives, without release notes etc.

 Jörg


 On Fri, Aug 29, 2014 at 4:29 PM, tony@iqor.com wrote:

 Thanks again, and sorry to bother you guys, but I'm new to GitHub and 
 don't know what to do from here.  Can you point me to the right place 
 where I can take the next step to put this patch on my server?  I only 
 know how to untar the tarball I downloaded from the main ES page.

 Thanks.
 Tony


 On Wednesday, August 27, 2014 1:35:06 PM UTC-4, tony@iqor.com 
 wrote:

 Kudos!

 Tony

 On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

 All praise should go to the fantastic Elasticsearch team, who did not 
 hesitate to test the fix immediately and replaced it with a better-working 
 solution, since the lzf-compress software has weaknesses regarding 
 thread safety.

 Jörg


 On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic iv...@brusic.com 
 wrote:

 Amazing job. Great work.

 -- 
 Ivan


 On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 I fixed the issue by setting the safe LZF encoder in LZFCompressor 
 and opened a pull request 

 https://github.com/elasticsearch/elasticsearch/pull/7466

 Jörg


 On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Still broken with lzf-compress 1.0.3

 https://gist.github.com/jprante/d2d829b497db4963aea5

 Jörg


 On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com 
 joerg...@gmail.com wrote:

 Thanks for the logstash mapping command. I can reproduce it now.

 It's the LZF encoder that bails out at 
 org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt, 
 which in turn uses sun.misc.Unsafe.getInt.

 I have created a gist of the JVM crash file at 

 https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b
  
 There has been a fix in LZF lately: 
 https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7 
 for version 1.0.3, which has been released recently.

 I will build a snapshot ES version with LZF 1.0.3 and see if this 
 works...

 Jörg
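
The failure mode Jörg describes above can be sketched in plain Java. This is an illustrative fragment, not the actual lzf-compress code: a byte-wise big-endian read (the approach of the "safe" encoder variant) works at any offset, while a raw word load such as sun.misc.Unsafe.getInt can crash the JVM on strict-alignment architectures like SPARC when the source address is not 4-byte aligned:

```java
// Sketch only; illustrative, not the actual lzf-compress fix.
// Reading a 32-bit big-endian int one byte at a time is alignment-safe;
// sun.misc.Unsafe.getInt compiles to a raw word load, which raises SIGBUS
// on SPARC when the source address is not 4-byte aligned.
public class SafeIntRead {
    static int getIntBE(byte[] buf, int off) {
        return ((buf[off] & 0xff) << 24)
             | ((buf[off + 1] & 0xff) << 16)
             | ((buf[off + 2] & 0xff) << 8)
             |  (buf[off + 3] & 0xff);
    }

    public static void main(String[] args) {
        byte[] b = {0x00, 0x12, 0x34, 0x56, 0x78};
        // Offset 1 is not 4-byte aligned, but the byte-wise read is fine.
        System.out.println(Integer.toHexString(getIntBE(b, 1))); // prints 12345678
    }
}
```

This is essentially the trade-off between the unsafe and safe chunk encoders: the unsafe variant is faster on x86, which tolerates unaligned loads, but it is not portable to strict-alignment architectures.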



 On Mon, Aug 25, 2014 at 11:30 PM, tony@iqor.com wrote:

 I captured a Wireshark trace of the interaction between ES and 
 Logstash 1.4.1.  The error occurs even before my data is sent.  Can you 
 try to reproduce it on your testbed with this message I captured?

 curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

  Contents of file 'y':
  {
    "template" : "logstash-*",
    "settings" : {
      "index.refresh_interval" : "5s"
    },
    "mappings" : {
      "_default_" : {
        "_all" : { "enabled" : true },
        "dynamic_templates" : [ {
          "string_fields" : {
            "match" : "*",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "string", "index" : "analyzed", "omit_norms" : true,
              "fields" : {
                "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
              }
            }
          }
        } ],
        "properties" : {
          "@version" : { "type" : "string", "index" : "not_analyzed" },
          "geoip" : { "type" : "object", "dynamic" : true,
  

Re: Replica assignement on the same host

2014-08-29 Thread Ivan Brusic
It's Friday. Can't read. Nevermind. :)


On Fri, Aug 29, 2014 at 5:06 PM, Mark Walkom ma...@campaignmonitor.com
wrote:

 He's running multiple ES instances/nodes per physical machine, i.e. a VM 
 or container or just a second process, so I don't think it's a primary and 
 replica on the same ES instance.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 30 August 2014 05:16, Ivan Brusic i...@brusic.com wrote:

 The replica of a shard should never be on the same node as the primary.
 Where did you notice this anomaly? What version are you using?

 --
 Ivan


 On Fri, Aug 29, 2014 at 3:52 AM, Mark Walkom ma...@campaignmonitor.com
 wrote:

 That's the best method as per
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#allocation-awareness

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 29 August 2014 20:45, 'Nicolas Fraison' via elasticsearch 
 elasticsearch@googlegroups.com wrote:

 Hi,

 I have an ES cluster with 12 data nodes spread across 6 servers (so 2 
 nodes per server), and I saw that replicas of a shard can be allocated on 
 the same server (one on each of the nodes hosted by that server).

 To avoid this I have set these parameters on the cluster:
 node.host: server_name
 cluster.routing.allocation.awareness.attributes: zone, host

 But I'm wondering whether there is a specific parameter for this
 instead of using cluster allocation awareness?

 Nicolas
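
For reference, a minimal elasticsearch.yml sketch of the awareness settings quoted above (the server name is a placeholder; each physical server gets its own value):

```yaml
# elasticsearch.yml, set on every node; node.host must differ per physical server
node.host: server_name   # placeholder: use the actual server's name
cluster.routing.allocation.awareness.attributes: zone, host
```

With host as an awareness attribute, Elasticsearch tries to avoid placing a shard's primary and replica on nodes that share the same host value.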

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQA_uyaQGXdKQt8hZr1X-qq_%2Bm3Rh%2BCpG4Fg_wRqYhdN0Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.