date:20150317

You can use Logstash to change the XML into JSON, but you will need to do
the JSON to XML output yourself.

On 17 March 2015 at 15:17, Venkat Ankam  wrote:

> I have a requirement to index and search millions of XML documents related
> to mortgage (Uniform Closing Dataset XMLs).
>
> Indexed data will be requested by a web services of many internal
> applications through REST API.
>
> Output should be in XML format.
>
> How do I implement this in ELK stack?  How to convert XML input to JSON
> and how to get output in XML format?
>
> Request you to share any examples related this scenario.
>
> Regards,
> Venkat
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAMVHyg2Y826yt1o6yMW9pjwE3bFdWTVouAFQhZuiktjvnZV4Zw%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9mnyYuGoEtS2DDKo4LTkTtpcaRg9FCa1X6G5F%2Br1UE%2Bg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: spark version, elasticsearch-hadoop version, akka version sync up

2015-03-17 Thread Jeff Steinmetz

Thank you for the summary - you are confirming (as a sanity check for 
myself): 

elasticsearch-hadoop beta3 (not snapshot) on spark core 1.1 only
elasticsearch-hadoop-beta3-SNAPSHOT with spark core 1.1, 1.2 and 1.3 -- as 
long as I don't use Spark SQL when using 1.2 and 1.3

Costin - I am amazed by your ability to keep all this straight - my head 
would explode dealing with all the dependencies in flux.  Kudos to you.


On Tuesday, March 17, 2015 at 2:12:06 PM UTC-7, Costin Leau wrote:
>
> es-hadoop doesn't depend on akka, only on Spark. The scala version that 
> es-hadoop is compiled against matches the one used by the Spark version 
> compiled against for each release - typically this shouldn't pose a problem.
>
> Unfortunately, despite the minor version increments, some of the Spark 
> APIs or components (in particular Spark SQL) have changed drastically 
> between each release breaking backwards compatibility. For example, Beta3 
> works until Spark 1.1 (which was the latest stable release during its 
> release) but not with 1.2. This is fixed in master however the current dev 
> build doesn't work with Spark SQL in the newly released 1.3 (does work with 
> Spark core).
>
> This has already been fixed locally however I'm having difficulties trying 
> to preserve compatibility across the Spark SQL 1.2 release and 1.3.
>
> Long story short, as long as the dependencies for Spark are in order, the 
> same should apply for es-hadoop as well since it relies only on Spark (and 
> Scala of course).
>
> On Tue, Mar 17, 2015 at 10:43 PM, Jeff Steinmetz  > wrote:
>
>> There are plenty of spark / akka / scala / elasticsearch-hadoop 
>> dependencies to keep track of.
>>
>> Is it true that elasticsearch-hadoop needs to be compiled for a specific 
>> spark version to run correctly on the cluster?  I'm also trying to keep 
>> track of the akka version and scala version.  i.e, wil es-hadoop compiled 
>> for spark 1.2  work with Spark 1.3 ?
>>
>> When the elasticsearch-hadoop versions are released, as v2.0 v2.1, 
>> v2.1.0.Beta3, at what point do we need to keep in mind what spark version 
>> it was also compiled against?
>> i.e., is it safe to assume the es-hadoop versions are tied to a specific 
>> spark core version?
>>
>> I've been keeping the following chart in my notes to see what all the 
>> versions are with all dependencies
>> =
>>
>> Akka Version  Dependencies
>> Current Akka Stable Release:  2.3.9
>>
>> Elasticsearch-Hadoop:  2.1.0Beta3 = Spark 1.1.0
>> Elasticsearch-Hadoop:  2.1.0Beta3-SNAPSHOT = Spark 1.2.1
>> Elasticsearch-Hadoop: what about spark 1.3 ?
>>
>> Spark: 1.3, Akka: 2.3.4-spark
>> Spark: 1.2, Akka: 2.3.4-spark
>> Spark: 1.1, Akka: 2.2.3-shaded-protobuf
>>
>> Activator 1.2.12 comes with with Akka 2.3.4
>>
>> Play 2.3.8, akka 2.3.4, scala 2.11.1 (will also work with 2.10.4 )
>> Play 2.2.x, akka 2.2.0
>>
>> Spark Job Server 0.4.1, Spark Core 1.1.0, Akka, 2.2.4
>> Spark Job Server Master as of Feb 22, 2015, Spark Core 1.2.0,  Akka 
>> 2.3.4, Scala 2.10.4
>>
>> Akka persistence latest 2.3.4 or later
>> Akka 2.3.9 is released for Scala 2.10.4 and 2.11.5
>>
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/28ad3f78-8b3d-450a-a29d-06d3e6636cfd%40googlegroups.com
>>  
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0c19d1fa-17b9-4e6e-a698-b49c7d6919d3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: PayloadTermQuery in ElasticSearch

2015-03-17 Thread joergpra...@gmail.com

I created an example payload plugin

https://github.com/jprante/elasticsearch-payload

but I can't get a custom per-field similarity to work.  Setting up a field
with a prebuilt similarity works flawlessly, but with a custom one, it is
not even listed in the mapping.

It looks like SimilarityLookupService fails to find custom similarities.

If someone can help in tracking down the issue, I'd be glad. Maybe I do
something wrong.

Jörg

On Tue, Mar 17, 2015 at 5:02 PM, Nikolas Everett  wrote:

> I imagine the right way to do this is with a plugin but I'm not 100% sure.
>
> On Tue, Mar 17, 2015 at 11:47 AM, Devaraja Swami 
> wrote:
>
>> I plan to store floats in the payload and boost the score
>> (multiplicatively) based on the average value of the payloads over the
>> occurrences of the matching term in the document. ie., exactly as in
>> AveragePayloadFunction in Lucene.
>>
>> On Tue, Mar 17, 2015 at 2:16 AM, joergpra...@gmail.com <
>> joergpra...@gmail.com> wrote:
>>
>>> The concrete implementation depends on what you store in the payload
>>> (e.g. scores)
>>>
>>> Jörg
>>>
>>> On Tue, Mar 17, 2015 at 7:01 AM, Devaraja Swami >> > wrote:
>>>
 I need to use PayloadTermQuery from Lucene.
 Does anyone know how I can use this in ElasticSearch?
 I am using ES 1.4.4, with the Java API.
 In Lucene, I could use this by directly instantiating PayloadTermQuery,
 but there are no APIs in ES QueryBuilders for this.
 I don't need a query parser, because I can build the query directly
 using the Java API (don't need a JSON representation of the query),
 so I only need to be able to construct, in Java, a query builder
 encapsulating a PayloadTermQuery.

 Thanks in advance!

 -devarajaswami

  --
 You received this message because you are subscribed to the Google
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/8fc84082-6fc7-42aa-9caf-8ab527bc8a0b%40googlegroups.com
 
 .
 For more options, visit https://groups.google.com/d/optout.

>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFwk_Ve_OK9J%2BfsEzGwbtOnaL7%2BeqT%3DR61hCoX8Mzi-fQ%40mail.gmail.com
>>> 
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CABMSir6hF%3DuM1jp0jgoBq_v30YNVB-8JLF7PLyvFjyXbdqtLvg%40mail.gmail.com
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAPmjWd2%2B-YFPsRVB0QywSuMFEVXL-UgQyxJRGBjGn4Lw0KWT4A%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoEO1XqR0SdrJtRHPPONrL%2B1a6iGi3xBvQikxFkAS0pXLA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Courier Fetch error, maybe due to lack of @timestamp?

2015-03-17 Thread Itamar Syn-Hershko

@timestamp is generated automatically by logstash, any documents not added
by logstash will not have it

--

Itamar Syn-Hershko
http://code972.com | @synhershko 
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Wed, Mar 18, 2015 at 12:51 AM, David Reagan  wrote:

> @timestamp has always been applied automatically. Only time I've ever
> touched it is when I've adjusted the date to what the log message holds,
> rather than when the log message is processed by logstash.
>
> So, I have no idea where it comes from, or how I could have turned it off
> on something.
>
> Is that in the template?
>
> --David Reagan
>
> On Tue, Mar 17, 2015 at 2:24 PM, Itamar Syn-Hershko 
> wrote:
>
>> Like the error suggests, "No mapping found for [@timestamp] in order to
>> sort on"
>>
>> Kibana expects a @timestamp field - make sure to push that in your source
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko 
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Tue, Mar 17, 2015 at 11:19 PM, David Reagan  wrote:
>>
>>> I keep getting an error like this: "Courier Fetch: 5 of 270 shards
>>> failed." in Kibana 4.0.1.
>>>
>>> After some Googling, I think it has something to do with @timestamp not
>>> existing for some of my data. But I'm not sure, because
>>> https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion
>>> was solved by not searching the kibana indexes. I'm only searching my
>>> logstash indexes. And I'm still getting that error.
>>>
>>> In kibana 4 I went to Settings->Indices and made sure I only have
>>> logstash-* listed under Index Patterns.
>>>
>>> I did recently update the template to what was in the logstash git HEAD.
>>>
>>> See http://pastebin.com/w7PmHxXS for my
>>> /var/log/elasticsearch/index.log output. As well as the template I'm using.
>>> It's at the bottom of the paste.
>>>
>>> I did check with curl -XGET '
>>> http://localhost:9200/_cat/shards?pretty=true' to see if any shards had
>>> issues. They all had "STARTED" as their status.
>>>
>>> Any suggestions?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/9d816fa6-62c4-4651-a1e3-30c4f9239f5a%40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "elasticsearch" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/elasticsearch/dH6zw6swHBg/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zsf8HH4WFvF8geoDy4zNhWOX6Y6hEsaLv8E8xhc04F62A%40mail.gmail.com
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CANo%2B_AdzgruuC8mb5W2fKrxYi58tyBwak%2B-3g8w2hbWJTyRThw%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHTr4ZsPGBtn9cSLt6Dyz0M%2BEznMCFM0d0Chj1h4%3DwJFX3qTng%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: correctly analyzed field not found by query_string search

2015-03-17 Thread Ryan Pedela

 For anyone who has a similar problem, I have figured out the issue. By 
default, it appears to me that only the _all field is searched. The _all 
field contains "pharmacy_docs" but not "pharmacy". If the search is 
modified to search the "name" fields then the search works. And if you 
wanted to support searching for "pharmacy_docs", you could add "_all" to 
the list such as:

curl 'http://localhost:9200/my_index/_search?pretty' -d '{
"query": {
"query_string": {
"fields": [ "_all", "name" ],
"query": "pharmacy_docs"
}
}
}'

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/352b95b7-0651-4d1e-88e1-44be6ebf2e6b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Courier Fetch error, maybe due to lack of @timestamp?

@timestamp has always been applied automatically. Only time I've ever
touched it is when I've adjusted the date to what the log message holds,
rather than when the log message is processed by logstash.

So, I have no idea where it comes from, or how I could have turned it off
on something.

Is that in the template?

--David Reagan

On Tue, Mar 17, 2015 at 2:24 PM, Itamar Syn-Hershko 
wrote:

> Like the error suggests, "No mapping found for [@timestamp] in order to
> sort on"
>
> Kibana expects a @timestamp field - make sure to push that in your source
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko 
> Freelance Developer & Consultant
> Lucene.NET committer and PMC member
>
> On Tue, Mar 17, 2015 at 11:19 PM, David Reagan  wrote:
>
>> I keep getting an error like this: "Courier Fetch: 5 of 270 shards
>> failed." in Kibana 4.0.1.
>>
>> After some Googling, I think it has something to do with @timestamp not
>> existing for some of my data. But I'm not sure, because
>> https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion
>> was solved by not searching the kibana indexes. I'm only searching my
>> logstash indexes. And I'm still getting that error.
>>
>> In kibana 4 I went to Settings->Indices and made sure I only have
>> logstash-* listed under Index Patterns.
>>
>> I did recently update the template to what was in the logstash git HEAD.
>>
>> See http://pastebin.com/w7PmHxXS for my /var/log/elasticsearch/index.log
>> output. As well as the template I'm using. It's at the bottom of the paste.
>>
>> I did check with curl -XGET '
>> http://localhost:9200/_cat/shards?pretty=true' to see if any shards had
>> issues. They all had "STARTED" as their status.
>>
>> Any suggestions?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/9d816fa6-62c4-4651-a1e3-30c4f9239f5a%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/dH6zw6swHBg/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zsf8HH4WFvF8geoDy4zNhWOX6Y6hEsaLv8E8xhc04F62A%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CANo%2B_AdzgruuC8mb5W2fKrxYi58tyBwak%2B-3g8w2hbWJTyRThw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Indexing and Searching XML documents

2015-03-17 Thread joergpra...@gmail.com

It strongly depends on the method how you want to convert XML to JSON and
vice versa.

Maybe this plugin can give you some hints about Jackson XML regarding
parsing and formatting

https://github.com/jprante/elasticsearch-xml

Do not expect XML schema, validation, or XSL stylesheet, this is not
included.

Jörg

On Tue, Mar 17, 2015 at 11:17 PM, Venkat Ankam  wrote:

> I have a requirement to index and search millions of XML documents related
> to mortgage (Uniform Closing Dataset XMLs).
>
> Indexed data will be requested by a web services of many internal
> applications through REST API.
>
> Output should be in XML format.
>
> How do I implement this in ELK stack?  How to convert XML input to JSON
> and how to get output in XML format?
>
> Request you to share any examples related this scenario.
>
> Regards,
> Venkat
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAMVHyg2Y826yt1o6yMW9pjwE3bFdWTVouAFQhZuiktjvnZV4Zw%40mail.gmail.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGPWOG9J6qa9rnOVBfzjGORL3oKpR%3DmNrzj4oTNkEQW9Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: large number of indexes for multi-tenant product

This is a super timely blog from the Found crew -
https://found.no/foundation/multi-tenancy/

On 17 March 2015 at 14:11, Mark Walkom  wrote:

> There are practical limits, based on your dataset, node sizing, version
> etc.
>
> You'd be better off segregating indices by a higher level definition (eg
> customer number, 1-999, 1000-1999 etc), using routing and then aliases on
> top. This way you conceptually get the same layout as a single index per
> customer, but it gives you the option to split larger customers out to
> their own index and without wasting resources on small use customers.
>
> On 16 March 2015 at 19:11, Richard Blaylock  wrote:
>
>> Hi all,
>>
>> We have a multi-tenant product and are leaning towards dynamically
>> creating (and deleting) various indexes relevant to a tenant at runtime: as
>> a tenant is created, so are that tenant's indexes.  When a tenant is
>> deleted so are that tenant's indexes.  Each index is specific to that
>> tenant and could vary in size, but we do not expect any given index to ever
>> be larger than a single disk (e.g. 80 GB).
>>
>> Due to index shard issues (static, too many shards per index = a hit on
>> performance (more map/reduce work to do), etc.), and due to the nature of
>> our application, we are currently opting for a single-shard-per-index model
>> - each index will have one and only one shard.  We will have replicas for
>> fault tolerance.
>>
>> On the surface, this appears to be an ideal design choice for
>> multi-tenant applications: for any given index, one and only one shard will
>> be 'hit' - no need to search across multiple shards, ever.  It also reduces
>> contention because indexes are always tenant-specific: as an index becomes
>> larger, any slowness due to the large index *only* impacts the
>> corresponding tenant (customer), whereas the alternative - using one index
>> across tenants - one tenant's use/load could negatively impact other
>> tenants' query performance.
>>
>> So for multi-tenancy, this single-shard-per-index model sounds ideal for
>> our use case - the *only* issue here is that the number of indexes
>> increases dramatically as the number of tenants (customers) increases.
>> Consider a system with 20,000 tenants, each having (potentially) hundreds
>> or thousands, or even 10s of thousands of indexes, resulting in millions of
>> indexes overall.  This is manageable from our product's perspective, but
>> what impact would this have on ElasticSearch, if any?
>>
>> Are there practical limits? IIUC, there is a Lucene index (file) per
>> shard, so if there are hundreds of thousands or millions of Lucene
>> indexes/files - other than disk space and file descriptor count per ES
>> node, are there any other limits?  Does performance degrade as the number
>> of single-shard-indexes increases?  Or is there no problem at all?
>>
>> Thanks,
>> Richard
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/607f62c1-5854-43e0-9d25-3f748aca44a4%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X99jYR7a%2BYuf3o-C_bxE5OvxybTAKr2rQL4HEEDqS0R6Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: ElasticSearch documents relationship question

Take a look at
http://www.elastic.co/guide/en/elasticsearch/guide/current/relations.html
to get you started.

On 17 March 2015 at 15:07, Mithrawnuroudo  wrote:

> Could you help me to model architecture of storing posts and comments in
> ElasticSearch?
>
> Currenlty i have simple data structure - I store in ES "posts" as
> documents in ES index. I do search on that index to find posts with
> particular words. Posts are not related to anything. Every post has unique
> url and that's it. Simple.
>
> I want to add possibility to store comments to post."Comments" will be
> special version of "posts" - comment has parent(other post) and comments
> have particular order between other comments of their parent. I wonder how
> should i model relationship between posts/comments and which ES data
> structure should i use. I don't know anything about documents relationship
> in Elasticsearch, so any help will be great.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/e262e145-3b1f-45f0-a374-a6b2dd8ad189%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9UMUeaXp%2BTa%2BGr9apsJeXCxtv8caMc9hhgdWx%3D_mN--g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Elasticsearch - transport client singleton

2015-03-17 Thread David Pilato

Yes!

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> Le 17 mars 2015 à 11:23, Александр Свиридов  a écrit :
> 
> I am newbie in elastic and I don't understand how should I work with 
> transport client connections. Should I use singleton for Client, something 
> like
> 
> class ElasticClientManager {
>   private static Client client;
>  
>  public static Client getClient(){
> if (client==null) {
>   Settings settings = ImmutableSettings.settingsBuilder()
> .put("cluster.name", "elasticsearch")
> .put("client.transport.sniff", true).build();
> 
>client = new TransportClient(settings)
> .addTransportAddress(new 
> InetSocketTransportAddress("localhost",9300));
> }
>return client;
>  }
> }
> 
> By other words - I create one client and keep the reference in it in 
> singleton. Every time I need to query elastic I do
> 
> Client client = ElasticClientManager.getClient();
> GetResponse getResponse = client.prepareGet().execute().actionGet();
> 
> Is such approach right?
> 
> 
> -- 
> Александр Свиридов
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/1426616605.710285922%40f217.i.mail.ru.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6268C917-24DD-4081-A012-9BE565539438%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.

Indexing and Searching XML documents

2015-03-17 Thread Venkat Ankam

I have a requirement to index and search millions of XML documents related
to mortgage (Uniform Closing Dataset XMLs).

Indexed data will be requested by a web services of many internal
applications through REST API.

Output should be in XML format.

How do I implement this in ELK stack?  How to convert XML input to JSON and
how to get output in XML format?

Request you to share any examples related this scenario.

Regards,
Venkat

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAMVHyg2Y826yt1o6yMW9pjwE3bFdWTVouAFQhZuiktjvnZV4Zw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

ElasticSearch documents relationship question

2015-03-17 Thread Mithrawnuroudo

 

Could you help me to model architecture of storing posts and comments in 
ElasticSearch?

Currenlty i have simple data structure - I store in ES "posts" as documents 
in ES index. I do search on that index to find posts with particular words. 
Posts are not related to anything. Every post has unique url and that's it. 
Simple.

I want to add possibility to store comments to post."Comments" will be 
special version of "posts" - comment has parent(other post) and comments 
have particular order between other comments of their parent. I wonder how 
should i model relationship between posts/comments and which ES data 
structure should i use. I don't know anything about documents relationship 
in Elasticsearch, so any help will be great.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e262e145-3b1f-45f0-a374-a6b2dd8ad189%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Courier Fetch error, maybe due to lack of @timestamp?

2015-03-17 Thread Itamar Syn-Hershko

Like the error suggests, "No mapping found for [@timestamp] in order to
sort on"

Kibana expects a @timestamp field - make sure to push that in your source

--

Itamar Syn-Hershko
http://code972.com | @synhershko 
Freelance Developer & Consultant
Lucene.NET committer and PMC member

On Tue, Mar 17, 2015 at 11:19 PM, David Reagan  wrote:

> I keep getting an error like this: "Courier Fetch: 5 of 270 shards
> failed." in Kibana 4.0.1.
>
> After some Googling, I think it has something to do with @timestamp not
> existing for some of my data. But I'm not sure, because
> https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion
> was solved by not searching the kibana indexes. I'm only searching my
> logstash indexes. And I'm still getting that error.
>
> In kibana 4 I went to Settings->Indices and made sure I only have
> logstash-* listed under Index Patterns.
>
> I did recently update the template to what was in the logstash git HEAD.
>
> See http://pastebin.com/w7PmHxXS for my /var/log/elasticsearch/index.log
> output. As well as the template I'm using. It's at the bottom of the paste.
>
> I did check with curl -XGET 'http://localhost:9200/_cat/shards?pretty=true'
> to see if any shards had issues. They all had "STARTED" as their status.
>
> Any suggestions?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/9d816fa6-62c4-4651-a1e3-30c4f9239f5a%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHTr4Zsf8HH4WFvF8geoDy4zNhWOX6Y6hEsaLv8E8xhc04F62A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Courier Fetch error, maybe due to lack of @timestamp?

I keep getting an error like this: "Courier Fetch: 5 of 270 shards failed." 
in Kibana 4.0.1.

After some Googling, I think it has something to do with @timestamp not 
existing for some of my data. But I'm not sure, because 
https://groups.google.com/d/topic/elasticsearch/L6AG3dZOGJ8/discussion was 
solved by not searching the kibana indexes. I'm only searching my logstash 
indexes. And I'm still getting that error.

In kibana 4 I went to Settings->Indices and made sure I only have 
logstash-* listed under Index Patterns.

I did recently update the template to what was in the logstash git HEAD.

See http://pastebin.com/w7PmHxXS for my /var/log/elasticsearch/index.log 
output. As well as the template I'm using. It's at the bottom of the paste.

I did check with curl -XGET 'http://localhost:9200/_cat/shards?pretty=true' 
to see if any shards had issues. They all had "STARTED" as their status.

Any suggestions? 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9d816fa6-62c4-4651-a1e3-30c4f9239f5a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: spark version, elasticsearch-hadoop version, akka version sync up

2015-03-17 Thread Costin Leau

es-hadoop doesn't depend on akka, only on Spark. The scala version that
es-hadoop is compiled against matches the one used by the Spark version
compiled against for each release - typically this shouldn't pose a problem.

Unfortunately, despite the minor version increments, some of the Spark APIs
or components (in particular Spark SQL) have changed drastically between
each release breaking backwards compatibility. For example, Beta3 works
until Spark 1.1 (which was the latest stable release during its release)
but not with 1.2. This is fixed in master however the current dev build
doesn't work with Spark SQL in the newly released 1.3 (does work with Spark
core).

This has already been fixed locally however I'm having difficulties trying
to preserve compatibility across the Spark SQL 1.2 release and 1.3.

Long story short, as long as the dependencies for Spark are in order, the
same should apply for es-hadoop as well since it relies only on Spark (and
Scala of course).

On Tue, Mar 17, 2015 at 10:43 PM, Jeff Steinmetz <
jeffrey.steinm...@gmail.com> wrote:

> There are plenty of spark / akka / scala / elasticsearch-hadoop
> dependencies to keep track of.
>
> Is it true that elasticsearch-hadoop needs to be compiled for a specific
> spark version to run correctly on the cluster?  I'm also trying to keep
> track of the akka version and scala version.  i.e, wil es-hadoop compiled
> for spark 1.2  work with Spark 1.3 ?
>
> When the elasticsearch-hadoop versions are released, as v2.0 v2.1,
> v2.1.0.Beta3, at what point do we need to keep in mind what spark version
> it was also compiled against?
> i.e., is it safe to assume the es-hadoop versions are tied to a specific
> spark core version?
>
> I've been keeping the following chart in my notes to see what all the
> versions are with all dependencies
> =
>
> Akka Version  Dependencies
> Current Akka Stable Release:  2.3.9
>
> Elasticsearch-Hadoop:  2.1.0Beta3 = Spark 1.1.0
> Elasticsearch-Hadoop:  2.1.0Beta3-SNAPSHOT = Spark 1.2.1
> Elasticsearch-Hadoop: what about spark 1.3 ?
>
> Spark: 1.3, Akka: 2.3.4-spark
> Spark: 1.2, Akka: 2.3.4-spark
> Spark: 1.1, Akka: 2.2.3-shaded-protobuf
>
> Activator 1.2.12 comes with with Akka 2.3.4
>
> Play 2.3.8, akka 2.3.4, scala 2.11.1 (will also work with 2.10.4 )
> Play 2.2.x, akka 2.2.0
>
> Spark Job Server 0.4.1, Spark Core 1.1.0, Akka, 2.2.4
> Spark Job Server Master as of Feb 22, 2015, Spark Core 1.2.0,  Akka 2.3.4,
> Scala 2.10.4
>
> Akka persistence latest 2.3.4 or later
> Akka 2.3.9 is released for Scala 2.10.4 and 2.11.5
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/28ad3f78-8b3d-450a-a29d-06d3e6636cfd%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJogdmf9z8JjP_LWTjObDuKiE4DzdSOH2rA%3DdbtkfqQakXYbkw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: large number of indexes for multi-tenant product

There are practical limits, based on your dataset, node sizing, version etc.

You'd be better off segregating indices by a higher level definition (eg
customer number, 1-999, 1000-1999 etc), using routing and then aliases on
top. This way you conceptually get the same layout as a single index per
customer, but it gives you the option to split larger customers out to
their own index and without wasting resources on small use customers.

On 16 March 2015 at 19:11, Richard Blaylock  wrote:

> Hi all,
>
> We have a multi-tenant product and are leaning towards dynamically
> creating (and deleting) various indexes relevant to a tenant at runtime: as
> a tenant is created, so are that tenant's indexes.  When a tenant is
> deleted so are that tenant's indexes.  Each index is specific to that
> tenant and could vary in size, but we do not expect any given index to ever
> be larger than a single disk (e.g. 80 GB).
>
> Due to index shard issues (static, too many shards per index = a hit on
> performance (more map/reduce work to do), etc.), and due to the nature of
> our application, we are currently opting for a single-shard-per-index model
> - each index will have one and only one shard.  We will have replicas for
> fault tolerance.
>
> On the surface, this appears to be an ideal design choice for multi-tenant
> applications: for any given index, one and only one shard will be 'hit' -
> no need to search across multiple shards, ever.  It also reduces contention
> because indexes are always tenant-specific: as an index becomes larger, any
> slowness due to the large index *only* impacts the corresponding tenant
> (customer), whereas the alternative - using one index across tenants - one
> tenant's use/load could negatively impact other tenants' query performance.
>
> So for multi-tenancy, this single-shard-per-index model sounds ideal for
> our use case - the *only* issue here is that the number of indexes
> increases dramatically as the number of tenants (customers) increases.
> Consider a system with 20,000 tenants, each having (potentially) hundreds
> or thousands, or even 10s of thousands of indexes, resulting in millions of
> indexes overall.  This is manageable from our product's perspective, but
> what impact would this have on ElasticSearch, if any?
>
> Are there practical limits? IIUC, there is a Lucene index (file) per
> shard, so if there are hundreds of thousands or millions of Lucene
> indexes/files - other than disk space and file descriptor count per ES
> node, are there any other limits?  Does performance degrade as the number
> of single-shard-indexes increases?  Or is there no problem at all?
>
> Thanks,
> Richard
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/607f62c1-5854-43e0-9d25-3f748aca44a4%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_UA2XX8M8bDCCS%2Bx4p9Ta5-nk1vj45pLh9JDSePY0AGQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Elasticsearch high heap usage

Take a look at
http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

On 16 March 2015 at 20:29,  wrote:

> Hello Mark,
>
> Thanks for your answer! We are using the default values, so no doc_values.
> I did some research about it and it sounds very interesting and helpful to
> keep the heap usage lower.
> How can I add doc_values: true to the index template so that the new daily
> based generated indexes using this feature.
>
> Cheers
> Chris
>
> On Monday, March 16, 2015 at 11:36:13 AM UTC+7, Mark Walkom wrote:
>>
>> Those are reasonably large documents. You also seem to have a lot of
>> shards for the data.
>>
>> What sort of data is it, are you using doc values, how are you bucketing
>> data (ie time series indices)?
>>
>> On 15 March 2015 at 20:39,  wrote:
>>
>>> Hello,
>>>
>>> We have a 2 node elasticsearch cluster which is used by logstash to
>>> store log files. The current input is around 100 documents (logs) per
>>> second wit a size of around 50kb - 150kb.
>>> Compared to what i have read so far this is not a high amount but we
>>> experience already a high heap usage 70% form the total of 11GB heap size,
>>> the system has in total 32GB RAM. CPU and IO are totally fine.
>>>
>>> Any suggestion highly appreciated!
>>>
>>> Cheers
>>> Chris
>>>
>>> 
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/c2679cad-72ec-472f-a009-a6c9e2abbb9d%
>>> 40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/bdd6d405-61b4-449d-b1ce-3f95bdf8b0c2%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X89%3D9nd1AVZGvu7TWSM7%2BA_dNsVo%3D9Z0YcxNT%3DsNc6msQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

spark version, elasticsearch-hadoop version, akka version sync up

2015-03-17 Thread Jeff Steinmetz

There are plenty of spark / akka / scala / elasticsearch-hadoop 
dependencies to keep track of.

Is it true that elasticsearch-hadoop needs to be compiled for a specific 
spark version to run correctly on the cluster?  I'm also trying to keep 
track of the akka version and scala version.  i.e, wil es-hadoop compiled 
for spark 1.2  work with Spark 1.3 ?

When the elasticsearch-hadoop versions are released, as v2.0 v2.1, 
v2.1.0.Beta3, at what point do we need to keep in mind what spark version 
it was also compiled against?
i.e., is it safe to assume the es-hadoop versions are tied to a specific 
spark core version?

I've been keeping the following chart in my notes to see what all the 
versions are with all dependencies
=

Akka Version  Dependencies
Current Akka Stable Release:  2.3.9

Elasticsearch-Hadoop:  2.1.0Beta3 = Spark 1.1.0
Elasticsearch-Hadoop:  2.1.0Beta3-SNAPSHOT = Spark 1.2.1
Elasticsearch-Hadoop: what about spark 1.3 ?

Spark: 1.3, Akka: 2.3.4-spark
Spark: 1.2, Akka: 2.3.4-spark
Spark: 1.1, Akka: 2.2.3-shaded-protobuf

Activator 1.2.12 comes with with Akka 2.3.4

Play 2.3.8, akka 2.3.4, scala 2.11.1 (will also work with 2.10.4 )
Play 2.2.x, akka 2.2.0

Spark Job Server 0.4.1, Spark Core 1.1.0, Akka, 2.2.4
Spark Job Server Master as of Feb 22, 2015, Spark Core 1.2.0,  Akka 2.3.4, 
Scala 2.10.4

Akka persistence latest 2.3.4 or later
Akka 2.3.9 is released for Scala 2.10.4 and 2.11.5


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/28ad3f78-8b3d-450a-a29d-06d3e6636cfd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Number of shards in 4 node Cluster

2015-03-17 Thread Andrew Selden

I typically suggest to start with the default of 5 shards. A single shard can 
hold several tens of gigabytes. Certainly in your case it seems like 20 shards 
is overkill for a 4 node cluster. 


> On Mar 17, 2015, at 11:00 AM, John S  wrote:
> 
> Hi All,
> 
> Is there any best practices of having on the number of shards for a cluster? 
> I have a 4 node cluster and used shards of 20.
> 
> During any node failure or other events i doubts since the shards number is 
> high, replication to new node is taking more time...
> 
> Is there any metrics or formula to be done for number or shards?
> 
> Regards
> John
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/6e51f1e4-8938-4196-84a9-007705869b6a%40googlegroups.com
>  
> .
> For more options, visit https://groups.google.com/d/optout 
> .

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/DD2AA858-ABD4-49F5-9F9C-D73C01F615CE%40elastic.co.
For more options, visit https://groups.google.com/d/optout.

Re: Number of shards in 4 node Cluster

What sort of data do you have, time based or static? If it's the former
then going with any arbitrary number is less of a problem as you can change
this the next roll over period. If it's static then 4 would be a good start.

There aren't any metrics around this, other than *not* creating a large
number to start with, as each shard is a lucene instance and does take
resources.

On 17 March 2015 at 11:00, John S  wrote:

> Hi All,
>
> Is there any best practices of having on the number of shards for a
> cluster? I have a 4 node cluster and used shards of 20.
>
> During any node failure or other events i doubts since the shards number
> is high, replication to new node is taking more time...
>
> Is there any metrics or formula to be done for number or shards?
>
> Regards
> John
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/6e51f1e4-8938-4196-84a9-007705869b6a%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_B9mxm9xnJtzoSc-tj1G-MoZ7vdQ-ye%2B7woLfj7aRHJw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Sorting and range filtering semantic versions

2015-03-17 Thread Mike Turley

Did you ever find a good solution for this?  I am trying to solve the same 
problem (just sorting, not range filtering).

On Monday, January 26, 2015 at 2:47:30 AM UTC-5, Eric Smith wrote:
>
> I am trying to figure out some sort of indexing scheme where I can do 
> range filters on semantic versions .  Values look 
> like these:
>
> "1.0.2.5", "1.10.2.5", "2.3.434.1"
>
> I know that I can add a separate field with the numbers padded out, but I 
> was hoping to have a single field where I could do things like this:
>
> "version:>1.0" "version:1.0.2.5" "version:1.0" "version:[1.0 TO 2.0]"
>
> I have created some pattern capture filters to allow querying partial 
> version numbers. I even created some pattern replacement filters to pad the 
> values out so that they could be lexicographically sorted, but those 
> filters only control the tokens that are indexed and not the value that is 
> used for sorting and range filters.
>
> Is there a way to customize the value that is used for sorting and range 
> filters?  It seems like it just uses the original value and I don't have 
> any control of it?
>
> Any help would be greatly appreciated!
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2a80f6c9-ae8e-4df9-a1df-30e3eda6697f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Why does creating a repository fail?

According to http://www.kernelcrash.com/blog/nfs-uidgid-mapping/2007/09/10/
the method described in that post only applies to old, out of date,
systems.

I also found no mention of a map file in
http://linux.die.net/man/8/mount.nfs or http://linux.die.net/man/5/nfs

The closest I found to something I could use was
http://serverfault.com/questions/514118/mapping-uid-and-gid-of-local-user-to-the-mounted-nfs-share
But it seems to only apply to nfs version 4. We, for some reason, are on
version 3.

Hmm... Would adding the suid flag to the mount help?


As for iSCSI, it doesn't matter if the file system sees it as a local
device. Currently my file system sees my nfs mounts as pretty much local
mounts. But it still thinks that some of my elasticsearch owned files are
actually owned by my ntp user... I don't see how "formatting however I
want" will help with that kind of issue. Permissions are set by uid and
gid, not the name. Unless iSCSI has some feature that overrides that.

--David Reagan

On Tue, Mar 17, 2015 at 11:11 AM, Mark Walkom  wrote:

> iSCSI can be mounted as a block device that you can format however you
> want, if you do it that way the uid problem won't show up as the system
> sees it as a local FS.
>
> On 17 March 2015 at 09:00, David Reagan  wrote:
>
>> @Mark Walkom, So, I'm looking into iscsi. From what I have learned so
>> far, you actually format the LUN with whatever file system you want. So,
>> wouldn't the gid/uid issue show up there as well, if I formatted to ext3 or
>> ext4? Since Ubuntu would treat it like a normal partition and use typical
>> linux file perms on it.
>>
>> --David Reagan
>>
>> On Mon, Mar 16, 2015 at 5:37 PM, David Reagan  wrote:
>>
>>> If I were manually creating the elasticsearch user, that'd be easy. But
>>> I'm relying on apt to do the job for me. So, yeah...
>>>
>>> Hmm... I suppose I could manually create an elasticsearch2 user, then
>>> modify the defaults files to use it when running ES. Still seems clunky...
>>>
>>> --David Reagan
>>>
>>> On Mon, Mar 16, 2015 at 5:20 PM, Andrew Selden 
>>> wrote:
>>>
 I’m not that familiar with iSCSI so I hesitate to say for sure, but
 anytime you are cross-mounting filesystems on Linux you have to take
 uid/gid consistency into account.

 - Andrew

 On Mar 16, 2015, at 4:46 PM, David Reagan  wrote:

 Would an iSCSI mount have the same issue? I believe our SAN supports
 both.

 --David Reagan

 On Mon, Mar 16, 2015 at 4:40 PM, Andrew Selden 
 wrote:

> Hi David,
>
> This is a common problem with NFS. Unfortunately the protocol assumes
> identical uid/gid mappings across all machines. It’s just one of those
> annoying sys-admin tasks that one has to take into account when using NFS.
> To get your permissions back to less permissive settings you will have to
> edit the /etc/passwd and /etc/group files to keep them in sync.
>
> See http://www.tldp.org/HOWTO/NFS-HOWTO/troubleshooting.html#SYMPTOM4
> for more context.
>
> - Andrew
>
>
> On Mar 16, 2015, at 4:04 PM, David Reagan  wrote:
>
> First, it is a file permissions issue. I did get snapshots to run when
> I chmoded to 777. As you can see from the ls output, /mounts/prod_backup 
> is
> 777. Prior to that it was 775 or 755. So, I could revise my question to
> "How can I get snapshots working without using insecure file permissions?"
>
> root@log-elasticsearch-01:~# mount
> /dev/mapper/ws--template--01-root on / type ext4 (rw,errors=remount-ro)
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> none on /sys/fs/fuse/connections type fusectl (rw)
> none on /sys/kernel/debug type debugfs (rw)
> none on /sys/kernel/security type securityfs (rw)
> udev on /dev type devtmpfs (rw,mode=0755)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
> none on /run/shm type tmpfs (rw,nosuid,nodev)
> /dev/sda1 on /boot type ext2 (rw)
> rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
> nfsip:/vol/Logs/prod_backup on /mounts/prod_backup type nfs
> (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
> nfsip:/vol/Logs/log-elasticsearch-01 on /mounts/log-elasticsearch-01
> type nfs (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>
> root@log-elasticsearch-01:~# ls -ld /mounts
> drwxr-xr-x 6 root root 4096 Oct  1 13:43 /mounts
>
> root@log-elasticsearch-01:~# ls -ld /mounts/prod_backup/
> drwxrwxrwx 4 elasticsearch elasticsearch 4096 Mar 16 13:41
> /mounts/prod_backup/
>
> --David Reagan
>
> On Mon, Mar 16, 2015 at 3:47 PM, Mark Walkom 
> wrote:
>
>> Can you post the output from *mount* and *ls -ld /mounts
>> /mounts/pr

Re: What's wrong with this query?

2015-03-17 Thread Roger de Cordova Farias

Look at this example on how to use multiple filters:
http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html#_multiple_filters

You should wrap them on a bool filter

2015-03-17 15:32 GMT-03:00 jrkroeg :

> I'm trying to get the top 100 documents which match the filtered criteria,
> and sort by distance from the pin.location.
>
> Here's my query - which isn't resulting in error, but should be returning
> results:
>
> {
>  "query": {
>  "filtered": {
>  "query": {
>  "match_all": {}
>  },
>  "filter": [
>  {
>  "term": {
>  "searchTerm1": "N"
>  }
>  },
>  {
>  "term": {
>  "searchTerm2": "Y"
>  }
>  },
>  {
>  "term": {
>  "searchTerm3": "Y"
>  }
>  },
>  {
>  "term": {
>  "searchTerm4": "Y"
>  }
>  }
>  ]
>  }
>  },
> "sort": [
> {
> "_geo_distance": {
> "pin.location": {
> "lat": 34.073620,
> "lon": -118.400356
> },
> "order": "asc",
> "unit": "mi"
> }
> }
> ],
> "size": 100
> }
>
>
> On a separate note, I'd like to find a way to make the filter more of a
> suggestion, rather than forced - how would I achieve this?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/22379295-332d-4ebe-aef3-6c9b2326e755%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJp2533a5NcTmSnSYBDTJtmPpVk9a1vyiO9TZkYnPqdyP3TwnQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Fwd: Us congress hearing of maan alsaan Money laundry قضية الكونغجرس لغسيل الأموال للمليادير معن الصانع

2015-03-17 Thread fayez joell

YouTube videos of



 U.S. Congress money laundering hearing


of

Saudi Billionaire  " Maan  Al sanea"

 with *bank of America*


and  The  owner of Saad Hospital and  Schools

 in the Eastern Province in *Saudi Arabia*



and the Chairman of the Board of Directors of Awal Bank  in *Bahrain*


With Arabic Subtitles


http://www.youtube.com/watch?v=mIBNnQvhU8s





*موقع اليوتيوب الذي عرض جلسة استماع الكونجرس الأمريكي *

* لمتابعة نشاطات غسل الأموال ونشاطات*



*السعودي معن عبدالواحد الصانع*



*مالك مستشفى  وشركة سعد  ومدارس سعد بالمنطقة الشرقية بالسعودية   ورئيس مجلس
ادارة بنك اوال البحريني*



*مترجم باللغة العربية*



http://www.youtube.com/watch?v=mIBNnQvhU8s

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJNWts0XK4DAXzqN%2B2MNUiRCvLht7v1b-Cs-zfUFK2RphjQ%2BHg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

What's wrong with this query?

2015-03-17 Thread jrkroeg

I'm trying to get the top 100 documents which match the filtered criteria, 
and sort by distance from the pin.location.

Here's my query - which isn't resulting in error, but should be returning 
results:

{
 "query": {
 "filtered": {
 "query": {
 "match_all": {}
 },
 "filter": [
 {
 "term": {
 "searchTerm1": "N"
 }
 },
 {
 "term": {
 "searchTerm2": "Y"
 }
 },
 {
 "term": {
 "searchTerm3": "Y"
 }
 },
 {
 "term": {
 "searchTerm4": "Y"
 }
 }
 ]
 }
 },
"sort": [
{
"_geo_distance": {
"pin.location": {
"lat": 34.073620,
"lon": -118.400356
},
"order": "asc",
"unit": "mi"
}
}
],
"size": 100
}


On a separate note, I'd like to find a way to make the filter more of a 
suggestion, rather than forced - how would I achieve this?

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/22379295-332d-4ebe-aef3-6c9b2326e755%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Elasticsearch - transport client singleton

 I am newbie in elastic and I don't understand how should I work with transport 
client connections. Should I use singleton for Client, something like

class ElasticClientManager {
  private static Client client;
 
 public static Client getClient(){
    if (client==null) {
  Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "elasticsearch")
    .put("client.transport.sniff", true).build();

   client = new TransportClient(settings)
    .addTransportAddress(new 
InetSocketTransportAddress("localhost",9300));
    }
   return client;
 }
}

By other words - I create one client and keep the reference in it in singleton. 
Every time I need to query elastic I do

Client client = ElasticClientManager.getClient();
GetResponse getResponse = client.prepareGet().execute().actionGet();

Is such approach right?


-- 
Александр Свиридов

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1426616605.710285922%40f217.i.mail.ru.
For more options, visit https://groups.google.com/d/optout.

Elasticsearch - transport client singleton

 I am newbie in elastic and I don't understand how should I work with transport 
client connections. Should I use singleton for Client, something like

class ElasticClientManager {
  private static Client client;
 
 public static Client getClient(){
    if (client==null) {
  Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "elasticsearch")
    .put("client.transport.sniff", true).build();

   client = new TransportClient(settings)
    .addTransportAddress(new 
InetSocketTransportAddress("localhost",9300));
    }
   return client;
 }
}

By other words - I create one client and keep the reference in it in singleton. 
Every time I need to query elastic I do

Client client = ElasticClientManager.getClient();
GetResponse getResponse = client.prepareGet().execute().actionGet();

Is such approach right? 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1426616386.538391541%40f217.i.mail.ru.
For more options, visit https://groups.google.com/d/optout.

Re: Why does creating a repository fail?

iSCSI can be mounted as a block device that you can format however you
want, if you do it that way the uid problem won't show up as the system
sees it as a local FS.

On 17 March 2015 at 09:00, David Reagan  wrote:

> @Mark Walkom, So, I'm looking into iscsi. From what I have learned so far,
> you actually format the LUN with whatever file system you want. So,
> wouldn't the gid/uid issue show up there as well, if I formatted to ext3 or
> ext4? Since Ubuntu would treat it like a normal partition and use typical
> linux file perms on it.
>
> --David Reagan
>
> On Mon, Mar 16, 2015 at 5:37 PM, David Reagan  wrote:
>
>> If I were manually creating the elasticsearch user, that'd be easy. But
>> I'm relying on apt to do the job for me. So, yeah...
>>
>> Hmm... I suppose I could manually create an elasticsearch2 user, then
>> modify the defaults files to use it when running ES. Still seems clunky...
>>
>> --David Reagan
>>
>> On Mon, Mar 16, 2015 at 5:20 PM, Andrew Selden  wrote:
>>
>>> I’m not that familiar with iSCSI so I hesitate to say for sure, but
>>> anytime you are cross-mounting filesystems on Linux you have to take
>>> uid/gid consistency into account.
>>>
>>> - Andrew
>>>
>>> On Mar 16, 2015, at 4:46 PM, David Reagan  wrote:
>>>
>>> Would an iSCSI mount have the same issue? I believe our SAN supports
>>> both.
>>>
>>> --David Reagan
>>>
>>> On Mon, Mar 16, 2015 at 4:40 PM, Andrew Selden 
>>> wrote:
>>>
 Hi David,

 This is a common problem with NFS. Unfortunately the protocol assumes
 identical uid/gid mappings across all machines. It’s just one of those
 annoying sys-admin tasks that one has to take into account when using NFS.
 To get your permissions back to less permissive settings you will have to
 edit the /etc/passwd and /etc/group files to keep them in sync.

 See http://www.tldp.org/HOWTO/NFS-HOWTO/troubleshooting.html#SYMPTOM4
 for more context.

 - Andrew


 On Mar 16, 2015, at 4:04 PM, David Reagan  wrote:

 First, it is a file permissions issue. I did get snapshots to run when
 I chmoded to 777. As you can see from the ls output, /mounts/prod_backup is
 777. Prior to that it was 775 or 755. So, I could revise my question to
 "How can I get snapshots working without using insecure file permissions?"

 root@log-elasticsearch-01:~# mount
 /dev/mapper/ws--template--01-root on / type ext4 (rw,errors=remount-ro)
 proc on /proc type proc (rw,noexec,nosuid,nodev)
 sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
 none on /sys/fs/fuse/connections type fusectl (rw)
 none on /sys/kernel/debug type debugfs (rw)
 none on /sys/kernel/security type securityfs (rw)
 udev on /dev type devtmpfs (rw,mode=0755)
 devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
 tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
 none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
 none on /run/shm type tmpfs (rw,nosuid,nodev)
 /dev/sda1 on /boot type ext2 (rw)
 rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
 nfsip:/vol/Logs/prod_backup on /mounts/prod_backup type nfs
 (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
 nfsip:/vol/Logs/log-elasticsearch-01 on /mounts/log-elasticsearch-01
 type nfs (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)

 root@log-elasticsearch-01:~# ls -ld /mounts
 drwxr-xr-x 6 root root 4096 Oct  1 13:43 /mounts

 root@log-elasticsearch-01:~# ls -ld /mounts/prod_backup/
 drwxrwxrwx 4 elasticsearch elasticsearch 4096 Mar 16 13:41
 /mounts/prod_backup/

 --David Reagan

 On Mon, Mar 16, 2015 at 3:47 PM, Mark Walkom 
 wrote:

> Can you post the output from *mount* and *ls -ld /mounts
> /mounts/prod_backup*?
>
> On 16 March 2015 at 13:33, David Reagan  wrote:
>
>> Why does this happen?
>>
>>
>> curl -XPUT 'http://localhost:9200/_snapshot/my_backup?pretty=true'
>>> -d '{
>>> > "type": "fs",
>>> > "settings": {
>>> > "location": "/mounts/prod_backup/my_backup",
>>> > "compress": true
>>> > }
>>> > }'
>>> {
>>>   "error" :
>>> "RemoteTransportException[[log-elasticsearch-02][inet[/10.x.x.83:9300]][cluster:admin/repository/put]];
>>> nested: RepositoryVerificationException[[my_backup]
>>> [vxUQwUTCQwOaLyCy0eMK8A,
>>> 'RemoteTransportException[[log-elasticsearch-04][inet[/10.x.x.80:9300]][internal:admin/repository/verify]];
>>> nested: RepositoryVerificationException[[my_backup] store location
>>> [/mounts/prod_backup/my_backup] is not accessible on the node
>>> [[log-elasticsearch-04][vxUQwUTCQwOaLyCy0eMK8A][log-elasticsearch-04][inet[/10.x.x.80:9300;
>>> nested:
>>> FileNotFoundException[/mounts/prod_backup/my_backup/tests-yZ57gviiQUGS55tr_ULhhg-vxUQwUTCQwOaLyCy0eMK8A
>>> (Permission denied)]; '], [GMTt6Y-3Qle1F

Re[6]: Elasticsearch - node client does not connect to cluster

 I agree with you that in single node environment only transport layer should 
be used. But I want to know how to make node client work because maybe I will 
need it in future and I want to know what I can do with elastic java api.


Вторник, 17 марта 2015, 11:56 -06:00 от Aaron Mefford :
>What is the advantage you expect from using the Node client, especially in a 
>single node environment?
>
>With client.transport.sniff true it should discover the other nodes, if other 
>nodes exist.
>
>On Tue, Mar 17, 2015 at 11:42 AM, Александр Свиридов  < ooo_satu...@mail.ru > 
>wrote:
>>Thank you. I did this way:
>>
>> Settings settings = ImmutableSettings.settingsBuilder()
>>    .put(" cluster.name ", "elasticsearch")
>>    .put("client.transport.sniff", true).build();
>>
>>    Client client = new TransportClient(settings)
>>    .addTransportAddress(new 
>>InetSocketTransportAddress("localhost",9300));
>>
>>And everything works fine. So, both cluster and index exist.
>>
>>However, as I understand it is not node client. What you sugget is transport 
>>client. Now I want to understand how to make node client work. 
>>
>>
>>Вторник, 17 марта 2015, 11:26 -06:00 от Aaron Mefford < aa...@definemg.com >:
>>>This is what I use in my code, not sure how correct it is given the abysmal 
>>>state of the the Java API documentation.
>>>
>>>import org.elasticsearch.common.settings.Settings;
>>>import org.elasticsearch.common.settings.ImmutableSettings;
>>>import org.elasticsearch.client.Client;
>>>import org.elasticsearch.client.transport.TransportClient;
>>>import org.elasticsearch.common.transport.InetSocketTransportAddress;
>>>
>>>
>>>
>>>        Settings settings = ImmutableSettings.settingsBuilder()
>>>                                .put(" cluster.name ", elasticClusterName)
>>>                                .put("client.transport.sniff", true).build();
>>>
>>>        esClient = new TransportClient(settings)
>>>            .addTransportAddress(new 
>>>InetSocketTransportAddress(elasticHost,elasticPort));
>>>
>>>
>>>On Tue, Mar 17, 2015 at 11:19 AM, Александр Свиридов  < ooo_satu...@mail.ru 
 wrote:
I am quite newbie to elactis. Could you explain with java code what you 
mean?


Вторник, 17 марта 2015, 9:46 -07:00 от  aa...@definemg.com :
>Is there a reason not to just specify the IP address and to try and rely 
>on multicast?
>
>I realize this is all on one node as you have stated that, but that seems 
>even more reason that it would be little issue to specify the IP.  While 
>multicast makes it easy to stand up a cluster in an ideal situation, my 
>experience has been that it leads to more problems down the road, and 
>things generally work better when not using multicast.   I heard the same 
>suggestion repeatedly at Elastic{on}.
>
>Aaron
>
>On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>>I have one physical server and I work only on it (no other servers).
>>At this server I have running elastic 1.4.2 - I use this version as this 
>>is the last version elastic osgi bundle is ready for. Also at this server 
>>I have glassfish 4.1 as java-ee server.
>>I run elastic node client inside my java-ee application. And I do it this 
>>way:
>>Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
>>Client client = node.client();
>>GetResponse getResponse = 
>>client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
>>Map source = getResponse.getSource();
>>System.out.println("--");
>>System.out.println("Index: "+ getResponse.getIndex());
>>System.out.println("Type: "+ getResponse.getType());
>>System.out.println("Id: "+ getResponse.getId());
>>System.out.println("Version: "+ getResponse.getVersion());
>>System.out.println(source);
>>
>>In log I see the following:
>>>[2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.discovery] [tid: _ThreadID=30 
>>>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue: 
>>>800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>>>[2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.cluster.service] [tid: _ThreadID=128 
>>>_ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]] 
>>>[timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new 
>>>[Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][ webserver1.com 
>>>][local[1]]{local=true}}, removed {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][ 
>>>webserver1.com ][local[1]]{local=true},}, reason: 
>>>local-disco-initial_connect(master)]]
>>>[2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.http] [tid: _ThreadID=30 
>>>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664502] [levelValue: 
>>>80

Re: Why does creating a repository fail?

As has been mentioned, use uid remapping when mounting.

On 16 March 2015 at 17:37, David Reagan  wrote:

> If I were manually creating the elasticsearch user, that'd be easy. But
> I'm relying on apt to do the job for me. So, yeah...
>
> Hmm... I suppose I could manually create an elasticsearch2 user, then
> modify the defaults files to use it when running ES. Still seems clunky...
>
> --David Reagan
>
> On Mon, Mar 16, 2015 at 5:20 PM, Andrew Selden  wrote:
>
>> I’m not that familiar with iSCSI so I hesitate to say for sure, but
>> anytime you are cross-mounting filesystems on Linux you have to take
>> uid/gid consistency into account.
>>
>> - Andrew
>>
>> On Mar 16, 2015, at 4:46 PM, David Reagan  wrote:
>>
>> Would an iSCSI mount have the same issue? I believe our SAN supports
>> both.
>>
>> --David Reagan
>>
>> On Mon, Mar 16, 2015 at 4:40 PM, Andrew Selden  wrote:
>>
>>> Hi David,
>>>
>>> This is a common problem with NFS. Unfortunately the protocol assumes
>>> identical uid/gid mappings across all machines. It’s just one of those
>>> annoying sys-admin tasks that one has to take into account when using NFS.
>>> To get your permissions back to less permissive settings you will have to
>>> edit the /etc/passwd and /etc/group files to keep them in sync.
>>>
>>> See http://www.tldp.org/HOWTO/NFS-HOWTO/troubleshooting.html#SYMPTOM4
>>> for more context.
>>>
>>> - Andrew
>>>
>>>
>>> On Mar 16, 2015, at 4:04 PM, David Reagan  wrote:
>>>
>>> First, it is a file permissions issue. I did get snapshots to run when I
>>> chmoded to 777. As you can see from the ls output, /mounts/prod_backup is
>>> 777. Prior to that it was 775 or 755. So, I could revise my question to
>>> "How can I get snapshots working without using insecure file permissions?"
>>>
>>> root@log-elasticsearch-01:~# mount
>>> /dev/mapper/ws--template--01-root on / type ext4 (rw,errors=remount-ro)
>>> proc on /proc type proc (rw,noexec,nosuid,nodev)
>>> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
>>> none on /sys/fs/fuse/connections type fusectl (rw)
>>> none on /sys/kernel/debug type debugfs (rw)
>>> none on /sys/kernel/security type securityfs (rw)
>>> udev on /dev type devtmpfs (rw,mode=0755)
>>> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
>>> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
>>> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
>>> none on /run/shm type tmpfs (rw,nosuid,nodev)
>>> /dev/sda1 on /boot type ext2 (rw)
>>> rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
>>> nfsip:/vol/Logs/prod_backup on /mounts/prod_backup type nfs
>>> (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>> nfsip:/vol/Logs/log-elasticsearch-01 on /mounts/log-elasticsearch-01
>>> type nfs (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>>
>>> root@log-elasticsearch-01:~# ls -ld /mounts
>>> drwxr-xr-x 6 root root 4096 Oct  1 13:43 /mounts
>>>
>>> root@log-elasticsearch-01:~# ls -ld /mounts/prod_backup/
>>> drwxrwxrwx 4 elasticsearch elasticsearch 4096 Mar 16 13:41
>>> /mounts/prod_backup/
>>>
>>> --David Reagan
>>>
>>> On Mon, Mar 16, 2015 at 3:47 PM, Mark Walkom 
>>> wrote:
>>>
 Can you post the output from *mount* and *ls -ld /mounts
 /mounts/prod_backup*?

 On 16 March 2015 at 13:33, David Reagan  wrote:

> Why does this happen?
>
>
> curl -XPUT 'http://localhost:9200/_snapshot/my_backup?pretty=true' -d
>> '{
>> > "type": "fs",
>> > "settings": {
>> > "location": "/mounts/prod_backup/my_backup",
>> > "compress": true
>> > }
>> > }'
>> {
>>   "error" :
>> "RemoteTransportException[[log-elasticsearch-02][inet[/10.x.x.83:9300]][cluster:admin/repository/put]];
>> nested: RepositoryVerificationException[[my_backup]
>> [vxUQwUTCQwOaLyCy0eMK8A,
>> 'RemoteTransportException[[log-elasticsearch-04][inet[/10.x.x.80:9300]][internal:admin/repository/verify]];
>> nested: RepositoryVerificationException[[my_backup] store location
>> [/mounts/prod_backup/my_backup] is not accessible on the node
>> [[log-elasticsearch-04][vxUQwUTCQwOaLyCy0eMK8A][log-elasticsearch-04][inet[/10.x.x.80:9300;
>> nested:
>> FileNotFoundException[/mounts/prod_backup/my_backup/tests-yZ57gviiQUGS55tr_ULhhg-vxUQwUTCQwOaLyCy0eMK8A
>> (Permission denied)]; '], [GMTt6Y-3Qle1Fm3SGl-LTQ,
>> 'RemoteTransportException[[log-estools-01][inet[/10.x.x.8:9300]][internal:admin/repository/verify]];
>> nested: RepositoryVerificationException[[my_backup] store location
>> [/mounts/prod_backup/my_backup] is not accessible on the node
>> [[log-estools-01][GMTt6Y-3Qle1Fm3SGl-LTQ][log-estools-01][inet[/10.x.x.8:9300]]{data=false}]];
>> nested:
>> FileNotFoundException[/mounts/prod_backup/my_backup/tests-yZ57gviiQUGS55tr_ULhhg-GMTt6Y-3Qle1Fm3SGl-LTQ
>> (Permission denied)]; '], [ffpuQF_zRZGGPRkZRgq1mw,
>> 'RemoteTransportException[[log-elasticsearch-03][inet[/10.x.x.92:9300]

Re: Data not indexed into ElasticSearch from RabbitMQ

I'd recommend that you use Logstash with the rabbitmq input instead. Rivers
are being deprecated so fewer people will likely be able to help.

On 17 March 2015 at 10:23, Olalekan Elesin 
wrote:

> After proper setting up RabbitMQ river for elasticsearch, I issued the
> command GET :9200/_river/my_river/status,
>
> {
>
> "_index": "_river",
>
> "_type": "my_river",
>
> "_id": "_status",
>
> "_version": 2,
>
> "found": true,
>
> "_source": {
>
> "node": {
>
> "id": "-nA8mbDEQ4e3l4HVqlIToA",
>
> "name": "Skullfire",
>
> "transport_address": "inet[/:9300]"
>
> }
>
> }
>
> }
>
> but data is shown to be indexed. Please help.
>
> Thank you.
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/124cd5c2-3de4-441d-978d-6243eb7fe22d%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X_Lvi5%2BfZmxBB%2B_ALNn0jMMoiw_RV38QMp5y9JkHrfSkg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Number of shards in 4 node Cluster

2015-03-17 Thread John S

Hi All,

Is there any best practices of having on the number of shards for a 
cluster? I have a 4 node cluster and used shards of 20.

During any node failure or other events i doubts since the shards number is 
high, replication to new node is taking more time...

Is there any metrics or formula to be done for number or shards?

Regards
John

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6e51f1e4-8938-4196-84a9-007705869b6a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Re[4]: Elasticsearch - node client does not connect to cluster

2015-03-17 Thread Aaron Mefford

What is the advantage you expect from using the Node client, especially in
a single node environment?

With client.transport.sniff true it should discover the other nodes, if
other nodes exist.

On Tue, Mar 17, 2015 at 11:42 AM, Александр Свиридов 
wrote:

> Thank you. I did this way:
>
>  Settings settings = ImmutableSettings.settingsBuilder()
> .put("cluster.name", "elasticsearch")
> .put("client.transport.sniff",
> true).build();
>
> Client client = new TransportClient(settings)
> .addTransportAddress(new
> InetSocketTransportAddress("localhost",9300));
>
> And everything works fine. So, both cluster and index exist.
>
> However, as I understand it is not node client. What you sugget is
> transport client. Now I want to understand how to make node client work.
>
>
> Вторник, 17 марта 2015, 11:26 -06:00 от Aaron Mefford  >:
>
>   This is what I use in my code, not sure how correct it is given the
> abysmal state of the the Java API documentation.
>
> import org.elasticsearch.common.settings.Settings;
> import org.elasticsearch.common.settings.ImmutableSettings;
> import org.elasticsearch.client.Client;
> import org.elasticsearch.client.transport.TransportClient;
> import org.elasticsearch.common.transport.InetSocketTransportAddress;
>
>
>
> Settings settings = ImmutableSettings.settingsBuilder()
> .put("cluster.name", elasticClusterName)
> .put("client.transport.sniff",
> true).build();
>
> esClient = new TransportClient(settings)
> .addTransportAddress(new
> InetSocketTransportAddress(elasticHost,elasticPort));
>
>
> On Tue, Mar 17, 2015 at 11:19 AM, Александр Свиридов  > wrote:
>
> I am quite newbie to elactis. Could you explain with java code what you
> mean?
>
>
> Вторник, 17 марта 2015, 9:46 -07:00 от aa...@definemg.com
> :
>
>   Is there a reason not to just specify the IP address and to try and
> rely on multicast?
>
> I realize this is all on one node as you have stated that, but that seems
> even more reason that it would be little issue to specify the IP.  While
> multicast makes it easy to stand up a cluster in an ideal situation, my
> experience has been that it leads to more problems down the road, and
> things generally work better when not using multicast.   I heard the same
> suggestion repeatedly at Elastic{on}.
>
> Aaron
>
> On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>
> I have one physical server and I work only on it (no other servers).
>
> At this server I have running elastic 1.4.2 - I use this version as this
> is the last version elastic osgi bundle is ready for. Also at this server I
> have glassfish 4.1 as java-ee server.
>
> I run elastic node client inside my java-ee application. And I do it this
> way:
>
> Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
> Client client = node.client();
> GetResponse getResponse = 
> client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
> Map source = getResponse.getSource();
> System.out.println("--");
> System.out.println("Index: "+ getResponse.getIndex());
> System.out.println("Type: "+ getResponse.getType());
> System.out.println("Id: "+ getResponse.getId());
> System.out.println("Version: "+ getResponse.getVersion());
> System.out.println(source);
>
>
>
> In log I see the following:
>
> [2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.discovery] [tid: _ThreadID=30
> _ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue:
> 800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>
> [2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.cluster.service] [tid: _ThreadID=128
> _ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]]
> [timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new
> [Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][webserver1.com][local[1]]{local=true}},
> removed 
> {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][webserver1.com][local[1]]{local=true},},
> reason: local-disco-initial_connect(master)]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.http] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] bound_address
> {inet[/0:0:0:0:0:0:0:0:9202]}, publish_address {inet[/SERVER IP:9202]}]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.node] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] started]]
>
> and I get this exeption: ...
>
> Caused by: 
> org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
> at 
> org.elasticsearch.cluster.metadata.Meta

Re[4]: Elasticsearch - node client does not connect to cluster

 Thank you. I did this way:

 Settings settings = ImmutableSettings.settingsBuilder()
    .put("cluster.name", "elasticsearch")
    .put("client.transport.sniff", true).build();

    Client client = new TransportClient(settings)
    .addTransportAddress(new 
InetSocketTransportAddress("localhost",9300));

And everything works fine. So, both cluster and index exist.

However, as I understand it is not node client. What you sugget is transport 
client. Now I want to understand how to make node client work. 


Вторник, 17 марта 2015, 11:26 -06:00 от Aaron Mefford :
>This is what I use in my code, not sure how correct it is given the abysmal 
>state of the the Java API documentation.
>
>import org.elasticsearch.common.settings.Settings;
>import org.elasticsearch.common.settings.ImmutableSettings;
>import org.elasticsearch.client.Client;
>import org.elasticsearch.client.transport.TransportClient;
>import org.elasticsearch.common.transport.InetSocketTransportAddress;
>
>
>
>        Settings settings = ImmutableSettings.settingsBuilder()
>                                .put(" cluster.name ", elasticClusterName)
>                                .put("client.transport.sniff", true).build();
>
>        esClient = new TransportClient(settings)
>            .addTransportAddress(new 
>InetSocketTransportAddress(elasticHost,elasticPort));
>
>
>On Tue, Mar 17, 2015 at 11:19 AM, Александр Свиридов  < ooo_satu...@mail.ru > 
>wrote:
>>I am quite newbie to elactis. Could you explain with java code what you mean?
>>
>>
>>Вторник, 17 марта 2015, 9:46 -07:00 от  aa...@definemg.com :
>>>Is there a reason not to just specify the IP address and to try and rely on 
>>>multicast?
>>>
>>>I realize this is all on one node as you have stated that, but that seems 
>>>even more reason that it would be little issue to specify the IP.  While 
>>>multicast makes it easy to stand up a cluster in an ideal situation, my 
>>>experience has been that it leads to more problems down the road, and things 
>>>generally work better when not using multicast.   I heard the same 
>>>suggestion repeatedly at Elastic{on}.
>>>
>>>Aaron
>>>
>>>On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
I have one physical server and I work only on it (no other servers).
At this server I have running elastic 1.4.2 - I use this version as this is 
the last version elastic osgi bundle is ready for. Also at this server I 
have glassfish 4.1 as java-ee server.
I run elastic node client inside my java-ee application. And I do it this 
way:
Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
Client client = node.client();
GetResponse getResponse = 
client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
Map source = getResponse.getSource();
System.out.println("--");
System.out.println("Index: "+ getResponse.getIndex());
System.out.println("Type: "+ getResponse.getType());
System.out.println("Id: "+ getResponse.getId());
System.out.println("Version: "+ getResponse.getVersion());
System.out.println(source);

In log I see the following:
>[2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] [] 
>[org.elasticsearch.discovery] [tid: _ThreadID=30 
>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue: 
>800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>[2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] [] 
>[org.elasticsearch.cluster.service] [tid: _ThreadID=128 
>_ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]] 
>[timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new 
>[Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][ webserver1.com ][local[1]]{local=true}}, 
>removed {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][ webserver1.com 
>][local[1]]{local=true},}, reason: local-disco-initial_connect(master)]]
>[2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
>[org.elasticsearch.http] [tid: _ThreadID=30 
>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664502] [levelValue: 
>800] [[ [Pistol] bound_address {inet[/0:0:0:0:0:0:0:0:9202]}, 
>publish_address {inet[/SERVER IP:9202]}]]
>[2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
>[org.elasticsearch.node] [tid: _ThreadID=30 
>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664502] [levelValue: 
>800] [[ [Pistol] started]]
and I get this exeption: ...
Caused by: 
 org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
at 
org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:768)
at 
org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:691)
at 
org.elasticsearch.cluster.metadata.MetaData.concreteSingleIndex(MetaData.java:748)
at 
org.elasticsearch.action.support.single.shard.TransportShardS

Re: Re[2]: Elasticsearch - node client does not connect to cluster

2015-03-17 Thread Aaron Mefford

This is what I use in my code, not sure how correct it is given the abysmal
state of the the Java API documentation.

import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;



Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", elasticClusterName)
.put("client.transport.sniff",
true).build();

esClient = new TransportClient(settings)
.addTransportAddress(new
InetSocketTransportAddress(elasticHost,elasticPort));


On Tue, Mar 17, 2015 at 11:19 AM, Александр Свиридов 
wrote:

> I am quite newbie to elactis. Could you explain with java code what you
> mean?
>
>
> Вторник, 17 марта 2015, 9:46 -07:00 от aa...@definemg.com:
>
>   Is there a reason not to just specify the IP address and to try and
> rely on multicast?
>
> I realize this is all on one node as you have stated that, but that seems
> even more reason that it would be little issue to specify the IP.  While
> multicast makes it easy to stand up a cluster in an ideal situation, my
> experience has been that it leads to more problems down the road, and
> things generally work better when not using multicast.   I heard the same
> suggestion repeatedly at Elastic{on}.
>
> Aaron
>
> On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>
> I have one physical server and I work only on it (no other servers).
>
> At this server I have running elastic 1.4.2 - I use this version as this
> is the last version elastic osgi bundle is ready for. Also at this server I
> have glassfish 4.1 as java-ee server.
>
> I run elastic node client inside my java-ee application. And I do it this
> way:
>
> Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
> Client client = node.client();
> GetResponse getResponse = 
> client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
> Map source = getResponse.getSource();
> System.out.println("--");
> System.out.println("Index: "+ getResponse.getIndex());
> System.out.println("Type: "+ getResponse.getType());
> System.out.println("Id: "+ getResponse.getId());
> System.out.println("Version: "+ getResponse.getVersion());
> System.out.println(source);
>
>
>
> In log I see the following:
>
> [2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.discovery] [tid: _ThreadID=30
> _ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue:
> 800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>
> [2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.cluster.service] [tid: _ThreadID=128
> _ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]]
> [timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new
> [Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][webserver1.com][local[1]]{local=true}},
> removed 
> {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][webserver1.com][local[1]]{local=true},},
> reason: local-disco-initial_connect(master)]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.http] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] bound_address
> {inet[/0:0:0:0:0:0:0:0:9202]}, publish_address {inet[/SERVER IP:9202]}]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
> [org.elasticsearch.node] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] started]]
>
> and I get this exeption: ...
>
> Caused by: 
> org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:768)
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:691)
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteSingleIndex(MetaData.java:748)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:139)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
> at 
> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
> at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
> at 
> org.elasticsearch.client.support.AbstractClient.get(AbstractClient.jav

Data not indexed into ElasticSearch from RabbitMQ

2015-03-17 Thread Olalekan Elesin

After proper setting up RabbitMQ river for elasticsearch, I issued the 
command GET :9200/_river/my_river/status, 

{

"_index": "_river",

"_type": "my_river",

"_id": "_status",

"_version": 2,

"found": true,

"_source": {

"node": {

"id": "-nA8mbDEQ4e3l4HVqlIToA",

"name": "Skullfire",

"transport_address": "inet[/:9300]"

}

}

}

but data is shown to be indexed. Please help. 

Thank you.


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/124cd5c2-3de4-441d-978d-6243eb7fe22d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re[2]: Elasticsearch - node client does not connect to cluster

 I am quite newbie to elactis. Could you explain with java code what you mean?


Вторник, 17 марта 2015, 9:46 -07:00 от aa...@definemg.com:
>Is there a reason not to just specify the IP address and to try and rely on 
>multicast?
>
>I realize this is all on one node as you have stated that, but that seems even 
>more reason that it would be little issue to specify the IP.  While multicast 
>makes it easy to stand up a cluster in an ideal situation, my experience has 
>been that it leads to more problems down the road, and things generally work 
>better when not using multicast.   I heard the same suggestion repeatedly at 
>Elastic{on}.
>
>Aaron
>
>On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>>I have one physical server and I work only on it (no other servers).
>>At this server I have running elastic 1.4.2 - I use this version as this is 
>>the last version elastic osgi bundle is ready for. Also at this server I have 
>>glassfish 4.1 as java-ee server.
>>I run elastic node client inside my java-ee application. And I do it this way:
>>Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
>>Client client = node.client();
>>GetResponse getResponse = 
>>client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
>>Map source = getResponse.getSource();
>>System.out.println("--");
>>System.out.println("Index: "+ getResponse.getIndex());
>>System.out.println("Type: "+ getResponse.getType());
>>System.out.println("Id: "+ getResponse.getId());
>>System.out.println("Version: "+ getResponse.getVersion());
>>System.out.println(source);
>>
>>In log I see the following:
>>>[2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.discovery] [tid: _ThreadID=30 
>>>_ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue: 
>>>800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>>>[2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.cluster.service] [tid: _ThreadID=128 
>>>_ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]] 
>>>[timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new 
>>>[Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][ webserver1.com ][local[1]]{local=true}}, 
>>>removed {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][ webserver1.com 
>>>][local[1]]{local=true},}, reason: local-disco-initial_connect(master)]]
>>>[2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.http] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)] 
>>>[timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] bound_address 
>>>{inet[/0:0:0:0:0:0:0:0:9202]}, publish_address {inet[/SERVER IP:9202]}]]
>>>[2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
>>>[org.elasticsearch.node] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)] 
>>>[timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] started]]
>>and I get this exeption: ...
>>Caused by: 
>> org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
at 
org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:768)
at 
org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:691)
at 
org.elasticsearch.cluster.metadata.MetaData.concreteSingleIndex(MetaData.java:748)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:139)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
at 
org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
at 
org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
at 
org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
at 
org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
at 
org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
at 
org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
>>
>>So it can't find the index - my.index-0.2.2. However this index exists! 
>>Besides, when I do curl -XGET ' http://localhost:9200/_cluster/state?pretty=1 
>>' I see there only one node and this  is not SCKIrGHQTaC5eEYmYfZ0Iw. I 
>>suppose that the node I create using java API creates new cluster and dosn't 
>>connect to my existing cluster - that's why it says - it's master. Or I don't 
>>understand something I have problem with code. Besides I've checked tha name 
>>of cluster it's elasticsearch. So, how can I connect to my existing 
>>elasticsearch cluster?
>-- 
>You

Re: Elasticsearch - node client does not connect to cluster

We do recommend to use unicast in production.

On 17 March 2015 at 09:46,  wrote:

> Is there a reason not to just specify the IP address and to try and rely
> on multicast?
>
> I realize this is all on one node as you have stated that, but that seems
> even more reason that it would be little issue to specify the IP.  While
> multicast makes it easy to stand up a cluster in an ideal situation, my
> experience has been that it leads to more problems down the road, and
> things generally work better when not using multicast.   I heard the same
> suggestion repeatedly at Elastic{on}.
>
> Aaron
>
> On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>>
>> I have one physical server and I work only on it (no other servers).
>>
>> At this server I have running elastic 1.4.2 - I use this version as this
>> is the last version elastic osgi bundle is ready for. Also at this server I
>> have glassfish 4.1 as java-ee server.
>>
>> I run elastic node client inside my java-ee application. And I do it this
>> way:
>>
>> Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
>> Client client = node.client();
>> GetResponse getResponse = 
>> client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
>> Map source = getResponse.getSource();
>> System.out.println("--");
>> System.out.println("Index: "+ getResponse.getIndex());
>> System.out.println("Type: "+ getResponse.getType());
>> System.out.println("Id: "+ getResponse.getId());
>> System.out.println("Version: "+ getResponse.getVersion());
>> System.out.println(source);
>>
>>
>>
>> In log I see the following:
>>
>> [2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] []
>> [org.elasticsearch.discovery] [tid: _ThreadID=30
>> _ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue:
>> 800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>>
>> [2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] []
>> [org.elasticsearch.cluster.service] [tid: _ThreadID=128
>> _ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]]
>> [timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new
>> [Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][webserver1.com][local[1]]{local=true}},
>> removed {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][webserver1.com
>> ][local[1]]{local=true},}, reason: local-disco-initial_connect(master)]]
>>
>> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
>> [org.elasticsearch.http] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
>> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] bound_address
>> {inet[/0:0:0:0:0:0:0:0:9202]}, publish_address {inet[/SERVER IP:9202]}]]
>>
>> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] []
>> [org.elasticsearch.node] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)]
>> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] started]]
>>
>> and I get this exeption: ...
>>
>> Caused by: 
>> org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
>> at 
>> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:768)
>> at 
>> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:691)
>> at 
>> org.elasticsearch.cluster.metadata.MetaData.concreteSingleIndex(MetaData.java:748)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:139)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
>> at 
>> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
>> at 
>> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
>> at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
>> at 
>> org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
>> at 
>> org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
>> at 
>> org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
>> at 
>> org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
>>
>> So it can't find the index - my.index-0.2.2. However this index exists!
>> Besides, when I do curl -XGET 'http://localhost:9200/_
>> cluster/state?pretty=1' I see there only one node and this *is not*
>> SCKIrGHQTaC5eEYmYfZ0Iw. I suppose that the node I create using java API
>> creates new cluster and dosn't connect to my existing cluster - that's why
>> it says - it's master. Or I don't understand something I have problem with
>> code. Besides I've checked tha name of cluster it's elasticsearch. So,
>> how

Re: ES performance tunning

Take a look at
http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster.html
for other settings.

On 17 March 2015 at 00:48, Hoon Cho  wrote:

>
> I was searching for ES performance by google and I find some documents.
> They says modify ES config is good for ES performance.
> So, I edit my ES config like below.
>
> */etc/elasticsearch/elasticsearch.yml*
> index.number_of_replica: 0
> index.number_of_shards: 3
> index.translog.flush_threshold_ops: 5
> index.refresh_interval: -1
> indices.memory.index_buffer_size: 50%
> index.store.type: mmapfs
> bootstrap.mlockall: true
>
> */etc/default/elasticsearch*
> ES_HEAP_SIZE: 4g (my machine has 8g RAM)
> MAX_LOCKED_MEMORY=unlimited
>
>
> After configure above restart ES and LS.
>
> *# /etc/init.d/elasticsearch restart && sudo restart logstash*
>
>
> And make sure ES setting is correctly using curl command.
>
> *# curl 'localhost:9200/logstash-iis-test04/_settings?pretty'*
> {
>   "logstash-iis-test04" : {
> "settings" : {
>   "index" : {
> "creation_date" : "1426555104697",
> "uuid" : "0PuOIGj-RnKS9cMKXbsryQ",
> "number_of_replicas" : "0",
> "number_of_shards" : "3",
> "refresh_interval" : "5s",
> "version" : {
>   "created" : "1010199"
> }
>   }
> }
>   }
> }
>
> As you see, index.number_of_replicas and index.number_of_shards values is
> correct
> but , index.refresh_interval is not correct (I set this value is -1)
> and another setting is not shown, where can I find another settings?
>
> I want to see settings like index.translog.flush_threshold_ops,
> index.refresh_interval, indices.memory.index_buffer_size,
> index.store.type, bootstrap.mlockall..
> and want to know this settings is correctly applied.
>
> Maybe you know why this result is shown, please advice to me.
>
> Regards
>
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/b7f46e6b-d212-4d0d-b577-86e0056438ee%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X89p%3DELJC%3D1PLCjVwdx2XKboWHPKQvC2mdJ_N%2BNSPXBVA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

issue with singleton analyzer in single JVM multi-index setup

2015-03-17 Thread Dmitry Kan

Hello!

I'm a newbie in elasticsearch, so forgive if the question is lame.

I have implemented a custom plugin using a custom lemmatizer and a
tokenizer. The simplified class sequence:

AnalysisMorphologyPlugin->MorphologyAnalysisBinderProcessor->SemanticAnalyzerTwitterLemmatizerProvider->RussianLemmatizingTwitterAnalyzer

In the RussianLemmatizingTwitterAnalyzer's ctor I load the custom object for
lemmatization (object unrelated to lucene/es) in a singleton fashion (in a
syncrhonized code block).
Then, when creating 14 indices in the same JVM I see
14 instances of RussianLemmatizingTwitterAnalyzer,
4 instances of SemanticAnalyzerTwitterLemmatizerProvider,
4 instances of MorphologyAnalysisBinderProcessor,
30 instances of the custom lemmatizer (in each
RussianLemmatizingTwitterAnalyzer only one instance is expected, so should be
14),
1 instance of AnalysisMorphologyPlugin.

The question is, can RussianLemmatizingTwitterAnalyzer object be made shared
between indices? Or is it by design, that they must load separately per index?
What could be wrong in the code that makes 30 instances of the custom singleton
lemmatizer instead of 14?

The current standing is that *with* the plugin 100M of RAM is reserved by the
JVM with no data. *Without* the plugin the JVM reserves 2M with no data.
Elasticsearch 1.3.2, Lucene 4.9.0.

Regards,

Dmitry Kan

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/7e0b09a0-c88c-4c56-bc8f-1b895d534cc0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Elasticsearch - node client does not connect to cluster

2015-03-17 Thread aaron

Is there a reason not to just specify the IP address and to try and rely on 
multicast?

I realize this is all on one node as you have stated that, but that seems 
even more reason that it would be little issue to specify the IP.  While 
multicast makes it easy to stand up a cluster in an ideal situation, my 
experience has been that it leads to more problems down the road, and 
things generally work better when not using multicast.   I heard the same 
suggestion repeatedly at Elastic{on}.

Aaron

On Tuesday, March 17, 2015 at 9:25:46 AM UTC-6, ooo_saturn7 wrote:
>
> I have one physical server and I work only on it (no other servers).
>
> At this server I have running elastic 1.4.2 - I use this version as this 
> is the last version elastic osgi bundle is ready for. Also at this server I 
> have glassfish 4.1 as java-ee server.
>
> I run elastic node client inside my java-ee application. And I do it this 
> way:
>
> Node node = nodeBuilder().local(true).clusterName("elasticsearch").node();
> Client client = node.client();
> GetResponse getResponse = 
> client.prepareGet("my.index-0.2.2","post","1").execute().actionGet();
> Map source = getResponse.getSource();
> System.out.println("--");
> System.out.println("Index: "+ getResponse.getIndex());
> System.out.println("Type: "+ getResponse.getType());
> System.out.println("Id: "+ getResponse.getId());
> System.out.println("Version: "+ getResponse.getVersion());
> System.out.println(source);
>
>
>
> In log I see the following:
>
> [2015-03-17T12:57:44.447+0400] [glassfish 4.1] [INFO] [] 
> [org.elasticsearch.discovery] [tid: _ThreadID=30 
> _ThreadName=http-listener-1(1)] [timeMillis: 1426582664447] [levelValue: 
> 800] [[ [Pistol] elasticsearch/SCKIrGHQTaC5eEYmYfZ0Iw]]
>
> [2015-03-17T12:57:44.449+0400] [glassfish 4.1] [INFO] [] 
> [org.elasticsearch.cluster.service] [tid: _ThreadID=128 
> _ThreadName=elasticsearch[Pistol][clusterService#updateTask][T#1]] 
> [timeMillis: 1426582664449] [levelValue: 800] [[ [Pistol] master {new 
> [Pistol][SCKIrGHQTaC5eEYmYfZ0Iw][webserver1.com][local[1]]{local=true}}, 
> removed 
> {[Pistol][uwaWFb6KTy2Sdoc8TNwdSQ][webserver1.com][local[1]]{local=true},}, 
> reason: local-disco-initial_connect(master)]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
> [org.elasticsearch.http] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)] 
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] bound_address 
> {inet[/0:0:0:0:0:0:0:0:9202]}, publish_address {inet[/SERVER IP:9202]}]]
>
> [2015-03-17T12:57:44.502+0400] [glassfish 4.1] [INFO] [] 
> [org.elasticsearch.node] [tid: _ThreadID=30 _ThreadName=http-listener-1(1)] 
> [timeMillis: 1426582664502] [levelValue: 800] [[ [Pistol] started]]
>
> and I get this exeption: ...
>
> Caused by: 
> org.elasticsearch.indices.IndexMissingException:[my.index-0.2.2] missing
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:768)
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteIndices(MetaData.java:691)
> at 
> org.elasticsearch.cluster.metadata.MetaData.concreteSingleIndex(MetaData.java:748)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:139)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:89)
> at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.doExecute(TransportShardSingleOperationAction.java:55)
> at 
> org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:75)
> at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:98)
> at 
> org.elasticsearch.client.support.AbstractClient.get(AbstractClient.java:193)
> at 
> org.elasticsearch.action.get.GetRequestBuilder.doExecute(GetRequestBuilder.java:201)
> at 
> org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:91)
> at 
> org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:65)
>
> So it can't find the index - my.index-0.2.2. However this index exists! 
> Besides, when I do curl -XGET '
> http://localhost:9200/_cluster/state?pretty=1' I see there only one node 
> and this *is not* SCKIrGHQTaC5eEYmYfZ0Iw. I suppose that the node I 
> create using java API creates new cluster and dosn't connect to my existing 
> cluster - that's why it says - it's master. Or I don't understand something 
> I have problem with code. Besides I've checked tha name of cluster it's 
> elasticsearch. So, how can I connect to my existing elasticsearch cluster?
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To

Re: Logstash Geohash Question

It'll be able to read geoip.coordinates if you point to it.

On 17 March 2015 at 09:07, Michael  wrote:

> What do you mean exactly?
>
> These are the fields I'm able to obtain, whereas geoip.coordinates is
> built by using
>
> add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
> add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
>
> in my logstash.conf.
>
>
> geoip.city_name Warsaw*t*geoip.continent_code EU*#*geoip.coordinates
> ["21.0","52.25"]*t*geoip.country_code2 PL*t*geoip.country_code3 POL*t*
> geoip.country_name Poland*t*geoip.ip 217.67.205.50*#*geoip.latitude 52.25
> *#*geoip.location [21,52.25]*#*geoip.longitude 21*t*geoip.real_region_name
>  Mazowieckie*t*geoip.region_name 78*t*geoip.timezone Europe/Warsaw
> Can please be so kind and post the part of your geoip filter in your
> logstash.conf where handle the building of fields in order to use in tile
> map of kb4?
>
> Thanks in advance
>
>
>
> Am Samstag, 7. März 2015 16:40:07 UTC+1 schrieb Mark Walkom:
>>
>> ES needs a single lat+lon field to read. It or KB won't combine things
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/633b740d-1f9d-43ed-acdd-670e30829503%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEYi1X9ytHKCe5SkTwC9ZPkSBkA%2Bn8cTdBCZq99TZqYqA7cCCg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: How do you run ES with limited data storage space?

2015-03-17 Thread aaron

While ES does compress by default, it also stores data in data structures, 
that increase the size of the data. The net is that your data will be much 
larger than the equivalent log file gzipped.  However, running logstash to 
ingest 1.5 years of logs may well take much longer than you would expect.

There is no reason you shouldn't be able to move snapshots off of your 
shared drive onto an external drive or other storage, such as S3.

One thing you should reconsider is what you are trying to do with your 
resources.  It sounds like it is simply too much.  If the budget cannot 
budge to accommodate the requirements, then the requirements must budge to 
accommodate the budget.  Perhaps you can identify some log sources that do 
not have the same retention requirements.  Perhaps it is some segment of 
your logs that is not as important.  For instance is it really important to 
keep that Java Stack trace from a year ago?  Now I don't know the nature of 
your logs, but I do know the nature of logs, and there are important log 
entries, and there are mundane repetitive entries.  What I am driving at is 
that leveraging the ability of using ES aliasing and cross index searching 
you can segment your logs into important indexes and not important.  You 
can still search across all the indexes, but you can establish retention 
policies which differ for the less important, while preserving the precious 
resources you have for the important.

Some data you can take an RRD style approach with and create indexes that 
have summary information in them which will allow you to generate 
historical dashboards that still capture the essence of the day, if not the 
detail.  For instance while you could not show the individual requests on a 
given day, you could still show the request volume over a three year period.

While this goes against the nature of the e logging efforts, these are some 
of the ideas I had while reading about your situation.

Aaron

On Monday, March 16, 2015 at 6:42:43 PM UTC-6, Mark Walkom wrote:
>
> There's not a lot you can do here unless you want to start uploading 
> snapshots to S3, or something else that is not on your NAS.
> ES does compress by default and we are working on using a better algorithm 
> for future releases which will help, but there's no ETA for that.
>
> On 16 March 2015 at 17:29, David Reagan > 
> wrote:
>
>> So, I haven't figured out the right search terms to find the answer via 
>> Google yet, I've read a lot of the docs on the subject of Snapshot and 
>> Restore without finding an answer, and I haven't had the time or resources 
>> to test some of my own ideas. Hence, I'm posting this in the hopes that 
>> someone who has already solved this problem will share. 
>>
>> How do you run ES with limited data storage space?
>>
>> Basically, short of getting more space, what can I do to make the best 
>> use of what I have, and still meet as many of my goals as possible?
>>
>> My setup is 4 data nodes. Due to lack of resources/money, they are all 
>> thin provisioned VMs, and all my data has to be on NFS/SAN mounts. Storing 
>> data on the actual VM's hard disk would negatively effect other VMs and 
>> services.
>>
>> Our NFS SAN is also low on space. So I only have about 1.5TB to use. 
>> Initially this seemed like plenty, but a couple weeks ago, ES started 
>> complaining about running out of space. Usage on that mount was over 80%. 
>> My snapshot repository had ballooned to over 700GB, and each node's data 
>> mount point was around 150GB. 
>>
>> Currently, I'm only using ES for logs.
>>
>> For day to day use, I should be fine with 1 month of open indices. Thus, 
>> I've been keeping older indices closed already. So I can't really do much 
>> more when it comes to closing indices.
>>
>> I also run the optimize command nightly on any logstash index older that 
>> a couple days.
>>
>> I'd just delete the really old data, but I have use cases for data up to 
>> 1.5 years old. Considering that snapshots of only a few months nearly used 
>> up all my space, and how much space a month of logs is currently taking up, 
>> I'm not sure how I can store that much data.
>>
>> So, in general, how would you solve my problem? I need to have immediate 
>> access to 1 months worth of logs (via Kibana), be able to relatively 
>> quickly access up to 6 months of logs (open closed indices?), and access up 
>> to 1.5 years worth temporarily (restore snapshots to new cluster on my 
>> desktop?)
>>
>> Would there be a way to move snapshots off of the NFS SAN to an external 
>> hard drive? 
>>
>> Should I tell logstash to send logs to a text file that get's logrotated 
>> for a year and a half? Or does ES do a good enough job with compression 
>> that gzipping wouldn't help? If it was just a text file, I could unzip it, 
>> then tell Logstash to read the file into an ES cluster.
>>
>> ES already compresses stored indices by default, right? So there's 
>> nothing I can do there?
>>
>>
>>  -- 
>> You r

Elasticsearch ICU Analysis plugin for 1.4.3 / proper Lucene version

2015-03-17 Thread JZ

Dear all,

I am wondering whether you can provide a compiled version of the ICU
Analysis plugin for Elasticsearch 1.4.3. I have tried to install the plugin
version 1.4.2 on ES 1.4.3 but then I get this error on restarting:

cannot start plugin due to incorrect Lucene version: plugin [4.10.3], node
[4.10.2].

See:
https://github.com/elastic/elasticsearch-analysis-icu

I have tried to compile it from source, but then I get Maven dependency
errors returned.

Thanks in advance!

/JZ

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAA%2BD3eXq7jKOFOEJjBftQRCK%3DJ%2Bw6%2BUw7PHrjMWG8_w%2B%3Dshv%3DA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Logstash Geohash Question

2015-03-17 Thread Michael

What do you mean exactly?

These are the fields I'm able to obtain, whereas geoip.coordinates is built 
by using 

add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]

in my logstash.conf.


geoip.city_name Warsaw*t*geoip.continent_code EU*#*geoip.coordinates 
["21.0","52.25"]*t*geoip.country_code2 PL*t*geoip.country_code3 POL*t*
geoip.country_name Poland*t*geoip.ip 217.67.205.50*#*geoip.latitude 52.25*#*
geoip.location [21,52.25]*#*geoip.longitude 21*t*geoip.real_region_name 
Mazowieckie*t*geoip.region_name 78*t*geoip.timezone Europe/Warsaw
Can please be so kind and post the part of your geoip filter in your 
logstash.conf where handle the building of fields in order to use in tile 
map of kb4?

Thanks in advance



Am Samstag, 7. März 2015 16:40:07 UTC+1 schrieb Mark Walkom:
>
> ES needs a single lat+lon field to read. It or KB won't combine things
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/633b740d-1f9d-43ed-acdd-670e30829503%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Operator "and" in highlighting

2015-03-17 Thread Nikolas Everett

On Tue, Mar 17, 2015 at 8:56 AM, Vlad Zaitsev  wrote:

But it seems that highlighter ignore operator: “and” and highlight any term
> from queries.
>
>
Its much more than that.  For the most part highlighters reduce the query
to a list of terms blindly.  Some do phrases.  They don't really have that
nuanced a view of the query itself.

Its because highlighting is totally decoupled from the actual search
portion of the job - its more like a recheck.  And Lucene isn't built to
cleanly plug the highlighters into the queries.  So they have tons of
instanceof style hacks to get the job done.  Its not super pleasant.

Nik

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPmjWd0t%3DHwt-mMpTSbw6B9vLBwDk8%3DsZS88F3g%2BfXATgC_SGw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: PayloadTermQuery in ElasticSearch

2015-03-17 Thread Nikolas Everett

I imagine the right way to do this is with a plugin but I'm not 100% sure.

On Tue, Mar 17, 2015 at 11:47 AM, Devaraja Swami 
wrote:

> I plan to store floats in the payload and boost the score
> (multiplicatively) based on the average value of the payloads over the
> occurrences of the matching term in the document. ie., exactly as in
> AveragePayloadFunction in Lucene.
>
> On Tue, Mar 17, 2015 at 2:16 AM, joergpra...@gmail.com <
> joergpra...@gmail.com> wrote:
>
>> The concrete implementation depends on what you store in the payload
>> (e.g. scores)
>>
>> Jörg
>>
>> On Tue, Mar 17, 2015 at 7:01 AM, Devaraja Swami 
>> wrote:
>>
>>> I need to use PayloadTermQuery from Lucene.
>>> Does anyone know how I can use this in ElasticSearch?
>>> I am using ES 1.4.4, with the Java API.
>>> In Lucene, I could use this by directly instantiating PayloadTermQuery,
>>> but there are no APIs in ES QueryBuilders for this.
>>> I don't need a query parser, because I can build the query directly
>>> using the Java API (don't need a JSON representation of the query),
>>> so I only need to be able to construct, in Java, a query builder
>>> encapsulating a PayloadTermQuery.
>>>
>>> Thanks in advance!
>>>
>>> -devarajaswami
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/8fc84082-6fc7-42aa-9caf-8ab527bc8a0b%40googlegroups.com
>>> 
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFwk_Ve_OK9J%2BfsEzGwbtOnaL7%2BeqT%3DR61hCoX8Mzi-fQ%40mail.gmail.com
>> 
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CABMSir6hF%3DuM1jp0jgoBq_v30YNVB-8JLF7PLyvFjyXbdqtLvg%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAPmjWd2%2B-YFPsRVB0QywSuMFEVXL-UgQyxJRGBjGn4Lw0KWT4A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Why does creating a repository fail?

@Mark Walkom, So, I'm looking into iscsi. From what I have learned so far,
you actually format the LUN with whatever file system you want. So,
wouldn't the gid/uid issue show up there as well, if I formatted to ext3 or
ext4? Since Ubuntu would treat it like a normal partition and use typical
linux file perms on it.

--David Reagan

On Mon, Mar 16, 2015 at 5:37 PM, David Reagan  wrote:

> If I were manually creating the elasticsearch user, that'd be easy. But
> I'm relying on apt to do the job for me. So, yeah...
>
> Hmm... I suppose I could manually create an elasticsearch2 user, then
> modify the defaults files to use it when running ES. Still seems clunky...
>
> --David Reagan
>
> On Mon, Mar 16, 2015 at 5:20 PM, Andrew Selden  wrote:
>
>> I’m not that familiar with iSCSI so I hesitate to say for sure, but
>> anytime you are cross-mounting filesystems on Linux you have to take
>> uid/gid consistency into account.
>>
>> - Andrew
>>
>> On Mar 16, 2015, at 4:46 PM, David Reagan  wrote:
>>
>> Would an iSCSI mount have the same issue? I believe our SAN supports
>> both.
>>
>> --David Reagan
>>
>> On Mon, Mar 16, 2015 at 4:40 PM, Andrew Selden  wrote:
>>
>>> Hi David,
>>>
>>> This is a common problem with NFS. Unfortunately the protocol assumes
>>> identical uid/gid mappings across all machines. It’s just one of those
>>> annoying sys-admin tasks that one has to take into account when using NFS.
>>> To get your permissions back to less permissive settings you will have to
>>> edit the /etc/passwd and /etc/group files to keep them in sync.
>>>
>>> See http://www.tldp.org/HOWTO/NFS-HOWTO/troubleshooting.html#SYMPTOM4
>>> for more context.
>>>
>>> - Andrew
>>>
>>>
>>> On Mar 16, 2015, at 4:04 PM, David Reagan  wrote:
>>>
>>> First, it is a file permissions issue. I did get snapshots to run when I
>>> chmoded to 777. As you can see from the ls output, /mounts/prod_backup is
>>> 777. Prior to that it was 775 or 755. So, I could revise my question to
>>> "How can I get snapshots working without using insecure file permissions?"
>>>
>>> root@log-elasticsearch-01:~# mount
>>> /dev/mapper/ws--template--01-root on / type ext4 (rw,errors=remount-ro)
>>> proc on /proc type proc (rw,noexec,nosuid,nodev)
>>> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
>>> none on /sys/fs/fuse/connections type fusectl (rw)
>>> none on /sys/kernel/debug type debugfs (rw)
>>> none on /sys/kernel/security type securityfs (rw)
>>> udev on /dev type devtmpfs (rw,mode=0755)
>>> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
>>> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
>>> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
>>> none on /run/shm type tmpfs (rw,nosuid,nodev)
>>> /dev/sda1 on /boot type ext2 (rw)
>>> rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
>>> nfsip:/vol/Logs/prod_backup on /mounts/prod_backup type nfs
>>> (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>> nfsip:/vol/Logs/log-elasticsearch-01 on /mounts/log-elasticsearch-01
>>> type nfs (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>>
>>> root@log-elasticsearch-01:~# ls -ld /mounts
>>> drwxr-xr-x 6 root root 4096 Oct  1 13:43 /mounts
>>>
>>> root@log-elasticsearch-01:~# ls -ld /mounts/prod_backup/
>>> drwxrwxrwx 4 elasticsearch elasticsearch 4096 Mar 16 13:41
>>> /mounts/prod_backup/
>>>
>>> --David Reagan
>>>
>>> On Mon, Mar 16, 2015 at 3:47 PM, Mark Walkom 
>>> wrote:
>>>
 Can you post the output from *mount* and *ls -ld /mounts
 /mounts/prod_backup*?

 On 16 March 2015 at 13:33, David Reagan  wrote:

> Why does this happen?
>
>
> curl -XPUT 'http://localhost:9200/_snapshot/my_backup?pretty=true' -d
>> '{
>> > "type": "fs",
>> > "settings": {
>> > "location": "/mounts/prod_backup/my_backup",
>> > "compress": true
>> > }
>> > }'
>> {
>>   "error" :
>> "RemoteTransportException[[log-elasticsearch-02][inet[/10.x.x.83:9300]][cluster:admin/repository/put]];
>> nested: RepositoryVerificationException[[my_backup]
>> [vxUQwUTCQwOaLyCy0eMK8A,
>> 'RemoteTransportException[[log-elasticsearch-04][inet[/10.x.x.80:9300]][internal:admin/repository/verify]];
>> nested: RepositoryVerificationException[[my_backup] store location
>> [/mounts/prod_backup/my_backup] is not accessible on the node
>> [[log-elasticsearch-04][vxUQwUTCQwOaLyCy0eMK8A][log-elasticsearch-04][inet[/10.x.x.80:9300;
>> nested:
>> FileNotFoundException[/mounts/prod_backup/my_backup/tests-yZ57gviiQUGS55tr_ULhhg-vxUQwUTCQwOaLyCy0eMK8A
>> (Permission denied)]; '], [GMTt6Y-3Qle1Fm3SGl-LTQ,
>> 'RemoteTransportException[[log-estools-01][inet[/10.x.x.8:9300]][internal:admin/repository/verify]];
>> nested: RepositoryVerificationException[[my_backup] store location
>> [/mounts/prod_backup/my_backup] is not accessible on the node
>> [[log-estools-01][GMTt6Y-3Qle1Fm3SGl-LTQ][log-estools-01][inet[/10.x.x.8:

Re: PayloadTermQuery in ElasticSearch

2015-03-17 Thread Devaraja Swami

I plan to store floats in the payload and boost the score
(multiplicatively) based on the average value of the payloads over the
occurrences of the matching term in the document. ie., exactly as in
AveragePayloadFunction in Lucene.

On Tue, Mar 17, 2015 at 2:16 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:

> The concrete implementation depends on what you store in the payload (e.g.
> scores)
>
> Jörg
>
> On Tue, Mar 17, 2015 at 7:01 AM, Devaraja Swami 
> wrote:
>
>> I need to use PayloadTermQuery from Lucene.
>> Does anyone know how I can use this in ElasticSearch?
>> I am using ES 1.4.4, with the Java API.
>> In Lucene, I could use this by directly instantiating PayloadTermQuery,
>> but there are no APIs in ES QueryBuilders for this.
>> I don't need a query parser, because I can build the query directly using
>> the Java API (don't need a JSON representation of the query),
>> so I only need to be able to construct, in Java, a query builder
>> encapsulating a PayloadTermQuery.
>>
>> Thanks in advance!
>>
>> -devarajaswami
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/8fc84082-6fc7-42aa-9caf-8ab527bc8a0b%40googlegroups.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFwk_Ve_OK9J%2BfsEzGwbtOnaL7%2BeqT%3DR61hCoX8Mzi-fQ%40mail.gmail.com
> 
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CABMSir6hF%3DuM1jp0jgoBq_v30YNVB-8JLF7PLyvFjyXbdqtLvg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Kibana 4.0.1 / ES 1.4.4 - time field name

2015-03-17 Thread Micah Yoder

I had a field called _timestamp, which I had to add in the meta-fields list 
in the advanced settings. Maybe similar?

On Tuesday, March 17, 2015 at 10:10:51 AM UTC-5, Moshe Recanati wrote:
>
> Hi
> I would like to use Kibana. I'm able to load my index however it didn't 
> find time-field name.
> I a saw it search for '@timestamp'.
> I'm using Java and ObjectMapper to write my data into ES.
> I would like to know which field I need to define in order to have this 
> time-field.
>
> Thank you,
> Moshe
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ddf76692-8454-4f09-b682-ecd8b49cf932%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Elasticsearch - node client does not connect to cluster