Re: Elasticsearch 0.90 installation with .rpm and logging

2014-01-09 Thread Alexander Reelsen
Hey,

this is a result of the package files being spread around the filesystem
instead of kept centrally in one directory, as in the zip file
distribution. Usually the init script takes care of all of this, and you do
not have to add those path configurations manually. Is there any specific
reason you don't use the init script in /etc/init.d/elasticsearch?


--Alex


On Fri, Jan 10, 2014 at 5:53 AM, Srividhya Umashanker <
srividhya.umashan...@gmail.com> wrote:

> I had to add the following to /usr/share/elasticsearch/bin/elasticsearch
> so that any startup of Elasticsearch will pick it up:
>
>
> ES_JAVA_OPTS="-Des.config=/etc/elasticsearch/elasticsearch.yml
> -Des.path.conf=/etc/elasticsearch/ -Des.path.home=/usr/share/elasticsearch
> -Des.path.logs=/var/log/elasticsearch -Des.path.data=/var/lib/elasticsearch
> -Des.path.work=/tmp/elasticsearch
> -Des.path.plugins=/usr/share/elasticsearch/plugins"
>
>
> Is this a defect in the RPM distribution? We do not want to edit anything
> post-installation, because our installation is automated using yum.
>
> Should I raise a defect for this?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/2b39d430-e761-4e73-99c4-6de3a3501dd8%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>



Re: Is there a help document about bigdesk plugin?

2014-01-09 Thread Lukáš Vlček
Hi,

An explanation of how to use it can be found on the GitHub pages or the
bigdesk.org web site. There is no single document explaining the individual
charts, but we can start creating one. Feel free to ask.

Regards,
Lukáš
On 10 Jan 2014 at 7:35, "Eric Lu" wrote:

> Or some detail introduction about the various charts?
>



Is there a help document about bigdesk plugin?

2014-01-09 Thread Eric Lu
Or some detail introduction about the various charts?



Re: Elasticsearch 0.90 installation with .rpm and logging

2014-01-09 Thread Srividhya Umashanker
I had to add the following to /usr/share/elasticsearch/bin/elasticsearch
so that any startup of Elasticsearch will pick it up:


ES_JAVA_OPTS="-Des.config=/etc/elasticsearch/elasticsearch.yml 
-Des.path.conf=/etc/elasticsearch/ -Des.path.home=/usr/share/elasticsearch 
-Des.path.logs=/var/log/elasticsearch -Des.path.data=/var/lib/elasticsearch 
-Des.path.work=/tmp/elasticsearch 
-Des.path.plugins=/usr/share/elasticsearch/plugins"


Is this a defect in the RPM distribution? We do not want to edit anything
post-installation, because our installation is automated using yum.

Should I raise a defect for this?



Re: java client: typeExists() returns false after successful bulk index - why?

2014-01-09 Thread Nikita Tovstoles
Yes! Adding a sleep after Future.get() of the bulk op 'fixed' my test.
Thank you.

What you said about the bulk op being submitted but not yet processed makes
sense (perhaps there is a separate API to query an op's completion
status?), but what is puzzling to me is that the comments in the source of
*BulkResponse* seem to imply it is constructed *after* the op completes:

Holding a response for each item responding (in order) of the
 * bulk requests. Each item holds the index/type/id is operated on, and if
it failed or not (with the
 * failure message).

Thus I was expecting that by the time *ListenableActionFuture.get()*
returns, the op has actually completed (not just been submitted). Otherwise
the status properties in the embedded BulkItemResponse would not be useful,
right?
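On that point, a toy Python model may help picture the gap (an editorial sketch with made-up names, not Elasticsearch code): the bulk call acknowledges each item as accepted, but documents only become visible to search and to metadata calls such as typeExists() after a refresh.

```python
class TinyIndex:
    """Toy near-real-time index: accepted docs become searchable on refresh."""

    def __init__(self):
        self._buffer = []    # accepted, not yet searchable
        self._visible = []   # searchable after refresh

    def bulk(self, docs):
        # Each item is acknowledged immediately, like a BulkResponse item.
        self._buffer.extend(docs)
        return [{"_type": d["_type"], "ok": True} for d in docs]

    def refresh(self):
        self._visible.extend(self._buffer)
        self._buffer = []

    def type_exists(self, type_name):
        return any(d["_type"] == type_name for d in self._visible)

idx = TinyIndex()
resp = idx.bulk([{"_type": "foo", "a": "b"}])
assert resp[0]["ok"]                # the bulk response reports success...
assert not idx.type_exists("foo")   # ...but the type is not yet visible
idx.refresh()
assert idx.type_exists("foo")       # visible once a refresh has happened
```

In the real client, forcing an index refresh between the bulk call and the existence check (rather than sleeping) should close the same gap.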



On Thu, Jan 9, 2014 at 8:13 PM, InquiringMind wrote:

> A quick guess: The first one works because the first document for that
> type is indexed and therefore the type is created when the operation
> returns.
>
> But the second one doesn't work because there is a refresh interval
> between the completion of a bulk load operation and the actual document
> being added. And since it's the first document in the type, the type won't
> exist until that first document is indexed. Which is likely exactly what
> you want: Bulk operations need to defer until they are processed to allow
> for optimizations. I don't know Lucene internals, but a B+Tree loads vastly
> quicker when keys are presorted in bulk instead of added and committed one
> by one.
>
> The experts can chime in later, and if I'm wrong or off base anywhere I
> welcome the correction!
>
> Brian
>



Re: Searching indexed fields without analysing

2014-01-09 Thread Jun Ohtani
Hi Brian,

Thanks!

I understand that your query does not match anything at all.

The "query_string" query automatically changes query terms to lower case in
some cases, i.e. wildcard, prefix, fuzzy…
See:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html

I changed your query as follows:

{
  "query" : {
    "bool" : {
      "must" : {
        "query_string" : {
          "query" : "text.na:Immortal-Li*",
          "lowercase_expanded_terms" : false
        }
      }
    }
  }
}

It then returns the two documents.
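A toy illustration in plain Python of why this flag matters (an editorial sketch, not Elasticsearch internals): with lowercase_expanded_terms at its default of true, the wildcard term is lowercased before being compared against the stored, case-preserving not_analyzed value.

```python
# A not_analyzed field stores its value verbatim, case included.
stored_value = "Immortal-Lives forever"
wildcard_prefix = "Immortal-Li"   # from the query text.na:Immortal-Li*

# The raw prefix matches the stored value...
assert stored_value.startswith(wildcard_prefix)
# ...but the lowercased form, which the default setting produces, does not.
assert not stored_value.startswith(wildcard_prefix.lower())
```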

I've learned a great deal!

Regards,



Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani




On 2014/01/10 at 13:07, InquiringMind wrote:

> If it helps, here are my index settings and mappings. Note that I chose the 
> name text.na as the non-analyzed form, not text.raw. Perhaps I should follow 
> convention? But for now, a rose by any other name is still not analyzed:
> 
> {
>   "settings" : {
> "index" : {
>   "number_of_shards" : 1,
>   "refresh_interval" : "1s",
>   "analysis" : {
> "char_filter" : { },
> "filter" : {
>   "english_snowball_filter" : {
> "type" : "snowball",
> "language" : "English"
>   }
> },
> "analyzer" : {
>   "english_stemming_analyzer" : {
> "type" : "custom",
> "tokenizer" : "standard",
> "filter" : [ "standard", "lowercase", "asciifolding", 
> "english_snowball_filter" ]
>   },
>   "english_standard_analyzer" : {
> "type" : "custom",
> "tokenizer" : "standard",
> "filter" : [ "standard", "lowercase", "asciifolding" ]
>   }
> }
>   }
> }
>   },
>   "mappings" : {
> "_default_" : {
>   "dynamic" : "strict"
> },
> "ghost" : {
>   "_all" : {
> "enabled" : false
>   },
>   "_ttl" : {
> "enabled" : true,
> "default" : "1.9m"
>   },
>   "properties" : {
> "cn" : {
>   "type" : "string",
>   "analyzer" : "english_stemming_analyzer"
> },
> "text" : {
>   "type" : "multi_field",
>   "fields" : {
> "text" : {
>   "type" : "string",
>   "analyzer" : "english_stemming_analyzer",
>   "position_offset_gap" : 4
> },
> "std" : {
>   "type" : "string",
>   "analyzer" : "english_standard_analyzer",
>   "position_offset_gap" : 4
> },
> "na" : {
>   "type" : "string",
>   "index" : "not_analyzed"
> }
>   }
> }
>   }
> },
> "elf" : {
>   "_all" : {
> "enabled" : false
>   },
>   "_ttl" : {
> "enabled" : true
>   },
>   "properties" : {
> "cn" : {
>   "type" : "string",
>   "analyzer" : "english_stemming_analyzer"
> },
> "text" : {
>   "type" : "multi_field",
>   "fields" : {
> "text" : {
>   "type" : "string",
>   "analyzer" : "english_stemming_analyzer",
>   "position_offset_gap" : 4
> },
> "std" : {
>   "type" : "string",
>   "analyzer" : "english_standard_analyzer",
>   "position_offset_gap" : 4
> },
> "na" : {
>   "type" : "string",
>   "index" : "not_analyzed"
> }
>   }
> }
>   }
> }
>   }
> }
> 
> 
> Brian
> 
> On Thursday, January 9, 2014 10:38:15 PM UTC-5, Jun Ohtani wrote:
> Hi Chris, 
> 
> I reproduced your issue in the following gist:
> 
> https://gist.github.com/johtani/8346404 
> 
> And I tried changing the query as follows:
> 
> User_Name.raw:bob.smith-jones -> matches 
> User_Name.raw:bob.smi* -> matches 
> User_Name.raw:bob.smith-j* -> matches 
> User_Name.raw:bob.smith\-j* -> matches 
> 
> I use User_Name.raw field instead of User_Name. 
> 
> Sorry, it is not necessary to escape after all…
> 
> And I don't know why Brian's example query_string query does not work…
> 
> 
> 





Re: java client: typeExists() returns false after successful bulk index - why?

2014-01-09 Thread InquiringMind
A quick guess: The first one works because the first document for that type 
is indexed and therefore the type is created when the operation returns.

But the second one doesn't work because there is a refresh interval between 
the completion of a bulk load operation and the actual document being 
added. And since it's the first document in the type, the type won't exist 
until that first document is indexed. Which is likely exactly what you 
want: Bulk operations need to defer until they are processed to allow for 
optimizations. I don't know Lucene internals, but a B+Tree loads vastly 
quicker when keys are presorted in bulk instead of added and committed one 
by one.

The experts can chime in later, and if I'm wrong or off base anywhere I 
welcome the correction!

Brian



Re: Searching indexed fields without analysing

2014-01-09 Thread InquiringMind
If it helps, here are my index settings and mappings. Note that I chose the 
name text.na as the non-analyzed form, not text.raw. Perhaps I should 
follow convention? But for now, a rose by any other name is still not 
analyzed:

{
  "settings" : {
"index" : {
  "number_of_shards" : 1,
  "refresh_interval" : "1s",
  "analysis" : {
"char_filter" : { },
"filter" : {
  "english_snowball_filter" : {
"type" : "snowball",
"language" : "English"
  }
},
"analyzer" : {
  "english_stemming_analyzer" : {
"type" : "custom",
"tokenizer" : "standard",
"filter" : [ "standard", "lowercase", "asciifolding", 
"english_snowball_filter" ]
  },
  "english_standard_analyzer" : {
"type" : "custom",
"tokenizer" : "standard",
"filter" : [ "standard", "lowercase", "asciifolding" ]
  }
}
  }
}
  },
  "mappings" : {
"_default_" : {
  "dynamic" : "strict"
},
"ghost" : {
  "_all" : {
"enabled" : false
  },
  "_ttl" : {
"enabled" : true,
"default" : "1.9m"
  },
  "properties" : {
"cn" : {
  "type" : "string",
  "analyzer" : "english_stemming_analyzer"
},
"text" : {
  "type" : "multi_field",
  "fields" : {
"text" : {
  "type" : "string",
  "analyzer" : "english_stemming_analyzer",
  "position_offset_gap" : 4
},
"std" : {
  "type" : "string",
  "analyzer" : "english_standard_analyzer",
  "position_offset_gap" : 4
},
"na" : {
  "type" : "string",
  "index" : "not_analyzed"
}
  }
}
  }
},
"elf" : {
  "_all" : {
"enabled" : false
  },
  "_ttl" : {
"enabled" : true
  },
  "properties" : {
"cn" : {
  "type" : "string",
  "analyzer" : "english_stemming_analyzer"
},
"text" : {
  "type" : "multi_field",
  "fields" : {
"text" : {
  "type" : "string",
  "analyzer" : "english_stemming_analyzer",
  "position_offset_gap" : 4
},
"std" : {
  "type" : "string",
  "analyzer" : "english_standard_analyzer",
  "position_offset_gap" : 4
},
"na" : {
  "type" : "string",
  "index" : "not_analyzed"
}
  }
}
  }
}
  }
}


Brian

On Thursday, January 9, 2014 10:38:15 PM UTC-5, Jun Ohtani wrote:
>
> Hi Chris, 
>
> I reproduced your issue in the following gist:
>
> https://gist.github.com/johtani/8346404 
>
> And I tried changing the query as follows:
>
> User_Name.raw:bob.smith-jones -> matches 
> User_Name.raw:bob.smi* -> matches 
> User_Name.raw:bob.smith-j* -> matches 
> User_Name.raw:bob.smith\-j* -> matches 
>
> I use User_Name.raw field instead of User_Name. 
>
> Sorry, it is not necessary to escape after all…
>
> And I don't know why Brian's example query_string query does not work…
>
>
>



Re: 1.0.0 Beta2 - GET children for Parent/Child does not seem to work

2014-01-09 Thread David Pilato
Try with adding ?routing=PARENTID
where PARENTID is equal to the parent ID for a given child
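To sketch why the routing parameter matters (an editorial toy model; the hash below is made up and is not Elasticsearch's real routing function): a document lives on shard hash(routing) % number_of_shards, and routing defaults to the document id. A child indexed with routing set to its parent's id can therefore sit on a shard that a routing-less GET, which hashes the child's own id, never checks.

```python
NUM_SHARDS = 5  # matches the "total" : 5 shown in the search output

def shard_for(routing):
    # Stand-in for Elasticsearch's real routing hash function.
    return sum(ord(c) for c in routing) % NUM_SHARDS

parent_id = "100"            # hypothetical parent id
children = ["1", "2", "3"]   # the three child ids from the test

# A GET without ?routing= looks on shard_for(child_id), but each child was
# actually stored on shard_for(parent_id). Only a child whose own id happens
# to hash to the parent's shard is found.
found = [c for c in children if shard_for(c) == shard_for(parent_id)]
assert len(found) == 1  # mirrors the thread: one of three children found
```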

HTH

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


On 10 Jan 2014 at 01:09, Yuri Panchenko wrote:

Hi,

I'm doing a simple test with 1.0.0 Beta 2. I've indexed a parent record and
three children. The head plugin shows all the children, and the search
endpoint returns all three children with different ids. But, for some strange
reason, I can only GET one of the children by id. Does someone have a clue, or
could this be a bug?


curl -XGET localhost:9200/d3/transactions/_search?pretty
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
  },
  "hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [ {
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "3",
  "_score" : 1.0, "_source" : { "date" : "2012-12-01", "description" : 
"Nail polish", "amount" : 80.00}
}, {
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "2",
  "_score" : 1.0, "_source" : { "date" : "2012-10-14", "description" : 
"Nail polish", "amount" : 70.00}
}, {
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "1",
  "_score" : 1.0, "_source" : { "date" : "2013-01-01", "description" : 
"Nail polish", "amount" : 75.50}
} ]
  }
}

 curl -XGET localhost:9200/d3/transactions/1?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "1",
  "_version" : 2,
  "exists" : true, "_source" : { "date" : "2013-01-01", "description" : "Nail 
polish", "amount" : 75.50}
}


curl -XGET localhost:9200/d3/transactions/2?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "2",
  "exists" : false
}


curl -XGET localhost:9200/d3/transactions/3?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "3",
  "exists" : false
}





Re: Searching indexed fields without analysing

2014-01-09 Thread Jun Ohtani
Hi Chris,

I reproduced your issue in the following gist:

https://gist.github.com/johtani/8346404

And I tried changing the query as follows:

User_Name.raw:bob.smith-jones -> matches
User_Name.raw:bob.smi* -> matches
User_Name.raw:bob.smith-j* -> matches
User_Name.raw:bob.smith\-j* -> matches

I use User_Name.raw field instead of User_Name.

Sorry, it is not necessary to escape after all…

And I don't know why Brian's example query_string query does not work…

Does this make sense?
Is my understanding mistaken?



Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani




On 2014/01/10 at 2:09, InquiringMind wrote:

> Chris,
> 
> I updated one of my tests to reproduce your issue. My text field is a 
> multi-field where text.na is the text field without any analysis at all.
> 
> This Lucene query does not find anything at all:
> 
> {
>   "bool" : {
> "must" : {
>   "query_string" : {
> "query" : "text.na:Immortal-Li*"
>   }
> }
>   }
> }
> 
> But this one works fine:
> 
> {
>   "bool" : {
> "must" : {
>   "prefix" : {
> "text.na" : {
>   "prefix" : "Immortal-Li"
> }
>   }
> }
>   }
> }
> 
> And returns the two documents that I expected:
> 
> { "_index" : "mortal" , "_type" : "elf" , "_id" : "1" , "_version" : 1 , 
> "_score" : 1.0 , "_source" :
>{ "cn" : "Celeborn" , "text" : "Immortal-Lives forever" } }
> 
> { "_index" : "mortal" , "_type" : "elf" , "_id" : "2" , "_version" : 1 , 
> "_score" : 1.0 , "_source" :
>{ "cn" : "Galadriel" , "text" : "Immortal-Lives forever" } }
> 
> Note that in both cases, the query's case must match since the field value is 
> not analyzed at all.
> 
> I'm not sure if this is a true bug. In general, I find Lucene syntax somewhat 
> useful for ad-hoc queries, and I find their so-called Simple Query Parser 
> syntax to be completely unable to find anything when there is no _all field, 
> whether or not I specify a default field. (But that's another issue I'm going 
> to ask about in the near future.)
> 
> Brian
> 
> On Thursday, January 9, 2014 8:27:04 AM UTC-5, Chris H wrote:
> Hi, Jun.
> 
> That doesn't seem to work.  For a user with the username bob.smith-jones:
>   • bob.smith-jones -> matches
>   • bob.smith- -> matches
>   • bob.smi* -> matches
>   • bob.smith-j* -> no results
>   • bob.smith\-j* -> no results
> Also, a "$" isn't one of the special characters.
> 
> Thanks.
> 





Improving Relevancy for Exact matches

2014-01-09 Thread hemant pahilwani
The example below might explain the issue I am facing.

I am searching for Jimmy John across the first name and last name fields,
and I want the result to look like:

*fname*  *lname*
Jimmy    John
John     Jimmy
John     Mayer
John     Yule
...

but I am getting the following result:

*fname*  *lname*
John     Jimmy
John     Mayer
John     Yule
Jimmy    John
...

I am trying to get Jimmy John displayed at the top, but I am not sure why
John Jimmy is displayed first. Is there a way to fine-tune Elasticsearch to
return Jimmy John first, i.e. so that the sequence of the input query
phrase matches the sequence of the fields passed?

Both fname and lname use the *standard* analyzer. Below is the multi_match
query that I am using:

{
  "from" : 0,
  "size" : 300,
  "query" : {
"multi_match" : {
  "query" : "Jimmy John",
  "fields" : [ "fname", "lname" ],
  "use_dis_max" : false
}
  },
  "min_score" : 0.15,
  "explain" : true
}
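One common way to bias the ranking toward the entered order (an editorial sketch, not tested against a live cluster; the boost values are arbitrary) is to keep the multi_match as the required clause and add optional should clauses that reward the first term appearing in fname and the second in lname. Shown here as a Python dict for readability:

```python
query = {
    "from": 0,
    "size": 300,
    "query": {
        "bool": {
            # The original query still decides which documents match.
            "must": {
                "multi_match": {
                    "query": "Jimmy John",
                    "fields": ["fname", "lname"],
                    "use_dis_max": False,
                }
            },
            # "should" clauses only add score, so a doc with "Jimmy" in
            # fname and "John" in lname floats to the top without
            # excluding any other matches.
            "should": [
                {"match": {"fname": {"query": "Jimmy", "boost": 2}}},
                {"match": {"lname": {"query": "John", "boost": 2}}},
            ],
        }
    },
    "min_score": 0.15,
    "explain": True,
}
assert set(query["query"]["bool"]) == {"must", "should"}
```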




java client: typeExists() returns false after successful bulk index - why?

2014-01-09 Thread Nikita Tovstoles
ES newbie here. I noticed that typeExists() queries return false after a
successful bulk index() but don't understand why. Is that expected? I'm
using the Java client v0.90.9. Thanks in advance!

non-bulk works:

@Test
public void testTypeExists()
{
    assertFalse(admin().indices().prepareTypesExists(INDEX_NAME)
        .setTypes("foo").get().isExists());

    assertEquals("foo",
        client().prepareIndex().setIndex(INDEX_NAME).setType("foo").setId("1")
            .setSource("{\"a\":\"b\"}").get().getType());

    assertTrue(admin().indices().prepareTypesExists(INDEX_NAME)
        .setTypes("foo").get().isExists()); // RETURNS TRUE AS EXPECTED
}


bulk fails:

@Test
public void testTypeExistsAfterBulkIndex()
{
    assertFalse(admin().indices().prepareTypesExists(INDEX_NAME)
        .setTypes("foo").get().isExists());

    assertEquals("foo",
        client().prepareBulk()
            .add(client().prepareIndex().setIndex(INDEX_NAME).setType("foo").setId("1")
                .setSource("{\"a\":\"b\"}"))
            .execute().actionGet().getItems()[0].getType()); // SUCCEEDS

    assertTrue(admin().indices().prepareTypesExists(INDEX_NAME)
        .setTypes("foo").get().isExists()); // FAILS
}



Running out of memory when parsing the large text file.

2014-01-09 Thread Ivan Ji
Hi all,

I post several large text files, each about 20~30 MB of plain text, into
ES, and I use the attachment mapper as the field type for storing these
files. It costs a lot of memory: even when I post one file, the used memory
grows from about 150 MB to 250 MB. BTW, I use the default tokenizer for
these fields.

Although such a file can generate many tokens, what I don't understand is
the memory cost. Does it store all the tokens in memory?

Ideas?

Cheers,

Ivan



Re: How's the encoding handling power of ES?

2014-01-09 Thread HongXuan Ji
Hi Jason,

Thanks for the reply. I read the post.

I am also wondering how the encoding process in ES works, and what
underlying encoding ES uses to store data.

Do you have any documents about this?

Thanks!

Regards,

Ivan
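Editorial note on the question above: as far as I know, the REST layer simply expects UTF-8 encoded JSON, and _source is stored as the bytes you send, so there is normally nothing to declare per document. A minimal Python sketch of the client-side rule:

```python
import json

doc = {"title": "café 測試"}  # mixed-script content, no declaration needed

# Ensure the bytes on the wire are UTF-8 encoded JSON.
payload = json.dumps(doc, ensure_ascii=False).encode("utf-8")

# The round trip is lossless, regardless of the language of the text.
assert json.loads(payload.decode("utf-8")) == doc
```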

On Thursday, January 9, 2014 at 10:08:26 PM UTC+8, Jason Wee wrote:
>
> There is example in index and query in this SO 
> http://stackoverflow.com/questions/8734888/how-to-search-for-utf-8-special-characters-in-elasticsearch
>
> hth
>
> Jason
>
>
> On Thu, Jan 9, 2014 at 5:13 PM, HongXuan Ji 
> > wrote:
>
>> Hi all, 
>>
>> I am wondering how Elasticsearch deals with documents in different
>> encodings, such as different languages.
>> Could you provide me some tutorial about it? Do I need to manually 
>> specify the encoding format of the document when posting?
>>
>> Best,
>>
>> Ivan
>>
>
>



Re: Uncertain how to properly filter when certain fields do not always exist

2014-01-09 Thread Phil Barresi
Following advice I've received from StackOverflow poster Sloan Ahrens, I
made a test that worked on a test index, but not on a main index. There, I
was getting false matches.

I've added my index settings and found that, somehow, the addition of the
relatedProfiles mapping makes it so that I get false hits. Can anyone
explain why that is?


On Thursday, January 9, 2014 10:17:29 AM UTC-5, Phil Barresi wrote:
>
> I am trying to filter based on a field that, on some objects, does not 
> exist. I was under the impression that ES would match objects that don't 
> have that field.
>
> Ultimately, I am trying to filter as such:
>
>- Field A will always exist, and should match on any of tags 1,2,3 
>- When it exists, either Field B or C must match any of tags 5,6,7 
>- When it exists, Field B must match any of tags 10, 11, 12 
>- When it exists, Field B or C must NOT have any of tags 15, 16, 18. 
>
> In this case, all my tags are strings. In addition, fields B and C are
> nested inside another object (X). I am uncertain if that matters.
>
> Essentially, my object is:
>
> { a: ["some", "tags", "here"],
> X : { 
> B: ["more", "tags", "here"],
> C: ["even", "more", "here"]
> }
> } 
>
>
> Ultimately, I am trying to build a whitelist and blacklist filtering 
> system.
>
> However, when filtering this way, I do not get any results that do not 
> contain field X.
>
> How do I properly format this filter? 
>
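An editorial sketch of one way to express the "match when present, or field absent" logic with the pre-1.0 filter DSL (hedged: written from memory, not run against a cluster; the field names follow the a / X.B / X.C example above), shown as a Python dict for readability:

```python
# "B or C must match tags 5,6,7 when X exists, and objects without X
# still pass": wrap the positive clauses and a `missing` filter in `or`.
clause_b_or_missing = {
    "or": [
        {"terms": {"X.B": ["5", "6", "7"]}},
        {"terms": {"X.C": ["5", "6", "7"]}},
        {"missing": {"field": "X"}},   # objects without X still match
    ]
}

whitelist_filter = {
    "bool": {
        "must": [
            {"terms": {"a": ["1", "2", "3"]}},  # field A always exists
            clause_b_or_missing,
        ],
        # Blacklist: docs whose B or C carry these tags are excluded;
        # docs without X have no such tags, so they are not excluded.
        "must_not": [
            {"terms": {"X.B": ["15", "16", "18"]}},
            {"terms": {"X.C": ["15", "16", "18"]}},
        ],
    }
}
assert "must_not" in whitelist_filter["bool"]
```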



[RUBY] : elasticsearch-ruby : Special characters not escaped by the library

2014-01-09 Thread Srirang Doddihal
Hi,

I tried out the elasticsearch Ruby gem today and found that it does not
escape the reserved characters when searching with the query_string query.

As a library providing an easy-to-use search API, wouldn't it be better if
the library escaped the reserved characters in this case? The API could
support a flag, with a sensible default value, to enable or disable this
escaping behavior.

Or is it an explicit design decision that users themselves have to escape
the reserved characters before sending a query to the library?

I am using v0.4.5.
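For what it's worth, escaping on the caller's side is only a few lines. Here is a sketch in Python (the gem's own API may differ; the character set is the Lucene query-syntax reserved list, and whether / needs escaping depends on the Lucene version):

```python
import re

# Lucene query_string reserved characters:
#   + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
_RESERVED = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')

def escape_query_string(text):
    # Backslash-escape each reserved character.
    return _RESERVED.sub(r'\\\1', text)

assert escape_query_string("bob.smith-jones") == r"bob.smith\-jones"
assert escape_query_string("a && b") == r"a \&\& b"
```

Note that escaping each & and | individually (rather than the && / || pair as a unit) is a common simplification; both forms are accepted by the query parser as far as I know.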

Regards,
Brahmana



1.0.0 Beta2 - GET children for Parent/Child does not seem to work

2014-01-09 Thread Yuri Panchenko
Hi,

I'm doing a simple test with 1.0.0 Beta 2. I've indexed a parent record
and three children. The head plugin shows all the children, and the search
endpoint returns all three children with different ids. But, for some
strange reason, I can only GET one of the children by id. Does someone
have a clue, or could this be a bug?


curl -XGET localhost:9200/d3/transactions/_search?pretty
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "d3",
      "_type" : "transactions",
      "_id" : "3",
      "_score" : 1.0,
      "_source" : { "date" : "2012-12-01", "description" : "Nail polish", "amount" : 80.00 }
    }, {
      "_index" : "d3",
      "_type" : "transactions",
      "_id" : "2",
      "_score" : 1.0,
      "_source" : { "date" : "2012-10-14", "description" : "Nail polish", "amount" : 70.00 }
    }, {
      "_index" : "d3",
      "_type" : "transactions",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : { "date" : "2013-01-01", "description" : "Nail polish", "amount" : 75.50 }
    } ]
  }
}

 curl -XGET localhost:9200/d3/transactions/1?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "1",
  "_version" : 2,
  "exists" : true, "_source" : { "date" : "2013-01-01", "description" : 
"Nail polish", "amount" : 75.50}
}


curl -XGET localhost:9200/d3/transactions/2?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "2",
  "exists" : false
}


curl -XGET localhost:9200/d3/transactions/3?pretty
{
  "_index" : "d3",
  "_type" : "transactions",
  "_id" : "3",
  "exists" : false
}


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/41376241-31ab-488a-bac8-19618cbc60be%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Upgrades causing Elastic Search downtime

2014-01-09 Thread Ivan Brusic
That is definitely not the behavior I have seen with elasticsearch. If
you restart a node with allocation disabled, the restarted node will have
no shards, and the shards that it should contain are marked as unassigned. I
have never seen a node reinitialize the shards it has.

Cheers,

Ivan


On Thu, Jan 9, 2014 at 3:58 PM, Mark Walkom wrote:

> That setting tells the nodes to hold the shards they currently have, and
> in the event of a node going down for a restart/upgrade, don't redistribute
> across the cluster.
> When you bring the rebooted/upgraded node back it'll locally reinitialise
> the shards it still has.
>
> You can set that setting back to false when you have completed the
> upgrades/restarts and the cluster can rebalance if it feels the need to.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQDJmdA8q_HC-Vsv4ucsj-p4AicAdBpz%3DZju6dohQuXhbw%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Upgrades causing Elastic Search downtime

2014-01-09 Thread Mark Walkom
That setting tells the nodes to hold the shards they currently have, and in
the event of a node going down for a restart/upgrade, don't redistribute
across the cluster.
When you bring the rebooted/upgraded node back it'll locally reinitialise
the shards it still has.

You can set that setting back to false when you have completed the
upgrades/restarts and the cluster can rebalance if it feels the need to.
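As a concrete sketch of the toggle described above (assuming a 0.90.x-era cluster, where the setting is still named `cluster.routing.allocation.disable_allocation`), the body sent to the cluster settings API before and after each restart would look like this:

```python
import json

def allocation_payload(disable):
    """Build the transient cluster-settings body that pauses (True) or
    resumes (False) shard allocation around a rolling restart."""
    return json.dumps({
        "transient": {
            "cluster.routing.allocation.disable_allocation": disable
        }
    })

# Before stopping a node, something like:
#   curl -XPUT localhost:9200/_cluster/settings -d '<allocation_payload(True)>'
# After the restarted node rejoins:
#   curl -XPUT localhost:9200/_cluster/settings -d '<allocation_payload(False)>'
print(allocation_payload(True))
```

Using a transient (rather than persistent) setting means the toggle is also cleared automatically by a full cluster restart.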

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 10 January 2014 04:07, Ivan Brusic  wrote:

> Perhaps I am missing some functionality since I am still on version
> 0.90.2, but wouldn't you have to disable/enable allocation after each
> server restart during a rolling upgrade? A restarted node will not host any
> shards with allocation disabled.
>
> Cheers,
>
> Ivan
>
>
> On Wed, Jan 8, 2014 at 5:48 PM, Mark Walkom wrote:
>
>> Disabling allocation is definitely a temporary-only change; you can set
>> it back once your upgrades are done.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 9 January 2014 02:47, Jenny Sivapalan wrote:
>>
>>> Thanks both for the replies. Our rebalance process doesn't take too long
>>> (~5 mins per node). I had some of the plugins (head, paramedic, bigdesk)
>>> open as I was closing down the old nodes and didn't see any split brain
>>> issue although I agree we can lead ourselves down this route by doubling
>>> the instances. We want our cluster to rebalance as we bring nodes in and
>>> out so disabling is not going to work for us unless I'm misunderstanding?
>>>
>>>
>>> On Tuesday, 7 January 2014 22:16:46 UTC, Mark Walkom wrote:

 You can also use cluster.routing.allocation.disable_allocation to
 reduce the need of waiting for things to rebalance.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 8 January 2014 04:41, Ivan Brusic  wrote:

> Although elasticsearch should support clusters of nodes with different
> minor versions, I have seen issues between minor versions. Version 0.90.8
> did contain an upgrade of Lucene (4.6), but that does not look like it
> would cause your issue. You could look at the github issues tagged
> 0.90.[8-9] and see if something applies in your case.
>
> A couple of points about upgrading:
>
> If you want to use the double-the-nodes technique (which should not
> be necessary for minor version upgrades), you could "decommission" a node
> using the Shard API. Here is a good writeup:
> http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/
>
> Since you doubled the amount of nodes in the cluster,
> the minimum_master_nodes setting would be temporarily incorrect and
> potential split-brain clusters might occur. In fact, it might have 
> occurred
> in your case since the cluster state seems incorrect. Merely 
> hypothesizing.
>
> Cheers,
>
> Ivan
>
>
> On Tue, Jan 7, 2014 at 9:26 AM, Jenny Sivapalan <
> jennifer@gmail.com> wrote:
>
>> Hello,
>>
>> We've upgraded Elastic Search twice over the last month and have
>> experienced downtime (roughly 8 minutes) during the roll out. I'm not 
>> sure
>> if it something we are doing wrong or not.
>>
>> We use EC2 instances for our Elastic Search cluster and cloud
>> formation to manage our stack. When we deploy a new version or change to
>> Elastic Search we upload the new artefact, double the number of EC2
>> instances and wait for the new instances to join the cluster.
>>
>> For example, 6 nodes form a cluster on v0.90.7. We upload the 0.90.9
>> version via our deployment process and double the number of nodes in the
>> cluster (12). The 6 new nodes will join the cluster with the 0.90.9
>> version.
>>
>> We then want to remove each of the 0.90.7 nodes. We do this by
>> shutting down the node (using the plugin head), wait for the cluster to
>> rebalance the shards and then terminate the EC2 instances. Then repeat 
>> with
>> the next node. We leave the master node until last so that it does the
>> re-election just once.
>>
>> The issue we have found in the last two upgrades is that while the
>> penultimate node is shutting down the master starts throwing errors and 
>> the
>> cluster goes red. To fix this we've stopped the Elastic Search process on
>> master and have had to restart each of the other nodes (though perhaps 
>> they
>> would have rebalanced themselves in a longer time period?). We find that
>> we send an increased number of error responses to our clients during this time.
>>
>> We've set out queue size for sear

Behavior of missing and exists with nested objects

2014-01-09 Thread Nathan
I am trying to figure out how missing/exists works with nested objects, and 
what I am seeing is very strange.   Here is a gist to demonstrate: 
 https://gist.github.com/nathanmoon/8344115

I have a nested object, ratings, with a mapping like:

"ratings" : {
  "type" : "nested",
  "properties" : {
    "rater_username" : {
      "type" : "string",
      "index" : "not_analyzed"
    },
    "rating" : {
      "type" : "integer",
      "index" : "not_analyzed"
    }
  }
}


My data looks like this:

curl -XPOST "http://localhost:9200/nestedfilters/item/" -d '
{
  "description" : "Rated by user1",
  "ratings" : [{
    "rater_username" : "user1",
    "rating" : 10
  }]
}
'

curl -XPOST "http://localhost:9200/nestedfilters/item/" -d '
{
  "description" : "Rated but missing username",
  "ratings" : [{
    "rating" : 10
  }]
}
'

curl -XPOST "http://localhost:9200/nestedfilters/item/" -d '
{
  "description" : "Rated by empty set",
  "ratings" : []
}
'

curl -XPOST "http://localhost:9200/nestedfilters/item/" -d '
{
  "description" : "Rated by nobody"
}
'


Here is what I'm getting with various filters (sorry for the formatting):

-

FILTER:

"filter" : {
"missing" : {
"field" : "ratings"
}
}

WHAT I WOULD EXPECT:

"Rated by empty set"
"Rated by nobody"

IT RETURNS:

"Rated by user1"
"Rated but missing username"
"Rated by empty set"


---


FILTER:

"filter" : {
"nested" : {
"path" : "ratings",
"filter" : {
"missing" : {
"field" : "ratings.rating"
}
}
}
}

WHAT I WOULD EXPECT:

"Rated by empty set"
"Rated by nobody"

IT RETURNS:

[empty set]

-

FILTER:

"filter" : {
"not" : {
"nested" : {
"path" : "ratings",
"filter" : {
"exists" : {
"field" : "ratings.rating"
}
}
}
}
}

WHAT I WOULD EXPECT:

"Rated by empty set"
"Rated by nobody"

IT RETURNS:

"Rated by empty set"

-

FILTER:

"filter" : {
"not" : {
"nested" : {
"path" : "ratings",
"filter" : {
"exists" : {
"field" : "ratings.rater_username"
}
}
}
}
}

WHAT I WOULD EXPECT:

"Rated but missing username"
"Rated by empty set"
"Rated by nobody"

IT RETURNS:

"Rated but missing username"
"Rated by empty set"

-


Can anyone explain what I am seeing, and what is the best way to index and 
query for 'missing' nested fields?  Thank you!

Nathan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/eee98195-312f-4722-81ae-4c8c39e1f029%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: logstash vs rivers for reading data from SQL Server

2014-01-09 Thread jsp
Currently logstash provides an input to get data through sqlite. I have
been looking at how it is used for sqlite and will try to replicate
that for SQL Server.
Will post code if I get it working.
Thanks
--J

On Thursday, 9 January 2014 00:23:36 UTC-8, Alexander Reelsen wrote:
>
> Hey,
>
> maybe you should ask your developers why they recommended logstash for
> this; I can't follow here (perhaps there is some export functionality in
> your SQL server, which a logstash input can use?). I would be interested
> in the reasons in this case.
>
>
> --Alex
>
>
> On Wed, Jan 8, 2014 at 5:26 PM, jsp >wrote:
>
>> Hi,
>> I am looking at implementing ES to index & query data that I get from my 
>> SQL Server databases/tables.  I was initially using river to read data from 
>> Sql server tables but one of the developers in my team recommended looking 
>> at using logstash. Can anyone comment on any benefits of using one over
>> another? I have not been able to find any documentation regarding reading
>> data from SQL Server using logstash.
>> Can someone point me to a guide on how to get started with logstash & SQL
>> Server.
>> Thanks
>> J
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/1173b62c-afd3-4d2d-9a3f-ba423ed7ede4%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9a0f36d8-1868-4608-9c05-558bff9b48ac%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Setup parent/child using rivers

2014-01-09 Thread jsp


I am currently reading data from a SQL Server database/table using the
jdbc-river. As of now I have created a type for each table in my database.
As the next step in my implementation I would like to use parent/child types
so that I can translate the relationships between my SQL tables and store them.

Table1
Col_id| name| prop1|prop2|prop3

child_table1
col_id| table_id| child_prop1|child_prop2|child_prop3


curl -XPUT 'localhost:9200/_river/parent/_meta' -d '{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.mysql.jdbc.Driver",
    "url" : "jdbc:mysql://localhost:3306/test",
    "user" : "",
    "password" : "",
    "sql" : "select * from table1",
    "index" : "index1",
    "type" : "parent"
  }
}'

curl -XPUT 'localhost:9200/_river/child/_meta' -d '{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.mysql.jdbc.Driver",
    "url" : "jdbc:mysql://localhost:3306/test",
    "user" : "",
    "password" : "",
    "sql" : "select * from child_table1",
    "index" : "index1",
    "type" : "child"
  }
}'

curl -XPOST 'localhost:9200/_river/child/_mapping' -d '{
  "child":{
    "_parent": {"type": "parent"}
  }
}'

I would like to store my data in the following format

{
  "id": "1",
  "name": "name1",
  "prop1": "data",
  "prop2": "data",
  "prop3": "data",
  "child": [
    {
      "child_prop1": "data",
      "child_prop2": "data",
      "child_prop3": "data"
    },
    {
      "child_prop1": "data1",
      "child_prop2": "data1",
      "child_prop3": "data1"
    }
  ]
}

Can anyone comment on how I can use jdbc-rivers to store my data as 
parent/child types for the above scenario?

thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/de9ede70-45f2-45d5-bdfb-143d95852262%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Cluster not able to keep up?

2014-01-09 Thread Steve Mayzak
I'd like to help on this if I can, but as you noted, adding more nodes 
smoothed things out.

First, a couple of clarifications based on what others have said.

1.  I'll reiterate that 1024 is too high for the threadpool setting; 
Elasticsearch comes OOTB with sensible defaults that, in our experience, 
shouldn't be changed in most cases.
2.  The reason merges vary in time is that we throttle them as part of our 
default settings.  Elasticsearch is designed to be a good citizen, and no 
one part of the system should overload/starve another.  See here for 
details: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html
3.  A shard is built to take advantage of multiple cores out of the box for 
indexing and querying (it is highly concurrent), so I wouldn't worry about 
trying to correlate the number of cores with the number of shards in an index.

Without seeing the monitoring metrics while your cluster is steady and then 
sporadic, it's not as easy to troubleshoot.  If you have time to share some 
of those stats over time (CPU, memory, disk IO, network, and JVM-related 
metrics) we could help you further.

On Monday, January 6, 2014 9:15:45 AM UTC-8, tdjb wrote:
>
> Otis, our index rate actually stays very flat until the issue occurs and 
> then after that the index rate just goes up and down, it never grows at a 
> steady rate or anything, it just jumps all over the place.
>
> James, thank you for the suggestion. I've been gathering info on the 
> dirty_background_ratio topic and will start looking at our systems to see 
> if I can find anything that indicates that might be the issue.
>
> On Friday, January 3, 2014 8:16:19 PM UTC-7, Otis Gospodnetic wrote:
>>
>> Hi,
>>
>> I bet it's Lucene segment merges.  You have more machines so you can 
>> sustain high input rates longer, but I bet you'll hit the moment when the 
>> indexing rate drops again.
>>
>> Check this graph:
>> https://apps.sematext.com/spm-reports/s/eUgWhPqZrg
>> (just look at the last big "tooth")
>> Or instead of looking at the # of docs growing slower and slower (while 
>> the input rate remains the same, like in our case), look at the indexing 
>> rate graph:
>> https://apps.sematext.com/spm-reports/s/MZbHLRt4qY
>> (again, just look at the last big "tooth")
>>
>> Does your indexing rate look the same?
>>
>> If so, look at your disk IO.  Here is the disk IO for the same cluster as 
>> above:
>> https://apps.sematext.com/spm-reports/s/sHahBnvoUw
>>
>> Those reads you see there that, I believe, is due to Lucene segment 
>> merges.
>>
>> Hm, in your case you said there is no waiting in the system above 
>> there is.
>>
>> Btw. you have 32 CPU cores on each server and only 10 shards and 1 
>> replica?  You could try more shards then to keep all your CPUs busy.
>>
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>> On Friday, January 3, 2014 5:34:34 PM UTC-5, tdjb wrote:
>>>
>>> So we were able to secure some temporary machines to increase the 
>>> cluster and that seemed to fix the issue. We set the bulk threads back to 
>>> the default, added more machines and now seem to be able to handle ~40k a 
>>> second in a stable manner for a long period of time.
>>>
>>> On Friday, January 3, 2014 1:56:24 PM UTC-7, Jörg Prante wrote:

 This all sounds good from a far-distance view. A wild guess is that, 
 since 8-10mb a sec is pretty good (I never saw more on the systems I have 
 here, also with gigabit), there are some internal limits active which might 
 need some tuning (maybe index throttling or thread pools?), or adding nodes 
 could be the next step...

 ES does not create CPU load in a constant manner; it is more cyclic. It 
 has to start the Lucene mergers in the background, and when they kick in, 
 the indexing and searching performance is affected for seconds and minutes, 
 mostly depending on the IO throughput capacity of the disk subsystem. So 
 without seeing live monitoring data, it is hard to make educated guesses if 
 it's the merging effect, or if something else is going on.

 Maybe that is the point when you should ask the ES core team for 
 professional advice on how to stabilize maximum performance over time.

 Jörg



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ee89f7b7-0f94-4532-bc9a-f268eeb2a59b%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


[Ann] Knapsack plugin 0.90.9.1 for Elasticsearch 0.90.9

2014-01-09 Thread joergpra...@gmail.com
Hi,

a new Knapsack plugin version 0.90.9.1 is released:

https://github.com/jprante/elasticsearch-knapsack/releases/tag/0.90.9.1


Changes:

- more archive support (ZIP, TAR, CPIO) and compression codecs (gzip,
bzip2, lzf, xz)

- ES queries can be used to select content for archiving

- 'target' parameter renamed to 'path'

- new 'map' parameter for mapping index names and index/type names

- support for document meta fields (_parent, _routing, _version,
_timestamp, _source)

- direct copy to local or remote cluster (endpoints _export/copy,
_import/copy)

- optional AWS S3 support (endpoints _export/s3, _import/s3)

- archive entry names have four components: index/type/id/fieldname, where
fieldname can contain arbitray (stored) fields, not only _source

Project docs:

http://jprante.github.io/elasticsearch-knapsack

Binaries:

https://bintray.com/pkg/show/general/jprante/elasticsearch-plugins/elasticsearch-knapsack

Best,

Jörg

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHHayL8GaBzXzNH7A%2ByhQTvOjPOHDv-o7SWSSyc-5eSRA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Using NOT in a nested filter

2014-01-09 Thread Nathan Moon
Oh right.  That should have been obvious.  It seems to be working great that 
way.  Thanks!

Nathan

On Jan 9, 2014, at 1:10 PM, Sloan Ahrens  wrote:

> You were close. You just had the "nested" and "not" filters in the wrong
> order, basically. 
> 
> Your (first) query says "return items that have a rating with
> 'ratings.rater_username' not equal to 'user1'". And so you get the first
> item, since it meets that requirement. 
> 
> What you really want to say is "return items for which all ratings have
> 'ratings.rater_username' not equal to 'user1'". Here is the query you want:
> 
> curl -XPOST "http://localhost:9200/nestedfilters/item/_search" -d'
> {
>   "query": {
>  "match_all": {}
>   },
>   "filter": {
>  "not": {
> "nested": {
>"path": "ratings",
>"filter": {
>   "term": {
>  "ratings.rater_username": "user1"
>   }
>}
> }
>  }
>   }
> }'
> 
> Here is a runnable example you can play with (you will need ES installed and
> running at localhost:9200, or supply another endpoint):
> http://sense.qbox.io/gist/289ceb80480db8b6574d5f879358e50c97aaf5da
> 
> 
> 
> 
> -
> Co-Founder and CTO, StackSearch, Inc. 
> Hosted Elasticsearch at http://qbox.io
> --
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Using-NOT-in-a-nested-filter-tp4047349p4047353.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
> 
> -- 
> You received this message because you are subscribed to a topic in the Google 
> Groups "elasticsearch" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/elasticsearch/7yWbMCYmAFw/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/1389298238074-4047353.post%40n3.nabble.com.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/128FBB57-B971-4D1C-A3A6-E4F5A3F2BC3D%40gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Using NOT in a nested filter

2014-01-09 Thread Sloan Ahrens
You were close. You just had the "nested" and "not" filters in the wrong
order, basically. 

Your (first) query says "return items that have a rating with
'ratings.rater_username' not equal to 'user1'". And so you get the first
item, since it meets that requirement. 

What you really want to say is "return items for which all ratings have
'ratings.rater_username' not equal to 'user1'". Here is the query you want:

curl -XPOST "http://localhost:9200/nestedfilters/item/_search" -d'
{
   "query": {
  "match_all": {}
   },
   "filter": {
  "not": {
 "nested": {
"path": "ratings",
"filter": {
   "term": {
  "ratings.rater_username": "user1"
   }
}
 }
  }
   }
}'

Here is a runnable example you can play with (you will need ES installed and
running at localhost:9200, or supply another endpoint):
http://sense.qbox.io/gist/289ceb80480db8b6574d5f879358e50c97aaf5da




-
Co-Founder and CTO, StackSearch, Inc. 
Hosted Elasticsearch at http://qbox.io
--
View this message in context: 
http://elasticsearch-users.115913.n3.nabble.com/Using-NOT-in-a-nested-filter-tp4047349p4047353.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/1389298238074-4047353.post%40n3.nabble.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Restarting an active node without needing to recover all data remotely.

2014-01-09 Thread Ankush Jhalani
That would be very nice to have, thanks for the update.


On Thu, Jan 9, 2014 at 2:05 PM, Zachary Tong  wrote:

> Just wanted to add a quick note: long recovery times (due to divergence of
> shards between primary/replica) are an issue that we will be addressing.
>  No ETA as of yet, but something that is on the roadmap. :)
>
> -Zach
>
>
>
>
> On Wednesday, December 4, 2013 7:48:04 PM UTC-5, Greg Brown wrote:
>>
>> Thanks for the many responses, they were very helpful.
>>
>> For posterity, I wrote up a more detailed post of how we are managing
>> restart times for our cluster: http://gibrown.wordpress.com/2013/12/05/
>> managing-elasticsearch-cluster-restart-time/
>>
>> -Greg
>>
>>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "elasticsearch" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/elasticsearch/9uF-a5vqfkQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/c0336689-f647-4b60-837e-af8c2af6a9dc%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAE779yBvAuhMOXR47J6kFgqg4MLr0c8nJ1psxKKJXXGXwyryZA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Restarting an active node without needing to recover all data remotely.

2014-01-09 Thread Zachary Tong
Just wanted to add a quick note: long recovery times (due to divergence of 
shards between primary/replica) are an issue that we will be addressing. 
 No ETA as of yet, but something that is on the roadmap. :)

-Zach



On Wednesday, December 4, 2013 7:48:04 PM UTC-5, Greg Brown wrote:
>
> Thanks for the many responses, they were very helpful.
>
> For posterity, I wrote up a more detailed post of how we are managing 
> restart times for our cluster: 
> http://gibrown.wordpress.com/2013/12/05/managing-elasticsearch-cluster-restart-time/
>
> -Greg
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c0336689-f647-4b60-837e-af8c2af6a9dc%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Using NOT in a nested filter

2014-01-09 Thread Nathan
I am having trouble with a filter.  I have items in my index, with nested 
"ratings"

curl -XPOST "http://localhost:9200/nestedfilters/item/_mapping" -d '
{
  "item" : {
    "properties" : {
      "description" : {
        "type" : "string"
      },
      "ratings" : {
        "type" : "nested",
        "properties" : {
          "rater_username" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "rating" : {
            "type" : "integer",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}
'

I want to be able to find items where a certain user has not rated the 
item. I have tried using NOT, but it finds anything rated by anybody else, 
regardless of whether the specific user has rated it.  I can't seem to 
figure out how to use a MISSING filter either. Here is what I have tried:

curl -XPOST "http://localhost:9200/nestedfilters/item/_search?pretty=true" -d '
{
  "query" : {
    "match_all" : {}
  },
  "filter" : {
    "nested" : {
      "path" : "ratings",
      "filter" : {
        "not" : {
          "term" : {
            "ratings.rater_username" : "user1"
          }
        }
      }
    }
  }
}
'

and


curl -XPOST "http://localhost:9200/nestedfilters/item/_search?pretty=true" -d '
{
  "query" : {
    "match_all" : {}
  },
  "filter" : {
    "nested" : {
      "path" : "ratings",
      "filter" : {
        "and" : [{
          "term" : {
            "ratings.rater_username" : "user1"
          }
        },{
          "missing" : {
            "field" : "ratings.rating"
          }
        }]
      }
    }
  }
}
'

Here is the gist with a full example: 
 https://gist.github.com/nathanmoon/8339950.

Is there another way I haven't thought of to craft a filter like this?  Or 
do I need to index my data differently to support this type of filtering? 
 Thanks for any help!

Nathan

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/fc6ce18d-2923-4b87-b992-fc81a72c69a4%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Elasticsearch Missing Data

2014-01-09 Thread Eric Luellen
Alexander,

1. The only odd log entry was at 19:00 on 1/7/14, which was about 1 hr. 
before logs stopped. These logs are on the master and She-Hulk is the only 
other node.

[2014-01-07 19:00:02,947][DEBUG][indices.recovery ] [ElasticSearch 
Server1] [logstash-2014.01.08][0] recovery completed from 
[She-Hulk][_MtrVsSmQIaM-BErhEtg9w][inet[/10.1.11.111:9300]], took[333ms]
   phase1: recovered_files [1] with total_size of [71b], took [68ms], 
throttling_wait [0s]
 : reusing_files   [0] with total_size of [0b]
   phase2: start took [13ms]
 : recovered [17] transaction log operations, took [12ms]
   phase3: recovered [0] transaction log operations, took [164ms]
[2014-01-07 19:00:03,375][DEBUG][indices.recovery ] [ElasticSearch 
Server1] [logstash-2014.01.08][2] recovery completed from 
[She-Hulk][_MtrVsSmQIaM-BErhEtg9w][inet[/10.1.11.111:9300]], took[502ms]
   phase1: recovered_files [1] with total_size of [71b], took [30ms], 
throttling_wait [0s]
 : reusing_files   [0] with total_size of [0b]
   phase2: start took [6ms]
 : recovered [6] transaction log operations, took [38ms]
   phase3: recovered [13] transaction log operations, took [20ms]
[2014-01-07 19:00:06,898][INFO ][cluster.metadata ] [ElasticSearch 
Server1] [logstash-2014.01.08] update_mapping [logs] (dynamic)

Also, on She-Hulk I got an error stating that the master_left at 20:52 
because it wasn't pingable, but not sure why.

2. I am not sure. I was thinking that the shard should still be there, just 
unassigned, and that once the node came back up it would start processing it.
3. On both my master and my secondary, the config is in 
/etc/elasticsearch/elasticsearch.yml and it is run by 
/etc/init.d/elasticsearch. On the master it works fine and sets the 
correct node name, cluster name, data directory, etc. It is an identical 
setup on the secondary, but it only picks up the cluster name; everything 
else defaults to some other location. On the secondary, the only data 
location is /var/lib/elasticsearch/node-name. In the config I tell it to go 
to /etc/elasticsearch/data. On the master it is in the correct location of 
/etc/elasticsearch/data.

So overall, I guess the first issue was something weird happening to my 
server, and there is not much I can do about that. I'm more interested in 
the third question now, since I still don't know why it's not reading that 
full config file, though it obviously reads part of it since the node joins 
my cluster.




On Thursday, January 9, 2014 3:30:40 AM UTC-5, Alexander Reelsen wrote:
>
> Hey,
>
> a couple of things:
>
> 1. Did you check the log files? Most likely in /var/log/elasticsearch if 
> you use the packages. Is there anything suspicious at the time of your 
> outage? Please check your master node as well, if you have one (not sure if 
> it is a master or client node from the cluster health).
> 2. Why should elasticsearch pull your data? Any special configuration you 
> didn't mention? Or what exactly do you mean here?
> 3. Happy to debug your issue with the init script. The elasticsearch.yml 
> file should be in /etc/elasticsearch/ and not in /etc - anything manually 
> moved around? Can you still reproduce it?
>
>
> --Alex
>
>
>
>
> On Wed, Jan 8, 2014 at 8:10 PM, Eric Luellen wrote:
>
>> Hello,
>>
>> I've had my elasticsearch instance running for about a week with no 
>> issues, but last night it stopped working. When I went to look in Kibana, 
>> it stops logging around 20:45 on 1/7/14. I then restarted the service on 
>> both elasticsearch servers and it started logging again and pulled back 
>> some logs from 07:10 that morning, even though I restarted the 
>> service around 10:00. So my questions are:
>>
>> 1. Why did it stop working? I don't see any obvious errors.
>> 2. When I restarted it, why didn't it go back and pull all of the data 
>> and not just some of it? I see that there are no unassigned shards.
>>
>> curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
>> {
>>   "cluster_name" : "my-elasticsearch",
>>   "status" : "green",
>>   "timed_out" : false,
>>   "number_of_nodes" : 3,
>>   "number_of_data_nodes" : 2,
>>   "active_primary_shards" : 40,
>>   "active_shards" : 80,
>>   "relocating_shards" : 0,
>>   "initializing_shards" : 0,
>>   "unassigned_shards" : 0
>>
>> Are there any additional queries or logs I can look at to see what is 
>> going on? 
>>
>> On a slight side note, when I restarted my 2nd elasticsearch server it 
>> isn't reading from the /etc/elasticsearch.yml file like it should. It isn't 
>> creating the node name correctly or putting the data files in the spot I 
>> have configured. I'm using CentOS and doing everything via 
>> /etc/init.d/elasticsearch on both servers and the elasticsearch1 server 
>> reads everything correctly but elasticsearch2 does not.
>>
>> Thanks for your help.
>> Eric
>>

Percolate Query using _size matches no documents

2014-01-09 Thread Ryan Small
Is filtering on the _size field allowed on percolate requests? 

Adding a _size range to a percolate query, in either a filter or the query 
section, causes it to match no documents. 

Walking through reproducing the problem: 

Register the query: 

curl -XPUT 'http://localhost:9200/_percolator/test_index/queryNamedSue' -d 
'{ 
  "query" : { 
    "constant_score": { 
      "filter": { 
        "and": [ { 
          "query" : { 
            "query_string" : { 
              "query" : "batman", 
              "default_field" : "all" 
            } 
          } 
        }, { 
          "bool" : { 
            "must" : [ { 
              "term" : { "ni.language" : "en" } 
            } ] 
          } 
        } ] 
      } 
    } 
  } 
}' 

Then percolate a document (aside: 'ni.content' field in our mapping is a 
default search field): 

curl -XGET 'http://localhost:9200/test_index/testType/_percolate' -d '{ 
"doc": { 
  "ni": { 
"content": "So Batman walks into a bar", 
"language": "en" 
  } 
} 
}' 

Which results in 

{"ok":true,"matches":["queryNamedSue"]} 

Now if the query is changed to include a _size range, 

curl -XPUT 'http://localhost:9200/_percolator/test_index/queryNamedSue' -d 
'{ 
  "query" : { 
"constant_score": { 
  "filter": { 
"and": [ { 
  "query" : { 
"query_string" : { 
  "query" : "batman", 
  "default_field" : "all" 
} 
  } 
}, { 
  "bool" : { 
"must" : [ { 
   "term" : { "ni.language" : "en" } 
}, { 
  "range" : { 
"_size" : { 
  "from" : 0, 
  "to" : 1, 
  "include_lower" : true, 
  "include_upper" : true 
} 
  } 
} ] 
  } 
} ] 
  } 
} 
  } 
}' 

Percolating the same document yields 

{"ok":true,"matches":[]} 

I have researched and found that percolating with a mapping that enables 
and stores _size was failing over a year ago, but this issue was patched:
https://github.com/elasticsearch/elasticsearch/pull/2353. 

We set a default template that, for all types in the mapping, enables _size 
and sets it to be stored.  Our percolator node uses the following 
configuration: 

Index Settings: 

index.number_of_shards: variesBasedOnWorkload 
index.number_of_replicas: 0 
index.auto_expand_replicas: "false" 
index.dynamic: "true" 
index.mapper.dynamic: "true" 
index.store.compress.stored: "true" 
index.store.compress.tv: "true" 
index.term_index_divisor: "4" 
index.merge.scheduler.max_thread_count: 1 

Node Settings: 

cache.memory.direct: "false" 
http.enabled: "false" 
gateway.type: "none" 
index.store.type: "memory" 
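
Worth double-checking as well: the _size range has nothing to match against 
unless _size is enabled (and stored) in the mapping of the index being 
percolated against. Roughly (type name taken from the thread, syntax as I 
understand the 0.90-era mapping API):

```json
{
  "testType" : {
    "_size" : { "enabled" : true, "store" : "yes" }
  }
}
```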

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f1f339b5-3419-4e91-a962-4875eb7def78%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Kibana3 - terms panel with range facet chart

2014-01-09 Thread Erik Paulsson
Hey all,

I just started using ElasticSearch with LogStash and Kibana.  I'm able to 
extract fields from my log statements using logstash/grok.  In Kibana I 
have taken some of these fields and created stats panels using them for 
stats like total/mean/min/max which works great for just seeing a 
calculated number value quickly.
What I would like to do next is create a bar chart that can display the 
count of occurrences for my extracted field within different ranges.  So 
say my field is called "upload_size", I would like to create a pie chart 
that displays the count of files uploaded within defined ranges.
For example I would like to see counts of "upload_size" fields with values 
in these ranges: 0-10KB, 10KB-100KB, 100KB-1MB, 1MB-10MB, 10MB-100MB, 
100MB-1GB, 1GB+ and plotted in a pie chart.
I've experimented with the "terms" panel creating a pie chart but don't 
see a way to define ranges.  It seems this would be possible using 
ElasticSearch "range 
facets": 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-range-facet.html

Is it possible to do this currently in Kibana3?  If not, is this on the 
roadmap?  I am using Kibana3 milestone 4.
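
For what it's worth, the raw Elasticsearch range facet for those buckets 
(field name taken from above; the byte thresholds are my assumption of 
KB/MB/GB as powers of 1024) would look roughly like:

```json
{
  "facets" : {
    "upload_size_ranges" : {
      "range" : {
        "field" : "upload_size",
        "ranges" : [
          { "to" : 10240 },
          { "from" : 10240,      "to" : 102400 },
          { "from" : 102400,     "to" : 1048576 },
          { "from" : 1048576,    "to" : 10485760 },
          { "from" : 10485760,   "to" : 104857600 },
          { "from" : 104857600,  "to" : 1073741824 },
          { "from" : 1073741824 }
        ]
      }
    }
  }
}
```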

Thanks,
Erik



Re: Corrupt index creation when elasticsearch is killed just after index is created

2014-01-09 Thread InquiringMind
*Never, never, never* kill -9 and expect any application to properly and 
cleanly shut down. Never.

The -9 signal cannot be caught by the process to which it is directed. The 
process is ended in the middle for whatever it is doing.

Issue a normal kill, and then ES (via the JVM) will have a chance to finish 
up whatever it is working on, and then shut down cleanly.
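
Concretely (the pid-file path is an assumption based on typical RPM 
installs; adjust for yours):

```sh
# SIGTERM (the default signal) can be caught: the JVM runs its shutdown
# hooks and Elasticsearch flushes and closes cleanly.
kill $(cat /var/run/elasticsearch/elasticsearch.pid)

# SIGKILL cannot be caught: the process dies in the middle of whatever it
# is doing, which is how partially written index files appear.
kill -9 $(cat /var/run/elasticsearch/elasticsearch.pid)   # avoid
```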

Brian

On Wednesday, December 25, 2013 11:36:32 PM UTC-5, tarang dawer wrote:
>
> I have reliably recreated this many times; it happens while creating an index
> on a single node (default 5 shards). I have set "action.auto_create_index:
> false", "discovery.zen.ping.multicast.enabled: false" & "node.master=true",
> and I am creating indices via the Java API. I kill (*kill -9*) the
> elasticsearch process immediately after the index is created.
>



Re: Searching indexed fields without analysing

2014-01-09 Thread InquiringMind
Chris,

I updated one of my tests to reproduce your issue. My text field is a 
multi-field where *text.na* is the text field without any analysis at all.

This Lucene query does not find anything at all:

{
  "bool" : {
"must" : {
  "query_string" : {
"query" : "text.na:Immortal-Li*"
  }
}
  }
}

But this one works fine:

{
  "bool" : {
"must" : {
  "prefix" : {
"text.na" : {
  "prefix" : "Immortal-Li"
}
  }
}
  }
}

And returns the two documents that I expected:

{ "_index" : "mortal" , "_type" : "elf" , "_id" : "1" , "_version" : 1 , 
"_score" : 1.0 , "_source" :
   { "cn" : "Celeborn" , "text" : "Immortal-Lives forever" } }

{ "_index" : "mortal" , "_type" : "elf" , "_id" : "2" , "_version" : 1 , 
"_score" : 1.0 , "_source" :
   { "cn" : "Galadriel" , "text" : "Immortal-Lives forever" } }

Note that in both cases, the query's case must match since the field value 
is not analyzed at all.

I'm not sure if this is a true bug. In general, I find Lucene syntax 
somewhat useful for ad-hoc queries, and I find their so-called Simple Query 
Parser syntax to be completely unable to find anything when there is no 
_all field, whether or not I specify a default field. (But that's another 
issue I'm going to ask about in the near future.)

Brian

On Thursday, January 9, 2014 8:27:04 AM UTC-5, Chris H wrote:
>
> Hi, Jun.
>
> That doesn't seem to work.  For a user with the username bob.smith-jones:
>
>- bob.smith-jones -> matches
>- bob.smith- -> matches
>- bob.smi* -> matches
>- bob.smith-j* -> no results
>- bob.smith\-j* -> no results
>
> Also, a "$" isn't one of the special characters.
>
> Thanks.
>



Re: Upgrades causing Elastic Search downtime

2014-01-09 Thread Ivan Brusic
Perhaps I am missing some functionality since I am still on version 0.90.2,
but wouldn't you have to disable/enable allocation after each server
restart during a rolling upgrade? A restarted node will not host any shards
with allocation disabled.
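
For reference, on 0.90.x that toggle around each node restart goes through 
the cluster update settings API, roughly as follows (host assumed to be 
localhost:9200):

```sh
# Disable shard allocation before stopping the node...
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : true }
}'

# ...restart/upgrade the node, wait for it to rejoin, then re-enable:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : { "cluster.routing.allocation.disable_allocation" : false }
}'
```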

Cheers,

Ivan


On Wed, Jan 8, 2014 at 5:48 PM, Mark Walkom wrote:

> Disabling allocation is definitely a temporary-only change; you can set it
> back once your upgrades are done.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 9 January 2014 02:47, Jenny Sivapalan wrote:
>
>> Thanks both for the replies. Our rebalance process doesn't take too long
>> (~5 mins per node). I had some of the plugins (head, paramedic, bigdesk)
>> open as I was closing down the old nodes and didn't see any split brain
>> issue although I agree we can lead ourselves down this route by doubling
>> the instances. We want our cluster to rebalance as we bring nodes in and
>> out so disabling is not going to work for us unless I'm misunderstanding?
>>
>>
>> On Tuesday, 7 January 2014 22:16:46 UTC, Mark Walkom wrote:
>>>
>>> You can also use cluster.routing.allocation.disable_allocation to
>>> reduce the need of waiting for things to rebalance.
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 8 January 2014 04:41, Ivan Brusic  wrote:
>>>
 Although elasticsearch should support clusters of nodes with different
 minor versions, I have seen issues between minor versions. Version 0.90.8
 did contain an upgrade of Lucene (4.6), but that does not look like it
 would cause your issue. You could look at the github issues tagged
 0.90.[8-9] and see if something applies in your case.

 A couple of points about upgrading:

 If you want to use the double-the-nodes techniques (which should not be
 necessary for minor version upgrades), you could "decommission" a node
 using the Shard API. Here is a good writeup:
 http://blog.sematext.com/2012/05/29/elasticsearch-shard-placement-control/

 Since you doubled the amount of nodes in the cluster,
 the minimum_master_nodes setting would be temporarily incorrect and
 potential split-brain clusters might occur. In fact, it might have occurred
 in your case since the cluster state seems incorrect. Merely hypothesizing.

 Cheers,

 Ivan


 On Tue, Jan 7, 2014 at 9:26 AM, Jenny Sivapalan >>> > wrote:

> Hello,
>
> We've upgraded Elastic Search twice over the last month and have
> experienced downtime (roughly 8 minutes) during the roll out. I'm not sure
> if it is something we are doing wrong or not.
>
> We use EC2 instances for our Elastic Search cluster and cloud
> formation to manage our stack. When we deploy a new version or change to
> Elastic Search we upload the new artefact, double the number of EC2
> instances and wait for the new instances to join the cluster.
>
> For example 6 nodes form a cluster on v 0.90.7. We upload the 0.90.9
> version via our deployment process and double the number nodes for the
> cluster (12). The 6 new nodes will join the cluster with the 0.90.9
> version.
>
> We then want to remove each of the 0.90.7 nodes. We do this by
> shutting down the node (using the plugin head), wait for the cluster to
> rebalance the shards and then terminate the EC2 instances. Then repeat 
> with
> the next node. We leave the master node until last so that it does the
> re-election just once.
>
> The issue we have found in the last two upgrades is that while the
> penultimate node is shutting down the master starts throwing errors and 
> the
> cluster goes red. To fix this we've stopped the Elastic Search process on
> master and have had to restart each of the other nodes (though perhaps 
> they
> would have rebalanced themselves in a longer time period?). We find
> that we send an increased number of error responses to our clients during this time.
>
> We've set out queue size for search to 300 and we start to see the
> queue gets full:
>at java.lang.Thread.run(Thread.java:724)
> 2014-01-07 15:58:55,508 DEBUG action.search.type[Matt Murdock]
> [92036651] Failed to execute fetch phase
> org.elasticsearch.common.util.concurrent.EsRejectedExecutionException:
> rejected execution (queue capacity 300) on org.elasticsearch.action.
> search.type.TransportSearchQueryThenFetchAction$AsyncAction$2@23f1bc3
> at org.elasticsearch.common.util.concurrent.EsAbortPolicy.
> rejectedExecution(EsAbortPolicy.java:61)
> at java.util.concurrent.ThreadPoolExecutor.reject(
> ThreadPoolExecutor.java:821)
>
>
> But also we see the following error which we'v

Elasticsearch 0.90 installation with .rpm and logging

2014-01-09 Thread Srividhya Umashanker
All - 

I downloaded the latest ES rpm and installed it on CentOS. When I tried 
running the script in the foreground, I see that logging is not enabled.

I tried both /etc/init.d/elasticsearch -f and 
/usr/share/elasticsearch/bin/elasticsearch -f. 

log4j:WARN No appenders could be found for logger (node).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for 
more info.


I searched for answers on the internet and tried the following as well:

I tried adding export ES_CLASSPATH=/etc/elasticsearch/logging.yml, and 
also changed path.conf and path.data in the elasticsearch.yml file.

But nothing seems to fix the logging problem. Is there an easier solution 
to this problem? 
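
One thing worth trying (the paths below are the RPM defaults, so verify 
them for your install): logging.yml is resolved relative to es.path.conf 
rather than via ES_CLASSPATH, so point the process at the config directory 
explicitly:

```sh
# logging.yml lives in /etc/elasticsearch on RPM installs
ES_JAVA_OPTS="-Des.path.conf=/etc/elasticsearch" \
  /usr/share/elasticsearch/bin/elasticsearch -f
```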

-vidhya



Re: Is there a kind of query/rescore/similarity magic that lets me know if all the terms in a field are matched?

2014-01-09 Thread InquiringMind
Nik,

No, there is not.

There's a work-around in which the number of terms in a field can be stored 
in another field during indexing time. And then you can analyze your query 
string to count the number of terms, and then use that count to match 
against the documents that have the same count. But consider the following 
field:

"text" : "Very Big Dog"

Three terms in the field's value, right?

And consider the query:

"+text:very +text:very +text:very"

As in:

{
  "bool" : {
"must" : [ {
  "match" : {
"text" : {
  "query" : "very",
  "type" : "boolean"
}
  }
}, {
  "match" : {
"text" : {
  "query" : "very",
  "type" : "boolean"
}
  }
}, {
  "match" : {
"text" : {
  "query" : "very",
  "type" : "boolean"
}
  }
} ]
  }
}

Three query terms, right?

But it will match the field, and the term counts will match, and therefore 
you will then be told that *Very Very Very* is a perfect match for *Very 
Big Dog*.

Oops!

This is a Lucene limitation. Probably not a really big deal; I only know of 
two search engines that can properly handle duplicate terms: Google's, and 
the one I wrote in my previous life. But it is something that would be a 
very nice and useful feature for Lucene. Since Lucene already knows the 
word positions, it can verify that each term matches a unique word position 
(which is what I did in mine).
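
The pitfall can be sketched in a few lines of plain Python (this is only an 
illustration of the counting logic described above, not Elasticsearch code):

```python
def count_only_match(query_terms, field_terms):
    # The work-around described above: index the field's term count in a
    # sibling field, then require every query term to appear in the field
    # AND the two counts to be equal.
    return (set(query_terms) <= set(field_terms)
            and len(query_terms) == len(field_terms))

field = ["very", "big", "dog"]          # "Very Big Dog", tokenized/lowercased

# Works for the honest case...
assert count_only_match(["very", "big", "dog"], field)

# ...but duplicate query terms slip through: "very very very" is reported
# as a perfect match for "Very Big Dog".
assert count_only_match(["very", "very", "very"], field)

def position_aware_match(query_terms, field_terms):
    # The fix sketched above: each query term must consume a distinct
    # word position in the field.
    remaining = list(field_terms)
    for term in query_terms:
        if term not in remaining:
            return False
        remaining.remove(term)
    return not remaining                # all positions consumed

assert not position_aware_match(["very", "very", "very"], field)
assert position_aware_match(["dog", "big", "very"], field)
```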

Brian


On Thursday, January 9, 2014 11:18:50 AM UTC-5, Nikolas Everett wrote:
>
> I'm looking to boost matches where all the terms in the field match 
> more than I'm getting out of the default similarity.  Is there some way to 
> ask Elasticsearch to do that?  I'm ok with only checking in some small 
> window of top documents or really anything other than a large performance 
> hit.  To be honest I haven't played too much with similarities so maybe 
> what I want is there.
>
> Thanks!
>
> Nik
>



Is there a kind of query/rescore/similarity magic that lets me know if all the terms in a field are matched?

2014-01-09 Thread Nikolas Everett
I'm looking to boost matches where all the terms in the field match
more than I'm getting out of the default similarity.  Is there some way to
ask Elasticsearch to do that?  I'm ok with only checking in some small
window of top documents or really anything other than a large performance
hit.  To be honest I haven't played too much with similarities so maybe
what I want is there.

Thanks!

Nik



Re: Filter and Query same taking some time

2014-01-09 Thread Matt Weber
Use a filtered query, not an outer filter.   You only want to use that
outer filter when you are faceting and don't want the filter to change the
facet counts.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html
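
Applied to the term query from this thread, the filtered form would be 
roughly:

```json
{
  "size" : 100,
  "query" : {
    "filtered" : {
      "query" : { "match_all" : { } },
      "filter" : { "term" : { "color" : "red" } }
    }
  },
  "version" : true
}
```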

Thanks,
Matt Weber


On Thu, Jan 9, 2014 at 1:13 AM, Arjit Gupta  wrote:

> I had 13 million documents, and with the same query
> I see filters performing worse than the query:
> filters are taking 400 ms whereas the query is taking 300 ms
>
> 1. Filter
>
> {
>   "size" : 100,
>   "query" : {
> "match_all" : { }
>   },
>   "filter" : {
> "bool" : {
>   "must" : {
> "term" : {
>   "color" : "red"
> }
>   }
> }
>   },
>   "version" : true
> }
>
>
> 2. Query
>
> {
>   "size" : 100,
>   "query" : {
> "bool" : {
>   "must" : {
> "match" : {
>   "color" : {
> "query" : "red",
> "type" : "boolean",
> "operator" : "AND"
>   }
> }
>   }
> }
>   },
>   "version" : true
> }
>
> Thanks ,
> Arjit
>
>
> On Thu, Jan 9, 2014 at 1:15 PM, David Pilato  wrote:
>
>> Yeah 10 documents is not that much!
>> Not sure if you can notice a difference here as probably everything could
>> be loaded in file system cache.
>>
>> --
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet  | 
>> @elasticsearchfr
>>
>>
>> Le 9 janvier 2014 at 08:43:13, Arjit Gupta 
>> (arjit...@gmail.com)
>> a écrit:
>>
>> I have 100,000 documents  which are similar. In response I am getting the
>> whole document, not just the ID.
>> I am executing the query multiple times.
>>
>> Thanks ,
>> Arjit
>>
>>
>> On Thu, Jan 9, 2014 at 1:06 PM, David Pilato  wrote:
>>
>>>  You probably won't see any difference the first time you execute it
>>> unless you are using warmers.
>>>  With a second query, you should see the difference.
>>>
>>>  How many documents you have in your dataset?
>>>
>>>  --
>>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>>> @dadoonet  | 
>>> @elasticsearchfr
>>>
>>>
>>> Le 9 janvier 2014 at 06:14:06, Arjit Gupta 
>>> (arjit...@gmail.com)
>>> a écrit:
>>>
>>>   Hi,
>>>
>>> I had implemented ES search query  for all our use cases but when i came
>>> to know that some of our use cases can be solved by filters I implemented
>>> that but I dont see any gain (in response time) in filters. My search
>>> queries  are
>>>
>>> 1. Filter
>>>
>>> {
>>>   "size" : 100,
>>>   "query" : {
>>> "match_all" : { }
>>>   },
>>>   "filter" : {
>>> "bool" : {
>>>   "must" : {
>>> "term" : {
>>>   "color" : "red"
>>> }
>>>   }
>>> }
>>>   },
>>>   "version" : true
>>> }
>>>
>>>
>>> 2. Query
>>>
>>> {
>>>   "size" : 100,
>>>   "query" : {
>>> "bool" : {
>>>   "must" : {
>>> "match" : {
>>>   "color" : {
>>> "query" : "red",
>>> "type" : "boolean",
>>> "operator" : "AND"
>>>   }
>>> }
>>>   }
>>> }
>>>   },
>>>   "version" : true
>>> }
>>>
>>> By default the term query should be cached but I dont see a performance
>>> gain.
>>> Do i need to change some parameter also  ?
>>> I am using ES  0.90.1 and with 16Gb of heap space given to ES.
>>>
>>> Thanks,
>>> Arjit
>>>  --
>>>  You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>>
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/326a6640-d887-46b4-a8e7-ec15a1c9dc98%40googlegroups.com
>>> .
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>  --
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "elasticsearch" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/elasticsearch/uknnBHMnZLk/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to
>>> elasticsearch+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/etPan.52ce519b.75c6c33a.1449b%40MacBook-Air-de-David.local.
>>>
>>>
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CADe%2BHd-RzJxTrtt8gVOS6cxa%3DXNZ%3Dwa%2Bv8Vnwnqigd5gfnJ0fw%40mail.gmail.com
>> .
>>
>> For more options, visit https://groups.google.com/gro

Re: How to configure and implement Synonyms with multi words.

2014-01-09 Thread Matt Weber
Here is a little example of query time multi-word synonyms:

https://gist.github.com/mattweber/7374591
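
For comparison with the query-time approach in the gist, an index-time 
setup would register the synonym sets from this thread as a synonym token 
filter, roughly like this (the filter and analyzer names are invented for 
the example):

```json
{
  "settings" : {
    "analysis" : {
      "filter" : {
        "designation_synonyms" : {
          "type" : "synonym",
          "synonyms" : [
            "software engineer, se",
            "senior software engineer, sse",
            "team lead, lead, tl"
          ]
        }
      },
      "analyzer" : {
        "designation_analyzer" : {
          "tokenizer" : "standard",
          "filter" : [ "lowercase", "designation_synonyms" ]
        }
      }
    }
  }
}
```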

Hope this helps.

Thanks,
Matt Weber



On Thu, Jan 9, 2014 at 12:56 AM, Jayesh Bhoyar wrote:

> Also I have another scenario where my index is having words like
>
> software engineer, se ---> this should get searched when I do a search on
> Software engineer
> team lead, lead, tl ---> this should get searched when I do a search on Team
> Lead
>
>
>
> Following are the query to create the records.
>
> curl -XPUT 
> 'http://localhost:9200/employee/test/11?pretty'
> -d '{"designation": "software engineer"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/12?pretty'
> -d '{"designation": "se"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/13?pretty'
> -d '{"designation": "sse"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/14?pretty'
> -d '{"designation": "senior software engineer"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/15?pretty'
> -d '{"designation": "team lead"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/16?pretty&refresh=true'
> -d '{"designation": "tl"}'
> curl -XPUT 
> 'http://localhost:9200/employee/test/17?pretty&refresh=true'
> -d '{"designation": "lead"}'
>
>
>
> On Thursday, January 9, 2014 2:12:05 PM UTC+5:30, Jayesh Bhoyar wrote:
>>
>> Hi,
>>
>> I have following Synonyms that I want to configure.
>>
>> software engineer => software engineer, se,
>> senior software engineer => senior software engineer, sse
>> team lead => team lead, lead, tl
>>
>> So that If I searched for se or Software Engineer it should return me the
>> records having software engineer.
>>
>> What mapping should I apply on the Designation field? And what query should
>> I fire to get the result?
>> Is it possible to use a multi_match query?
>>
>> Following are the query to create the records.
>>
>> curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d
>> '{"designation": "software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d
>> '{"designation": "software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d
>> '{"designation": "senior software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d
>> '{"designation": "senior software engineer"}'
>> curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d
>> '{"designation": "team lead"}'
>> curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true'
>> -d '{"designation": "team lead"}'
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/71574363-4a46-4471-be9e-6ef1b0938d60%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>



Re: allow_explicit_index and _bulk

2014-01-09 Thread Gabe Gorelick-Feldman
Opened an issue: https://github.com/elasticsearch/elasticsearch/issues/4668

On Thursday, January 9, 2014 3:39:39 AM UTC-5, Alexander Reelsen wrote:
>
> Hey,
>
> after having a very quick look, it looks like a bug (or wrong 
> documentation, need to check further). Can you create a github issue?
>
> Thanks!
>
>
> --Alex
>
>
> On Wed, Jan 8, 2014 at 11:08 PM, Gabe Gorelick-Feldman <
> gabego...@gmail.com > wrote:
>
>> The documentation on URL-based access 
>> control
>>  implies 
>> that _bulk still works if you set rest.action.multi.allow_explicit_index: 
>> false, as long as you specify the index in the URL. However, I can't get 
>> it to work.
>>
>> POST /foo/bar/_bulk
>> { "index": {} }
>> { "_id": 1234, "baz": "foobar" }
>>
>> returns 
>>
>> explicit index in bulk is not allowed
>>
>> Should this work?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/a0d1fa2f-0c28-4142-9f6d-4b28a1695bb3%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>



Re: Corrupt index creation when elasticsearch is killed just after index is created

2014-01-09 Thread joergpra...@gmail.com
Sorry my fault. I stand corrected. There is replication=sync and
consistency=all, but just for index creation that is triggered by a
document creation, and the index does not yet exist (auto creation). It's
not there for explicit index creation (where there is no document to be
created).

In case you explicitly execute index creation, you can add a master node
timeout to the operation, and if it exceeds, the operation will return that
it was not acknowledged by all nodes.

Jörg



Uncertain how to properly filter when certain fields do not always exist

2014-01-09 Thread Phil Barresi
 

I am trying to filter based on a field that, on some objects, does not 
exist. I was under the impression that ES would match objects that don't 
have that field.

Ultimately, I am trying to filter as such:

   - Field A will always exist, and should match on any of tags 1,2,3 
   - When it exists, either Field B or C must match any of tags 5,6,7 
   - When it exists, Field B must match any of tags 10, 11, 12 
   - When it exists, Field B or C must NOT have any of tags 15, 16, 18. 

In this case, all my tags are strings. In addition, fields B and C are 
nested inside another object. I am uncertain if that matters.

Essentially, my object is:

{
  "a": ["some", "tags", "here"],
  "X": {
    "B": ["more", "tags", "here"],
    "C": ["even", "more", "here"]
  }
}


Ultimately, I am trying to build a whitelist and blacklist filtering system.

However, when filtering this way, I do not get any results that do not 
contain field X.

How do I properly format this filter? 
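Filters like `terms` never match documents where the field is absent, which is why documents without field X drop out: each "when it exists" rule needs a `missing` clause OR-ed with the tag condition. A sketch of the 0.90 filter DSL covering the "field A" rule, the "when it exists, B must match" rule, and the blacklist rule (tag values and index name are placeholders; untested):

```sh
curl 'http://localhost:9200/myindex/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "bool": {
          "must": [
            { "terms": { "a": ["tag1", "tag2", "tag3"] } },
            { "or": [
                { "missing": { "field": "X.B" } },
                { "terms": { "X.B": ["tag10", "tag11", "tag12"] } }
            ] }
          ],
          "must_not": [
            { "terms": { "X.B": ["tag15", "tag16", "tag18"] } },
            { "terms": { "X.C": ["tag15", "tag16", "tag18"] } }
          ]
        }
      }
    }
  }
}'
```

The `must_not` clauses already behave as "when it exists" rules on their own, since a document without the field cannot match the inner `terms` filter.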

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/754097d9-fff7-4773-bac4-54f4fe3e5172%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: How's the encoding handling power of ES?

2014-01-09 Thread Jason Wee
There is an example of indexing and querying UTF-8 content in this SO answer:
http://stackoverflow.com/questions/8734888/how-to-search-for-utf-8-special-characters-in-elasticsearch

hth

Jason


On Thu, Jan 9, 2014 at 5:13 PM, HongXuan Ji  wrote:

> Hi all,
>
> I am wondering how Elasticsearch deals with documents in different
> encodings, such as different languages.
> Could you point me to a tutorial about it? Do I need to manually specify
> the encoding of the document when posting?
>
> Best,
>
> Ivan
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/7d6f7334-cc7d-4a37-88c5-6237f0d29b05%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHO4itx4%3DuWb3xD%2BbyqJ7Zoo1yTQKYkRoeKe6Hd7_DgehGjZpQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: how to stop running river plugin

2014-01-09 Thread David Pilato
When you delete the river (remove _meta doc).
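Concretely, a river is stopped by deleting its entry under the `_river` index; a sketch (the river name `my_river` is a placeholder):

```sh
curl -XDELETE 'http://localhost:9200/_river/my_river/'
```

As far as I know, the river's close() method is then invoked as part of the deletion, and also when the node shuts down.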

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 9 Jan 2014, at 14:58, shamsul haque  wrote:

> Hi,
> 
> I have configured and started a river with my ES. But how can I stop or 
> close a running river if I want to do so?
> I have seen the public void close() method in the River code; when does it get called?
> 
> Thanks
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/0fdecad2-aba3-4711-bec0-88216f014634%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0DFC95F7-DD6D-4B61-8EA2-4A2895B3818F%40pilato.fr.
For more options, visit https://groups.google.com/groups/opt_out.


how to stop running river plugin

2014-01-09 Thread shamsul haque
Hi,

I have configured and started a river with my ES. But how can I stop or 
close a running river if I want to do so?
I have seen the public void close() method in the River code; when does it get called?

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0fdecad2-aba3-4711-bec0-88216f014634%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Problem with highlight after upgrade from 0.20.4 to 0.90.9

2014-01-09 Thread Calle Arnesten
After experimenting, I have found that if I use just "name" and 
"description" rather than "task.name" and "task.description" in the "highlight" 
section, the highlighting works. So is "." not supported for 
highlights in 0.90.x? Or could I do it with some other syntax?
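One thing worth checking before blaming the dotted names: in the request quoted below, `size` and `highlight` are nested inside `query`, and `require_field_match` sits inside `fields`, both of which 0.90 may silently ignore. A sketch with those options as top-level siblings (the custom_filters_score wrapper is dropped for brevity; untested):

```sh
curl 'http://localhost:9200/board/_search' -d '{
  "query": {
    "query_string": {
      "query": "test",
      "fields": ["task.name^3", "task.description"]
    }
  },
  "size": 11,
  "highlight": {
    "encoder": "html",
    "require_field_match": true,
    "fields": {
      "task.name": { "fragment_size": 200 },
      "task.description": { "fragment_size": 100 }
    }
  }
}'
```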

If not, how do you solve the case where you want to query many types with a 
"name" field, but have different settings per type?

/Calle

On Thursday, 9 January 2014 at 11:31:31 UTC+1, Calle Arnesten wrote:
>
> Hi,
>
> I have updated ElasticSearch from 0.20.4 to 0.90.9 and have problem to get 
> the highlight to work.
>
> The following search body (JSON-stringified) worked in the old version:
> query: {
>   custom_filters_score: {
> query: {
>   query_string: {
>  query: 'test',
>  fields: ['task.name^3', 'task.description']
>   }
> }
>   },
>   size: 11,
>   highlight: {
> encoder: 'html',
> fields: {
>   _all: { number_of_fragments: 5 },
>   'task.name': {
> fragment_size: 200
>   },
>   'task.description': {
> fragment_size: 100
>   },
>   require_field_match: true
> }
>   }
> }
>
> In the new version it returns:
> {
> "took": 5,
> "timed_out": false,
> "_shards": {
> "total": 1,
> "successful": 1,
> "failed": 0
> },
> "hits": {
> "total": 2,
> "max_score": 1.5957302,
> "hits": [
> {
> "_index": "board",
> "_type": "task",
> "_id": "9160af7f92b9f5c769351d62650028e0",
> "_score": 1.5957302,
> "_source": {
> "name": "Test1",
> "description": ""
> }
> },
> {
> "_index": "board",
> "_type": "task",
> "_id": "9160af7f92b9f5c769351d6265003ae4",
> "_score": 1.5957302,
> "_source": {
> "name": "Test2",
> "description": ""
> 
> }
> }
> ]
> }
> }
>
> In 0.20.4 each item in the "hits" array contained a "highlight" property, 
> but now it doesn't. Why is that not included anymore?
>
> Any help is appreciated.
>
> /Calle
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/3ee83405-f2f6-4db0-bc13-3880d5cff450%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: SSL and org.elasticsearch.transport.NodeDisconnectedException

2014-01-09 Thread David Pilato
We don't have that at this time.
Basically, elasticsearch nodes usually sit in a backend layer, so securing 
the transport layer is rarely needed, and it also comes with a cost.

Could you secure your transmissions on a network level? 

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 9 Jan 2014, at 13:30, Maciej Stoszko  wrote:

> Thanks David,
> Does it mean that, at least currently, there is no avenue to secure transport 
> layer with SSL?
> Maciej
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/9a48efe8-143f-49ab-b216-7cca6f95f25e%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/A5E112F7-2C92-4C1A-A320-5973EE89757A%40pilato.fr.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Searching indexed fields without analysing

2014-01-09 Thread Chris H
Hi, Jun.

That doesn't seem to work.  For a user with the username bob.smith-jones:

   - bob.smith-jones -> matches
   - bob.smith- -> matches
   - bob.smi* -> matches
   - bob.smith-j* -> no results
   - bob.smith\-j* -> no results

Also, a "$" isn't one of the special characters.

Thanks.
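Two things may be going on here. With the logstash multi_field mapping quoted further down, the unanalyzed copy of the value lives in the `raw` subfield, so the query has to target it explicitly; and in the custom-analyzer template, the settings key should be `filter`, not `filters`. A sketch of querying the raw subfield (index name from the thread; untested):

```sh
# Query the not_analyzed "raw" subfield so hyphens and "$" are kept intact.
curl 'http://localhost:9200/logstash-2014.01.08/_search' -d '{
  "query": {
    "query_string": { "query": "User_Name.raw:\"w-dc-01$\"" }
  }
}'
```

The raw subfield is case-sensitive and keeps the trailing "$", so a query such as `NOT User_Name.raw:*$` should then exclude computer accounts.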

On Thursday, January 9, 2014 8:52:46 AM UTC, Jun Ohtani wrote:
>
> Hi Chris, 
>
> Could you try to escape “-“ in query for “not_analyzed” field? 
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters
>  
>
> I hope this helps. 
> Regards, 
>
>  
> Jun Ohtani 
> joh...@gmail.com  
> blog : http://blog.johtani.info 
> twitter : http://twitter.com/johtani 
>
>
>
>
> On 2014/01/09 at 17:20, Chris H > wrote: 
>
> > Hi, a bit more information. 
> > 
> > I tried adding a custom analyzer based off a recommendation I saw online 
> somewhere.  This partly works in that it's not tokenising.  But I can't do 
> wildcard searches in Kibana on the fields, and they're now case sensitive 
> :( 
> > 
> > curl localhost:9200/_template/logstash-username -XPUT -d '{ 
> > "template": "logstash-*", 
> > "settings" : { 
> > "analysis": { 
> > "analyzer": { 
> > "lc_analyzer": { 
> > "type": "custom", 
> > "tokenizer": "keyword", 
> > "filters": ["lowercase"] 
> > } 
> > } 
> > } 
> > }, 
> > "mappings": { 
> > "_default_": { 
> >  "properties" : { 
> > "User_Name" : { "type" : "string", "analyzer" : 
> "lc_analyzer" } 
> > } 
> > } 
> > } 
> > }' 
> > 
> > Thanks 
> > 
> > On Wednesday, January 8, 2014 3:26:03 PM UTC, Chris H wrote: 
> > Hi.  I've deployed elasticsearch with logstash and kibana to take in 
> Windows logs from my OSSEC log server, following this guide: 
> http://vichargrave.com/ossec-log-management-with-elasticsearch/ 
> > I've tweaked the logstash config to extract some specific fields from 
> the logs, such as User_Name.  I'm having some issues searching on these 
> fields though. 
> > 
> > These searches work as expected: 
> > • User_Name: * 
> > • User_Name: john.smith 
> > • User_Name: john.* 
> > • NOT User_Name: john.* 
> > But I'm having problems with Computer accounts, which take the format 
> "w-dc-01$" - they're being split on the "-" and the "$" is ignored.  So a 
> search for "w-dc-01" returns all the servers named "w-".  Also I 
> can't do "NOT User_Name: *$" to exclude computer accounts. 
> > 
> > The mappings are created automatically by logstash, and GET 
> /logstash-2014.01.08/_mapping shows: 
> > 
> > "User_Name": { 
> > 
> >"type": "multi_field", 
> >"fields": { 
> >   "User_Name": { 
> >  "type": "string", 
> >  "omit_norms": true 
> >   }, 
> >   "raw": { 
> >  "type": "string", 
> >  "index": "not_analyzed", 
> >  "omit_norms": true, 
> >  "index_options": "docs", 
> >  "include_in_all": false, 
> >  "ignore_above": 256 
> >   } 
> >} 
> > }, 
> > My (limited) understanding is that the "not_analyzed" should stop the 
> field being split, so that my searching matches the full name, but it 
> doesn't.  I'm trying both kibana and curl to get results. 
> > 
> > Hope this makes sense.  I really like the look of elasticsearch, but 
> being able to search on extracted fields like this is pretty key to me 
> using it. 
> > 
> > Thanks. 
> > 
> > 
> > 
> > -- 
> > You received this message because you are subscribed to the Google 
> Groups "elasticsearch" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an email to elasticsearc...@googlegroups.com . 
> > To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/96e74e53-54f9-48ec-9e5c-8f1354b264be%40googlegroups.com.
>  
>
> > For more options, visit https://groups.google.com/groups/opt_out. 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/22cf533e-eab8-468b-9b9a-55bbe12b3d62%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Spring elastic search and configuration for Mappings and _settings files

2014-01-09 Thread Ramdev Wudali
Hi David:
Now that you have the experiment2 index created with type NewTitles, can
you, with the same Spring configuration, create a different type in the same
index, for example experiment2/OldTitles?

If you can create a new type under an existing index without deleting the
data in the previous type, I would like to know how you go about it (that is
the problem I am trying to resolve).
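For what it's worth, outside of Spring this is a plain put-mapping call, and it does not touch documents already indexed under the other type; a sketch using the names from the thread (properties abbreviated; untested):

```sh
curl -XPUT 'http://localhost:9200/experiment/OldTitles/_mapping' -d '{
  "OldTitles": {
    "properties": {
      "DOC_ID": { "type": "string" },
      "TITLE":  { "type": "string" }
    }
  }
}'
```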


Thanks much

Ramdev


See my vizify bio!
[image: Ramdev Wudali's Visual
Thumbprint]


On Thu, Jan 9, 2014 at 2:52 AM, David Pilato  wrote:

> Your configuration looks good to me.
>
> I modified your spring file to add a node and change server location:
>
> 
>
>esNodes="localhost:9300"
>   forceMapping="true" properties="esProperties"/>
>
> I started you main() and the factory starts as expected.
> No error seen.
>
> Not sure where your issue came from.
>
>  --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> On 6 January 2014 at 15:18:42, Ramdev Wudali (agasty...@gmail.com) wrote:
>
> Hi David :
>Sorry for the delay in my response.. (the weekend chores took over).
> Here is my project (a tgz archive file) its a maven project so you should
> due able to import it into your IDE of choice (I have used IntelliJ, So you
> may find some of those artifacts as well).
>
> I have not included any data. (the data format is just Strings(titles)
>  one per line).  The path is specified in the spring config file. that is
> included in the resources folder.
>
>
> Please do let me know if you do find something…
>
>
> Thanks
>
> Ramdev
>
>
>
>
>
>
> On Fri, Jan 3, 2014 at 3:20 PM, David Pilato  wrote:
>
>>  Could you share your project or gist your files and source code?
>>
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 3 Jan 2014, at 22:08, Ramdev Wudali  wrote:
>>
>>   Hi David:
>>I setup the config to run on port 8200 and 8300 (instead of default
>> 9200 and 9300 as they were taken up by tomcat)
>>
>>
>>
>> On Fri, Jan 3, 2014 at 2:38 PM, David Pilato  wrote:
>>
>>>  Is it a typo?
>>>
>>>   esNodes="elasticsearch.server:8300"
>>>
>>>
>>> Should be 9300, right?
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>> On 3 Jan 2014, at 21:35, Ramdev Wudali  wrote:
>>>
>>>  Hi David:
>>>Thanks for the speedy response. Here is an update to my problem. I
>>> was trying to create a different type within the same index. (Index:
>>> experiment, type : Titles and I was trying to create Type : NewTitles ) I
>>> am not sure if this has any bearing on the problem.
>>>
>>> After posting the question on the group, I  went ahead and created a
>>> separate index  (experiment2) and within this new index, I created the
>>> Type: NewTitles.
>>>
>>> When I ran my application, there was no problems during the Spring
>>> elastic search client  initialization.
>>>
>>> This basically tells me there is a conflict in creation of a new Type
>>> under an existing index. (I am not able to figure out why there is a
>>> conflict).
>>>
>>> And I am not mixing versions of ElasticSearch between client and node.
>>> (both using 0.90.5)
>>>
>>>
>>> hope this helps
>>>
>>> Thanks
>>>
>>> Ramdev
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jan 3, 2014 at 2:29 PM, David Pilato  wrote:
>>>
  Any chance you are mixing elasticsearch versions between node and
 client?

 --
 David ;-)
 Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

 On 3 Jan 2014, at 20:16, Ramdev Wudali  wrote:

Hi All:
 I am trying to index a set of documents with the following mapping
 :
 {
 "NewTitles": {
 "properties": {
 "DOC_ID": {
 "type":"string"
 },
 "TITLE": {
 "type": "multi_field",
 "fields" : {
 "TITLE" : {
 "type" : "string"
 },
 "sortable" : {
 "type" : "string",
 "index" : "not_analyzed"
 },
 "autocomplete" : {
 "type" : "string",
 "index_analyzer" : "shingle_analyzer"
 }
 }
 }

use doc_count in query syntax?

2014-01-09 Thread abhi patel
Can I use the doc_count of a facet/aggregation in another query?
I want to apply conditions to doc_count and display results based on the 
outcome of those conditions.

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ee7725a5-7eb2-47b5-93a3-88e3e1f820d3%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: SSL and org.elasticsearch.transport.NodeDisconnectedException

2014-01-09 Thread Maciej Stoszko
Thanks David,
Does it mean that, at least currently, there is no avenue to secure transport 
layer with SSL?
Maciej

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9a48efe8-143f-49ab-b216-7cca6f95f25e%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: cassandra river plugin installation issue

2014-01-09 Thread shamsul haque
issue solved:

In the river code, data is fetched from Cassandra using
HFactory.createRangeSlicesQuery(keyspace, STR, STR, STR). The table I was
reading from had a primary key of type int; after changing it to text, the
river started pulling data from Cassandra into ES.

Thanks

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/050a535b-990b-4dd7-9af9-6caea0e6d4a5%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Logstash Embedded Elasticsearch not starting.

2014-01-09 Thread Andrew
Good Morning,


I am running a very basic config of logstash with the embedded 
elasticsearch.

I am able to launch logstash and the embedded elasticsearch successfully 
while using a local disk for the data directory. However, when I use the 
option -Des.path.data to specify a data directory located on an 
NFS share, elasticsearch will not start. I'm assuming this is a locking 
issue.

log4j, [2014-01-08T15:13:25.560]  INFO: org.elasticsearch.node: [Set] 
version[0.90.3], pid[16971], build[5c38d60/2013-08-06T13:18:31Z]
log4j, [2014-01-08T15:13:25.561]  INFO: org.elasticsearch.node: [Set] 
initializing ...
log4j, [2014-01-08T15:13:25.561] DEBUG: org.elasticsearch.node: [Set] using 
home [/srv/log/logstash], config [/srv/log/logstash/config], data 
[[/srv/log/logstash/data]], logs [/srv/log/logstash/logs], work 
[/srv/log/logstash/work], plugins [/srv/log/logstash/plugins]
log4j, [2014-01-08T15:13:25.567]  INFO: org.elasticsearch.plugins: [Set] 
loaded [], sites []
log4j, [2014-01-08T15:13:25.584] DEBUG: 
org.elasticsearch.common.compress.lzf: using [UnsafeChunkDecoder] decoder

and no further progress. Is there a workaround for this problem, as I have 
a requirement to use NFS for the data directory?



--
A.


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/9317c2aa-1c96-4fb7-b0b1-614289f537cc%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Architecture question re. routing and multi DC

2014-01-09 Thread Arik Fraimovich
For redundancy purposes, our system is split into two datacenters. One of 
the DCs is considered central where all the backoffice systems reside and 
the other is edge. Recently we started using Logstash with ElasticSearch 
and Kibana. The architecture we had is:

   - Scribe server on each instance in our cluster forwards logs to a main 
   scribe instance in the DC. 
   - If the DC is the edge, its main scribe instance forwards all logs to 
   the main scribe instance in central.
   - From the main (central) scribe server we forward message to Logstash, 
   which in turn get written to ES.

Because most logs are only stored but never retrieved, to reduce the 
traffic between DCs, we thought of using custom routing:

   - Have elastic search node in each DC (currently we have only one).
   - Tag each log message with the DC it's originated from and route the 
   log messages according to this tag, so each DC's log messages end up in its 
   own ES instance.

Will this work? Is this proper use of ElasticSearch's routing?
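One caveat: custom routing only selects a shard within a single cluster; it does not control which node (or DC) that shard is allocated on, so routing alone will not keep each DC's log messages local. Shard allocation filtering with per-DC indices is closer to the goal; a sketch (the tag and index names are placeholders; untested):

```sh
# In each node's elasticsearch.yml, tag the node with its datacenter:
#   node.datacenter: dc1
# Then pin a per-DC index to nodes carrying that tag:
curl -XPUT 'http://localhost:9200/logs-dc1/_settings' -d '{
  "index.routing.allocation.include.datacenter": "dc1"
}'
```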

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/0017a4a8-80ca-4fcb-97df-032f9d6858c9%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Terms facet on single field but also return associated id

2014-01-09 Thread Vincent van der walt
Hi, 

I'm wondering if anyone could help me.

We're doing a terms facet on a field which works fine. However I'm trying 
to also return the corresponding id of that field so I can provide click 
through functionality on my grid.

eg index = fieldid,fieldname

if I do a termsfacet I get

field1 10
field2 34
field3 50

but I'd like to return the id too rather than do another search and merge 
(very inefficient since I'm dealing with large amounts of data). Composite 
fields won't work for us since I'd have to constantly rebuild the index; 
the terms I'm using can be very dynamic.

Is this possible ?

Thanks

Vinny
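One workaround, at the cost of running a script per document, is to facet on a script that concatenates the id and the name, then split the key client-side; a sketch (field and index names from the example; untested):

```sh
curl 'http://localhost:9200/myindex/_search' -d '{
  "size": 0,
  "facets": {
    "by_field": {
      "terms": {
        "script_field": "doc[\"fieldid\"].value + \"|\" + doc[\"fieldname\"].value",
        "size": 50
      }
    }
  }
}'
```

Each returned term then looks like `10|field1`, carrying both the id and the name alongside the count.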




-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4317dae0-690b-4205-a5f7-ea8e5908661d%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Pls help me: i insert log to elasticsearch, but it use too much memory, how to solve it?thanks

2014-01-09 Thread xjj210130
Thanks David.
 
Now I don't use a mapping. I want to do the following with Elasticsearch:

  1: query product information according to some keys;
  2: query each user's product prices (the price differs per user);
  3: query some products for a user (some users don't have some products);
  ..., and so on.

The number of users is more than 3000.
Thanks again.
On Thursday, January 9, 2014 5:23:33 PM UTC+8, David Pilato wrote:
>
> I see. You probably have to merge mappings with very big mappings!
>
> What is your application searching for? Logs? Users?
>
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 9 Jan 2014, at 10:06, xjj2...@gmail.com  wrote:
>
> Thanks David.
> Yes, I tested it with curl. If the JSON data is not too big, there is
> no problem. The test JSON format is the following:
>
> {
>   "name": ["user1", "user2", "user3", ...],
>   "product": {},
>   "price": {}
> }
>
> The difference between the two JSON documents is that the second one
> includes many more key/value pairs, like the following:
>
> {
>   "name": ["user1", "user2", "user3", ...],
>   "product": {},
>   "price": {},
>   "attr": {
>     "user1": [{"costprice": "122"}, {"sellprice": "124"}, {"stock": "12"}, {"sell": "122"}, {}, {}],
>     "user2": [{"costprice": "122"}, {"sellprice": "124"}, {"stock": "12"}, {"sell": "122"}, {}, {}],
>     "user3": [{"costprice": "122"}, {"sellprice": "124"}, {"stock": "12"}, {"sell": "122"}, {}, {}],
>     ...
>   }
> }
>
> There are more than 3000 items in the attr key, so it uses too much memory.
> Thanks again.
>
>  
> On Thursday, January 9, 2014 3:15:59 PM UTC+8, David Pilato wrote:
>>
>> Just wondering if you are hitting the same RAM usage when inserting 
>> without thrift?
>> Could you test it?
>>
>> Could you gist as well what gives: 
>>
>> curl -XGET 'http://localhost:9200/_nodes?all=true&pretty=true'
>>
>>
>>
>> -- 
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
>> *
>> @dadoonet  | 
>> @elasticsearchfr
>>
>>
>> On 9 January 2014 at 07:11:33, xjj2...@gmail.com (xjj2...@gmail.com) 
>> wrote:
>>
>>  The env is following: 
>>  --elasticseasrch  v0.90(  i use 0.90.9 , the problem is still exist).
>>  -- java version is 1.7.0_45
>>
>> On Wednesday, January 8, 2014 6:58:02 PM UTC+8, xjj2...@gmail.com wrote: 
>>>
>>> Dear all: 
>>>I insert 1 logs to elasticsearch, each log is about 2M, and 
>>> there are about 3000 keys and values.
>>>  when i insert about 2, it used about 30G memory, and then 
>>> elasticsearch is very slow, and it's hard to insert log.
>>>  Could someone help me how to solve it? Thanks very much.
>>>  
>>  --
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/caec9b84-c543-4bb3-8cb0-e90113972716%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>  -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearc...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/d8d1c975-a9f2-47c6-97e4-54ba5f163284%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/917d78c7-c68c-4361-8fbf-016c00559196%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Problem with highlight after upgrade from 0.20.4 to 0.90.9

2014-01-09 Thread Calle Arnesten
Hi,

I have updated Elasticsearch from 0.20.4 to 0.90.9 and am having a problem 
getting highlighting to work.

The following search body (JSON-stringified) worked in the old version:
query: {
  custom_filters_score: {
query: {
  query_string: {
 query: 'test',
 fields: ['task.name^3', 'task.description']
  }
}
  },
  size: 11,
  highlight: {
encoder: 'html',
fields: {
  _all: { number_of_fragments: 5 },
  'task.name': {
fragment_size: 200
  },
  'task.description': {
fragment_size: 100
  },
  require_field_match: true
}
  }
}

In the new version it returns:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.5957302,
"hits": [
{
"_index": "board",
"_type": "task",
"_id": "9160af7f92b9f5c769351d62650028e0",
"_score": 1.5957302,
"_source": {
"name": "Test1",
"description": ""
}
},
{
"_index": "board",
"_type": "task",
"_id": "9160af7f92b9f5c769351d6265003ae4",
"_score": 1.5957302,
"_source": {
"name": "Test2",
"description": ""  
  
}
}
]
}
}

In 0.20.4 each item in the "hits" array contained a "highlight" property, 
but now it doesn't. Why is that not included anymore?

Any help is appreciated.

/Calle

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/600a60a9-58c0-4f0a-bfc6-29538ae72e14%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


unexpected behavior of pagination using offset and size

2014-01-09 Thread Chetana
My application has a pagination requirement for search, and I am using the 
offset and size options to implement it.
Clicking quickly through the pages sometimes returns no results at all.
Could the asynchronous search call be causing side effects like this?
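For reference, offset/size pagination maps to from/size in the request body, with from = (page - 1) * size; the sketch below shows page 3 with 10 hits per page (index name is a placeholder; untested):

```sh
curl 'http://localhost:9200/myindex/_search' -d '{
  "from": 20,
  "size": 10,
  "query": { "match_all": {} }
}'
```

If each click fires an asynchronous search, also make sure a slow response for an earlier page cannot arrive late and overwrite the newer page on the client.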
 
 
Thanks,
 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/356ba278-d5f5-4e8d-87ae-7d6c2d7ab13c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


issues with timestamp sorting

2014-01-09 Thread Chetana
I have created a timestamp field (not to be confused with _timestamp) and 
want to sort on it in descending order, but the result contains some 
records out of order.
 
Mapping and the sorting criteria look like :
"timestamp":{"type" 
:"date","format":"dateOptionalTime","include_in_all":false}
 
SearchRequestBuilder.addSort("timestamp", SortOrder.DESC);
 
 
The result I get is :
 
2013-12-26T09:14:09.617Z
,2013-12-26T12:01:07.389Z,2013-12-26T12:00:20.126Z,2013-12-26T11:59:15.594Z,2013-12-26T11:58:00.083Z,2013-12-26T11:55:52.372Z
 
Is it because of dateOptionalTime format or am I missing something?
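The format should not matter for sorting, since dates are indexed as millisecond longs regardless of dateOptionalTime. For comparison, the REST equivalent of the Java sort is (index name is a placeholder; untested):

```sh
curl 'http://localhost:9200/myindex/_search' -d '{
  "query": { "match_all": {} },
  "sort": [ { "timestamp": { "order": "desc" } } ]
}'
```

It may also be worth checking whether any document carries multiple values in the timestamp field, since multi-valued fields can sort unexpectedly.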
 
Thanks
 
 
 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6616a823-4421-4dbe-a440-a14fdce07e18%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Converting queries returning certain distinct records to ES

2014-01-09 Thread heather
 

Okay, thank you for your response. Here is an example of what 
I am trying to achieve.

Let's say I have the following documents:

{ "id": 1, "name": "peter",  "class": 2, "grade": "b", "hair": "grey"   }
{ "id": 2, "name": "paul",   "class": 2, "grade": "b", "hair": "purple" }
{ "id": 3, "name": "john",   "class": 1, "grade": "b", "hair": "grey"   }
{ "id": 4, "name": "sandra", "class": 1, "grade": "a", "hair": "green"  }
{ "id": 5, "name": "sarah",  "class": 1, "grade": "a", "hair": "green"  }

Initially I want to get only one student from each possible [class, grade] 
combination, so I want ES to return peter, john and sandra but not paul or 
sarah. The grades will range over the letters [a, b, c, d, e], but the class 
could be anything.

Additionally, I might want to add a condition to this, such as only getting 
students with green hair. In that case I would want to return only sandra: 
while sarah also has green hair, she has the same [class, grade] as sandra.

I thought about using facets for the first query, but I cannot see how that 
would give me a collection of the right ids to make the second query with. 
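To pin down the expected output, here is a client-side Python sketch of the "first student per [class, grade]" logic. It is not an ES query — just a reference that a facet/terms-based approach can be checked against, using the example documents above:

```python
# Client-side sketch of the desired result: keep the first matching
# document for each (class, grade) pair, with an optional extra condition.
students = [
    {"id": 1, "name": "peter",  "class": 2, "grade": "b", "hair": "grey"},
    {"id": 2, "name": "paul",   "class": 2, "grade": "b", "hair": "purple"},
    {"id": 3, "name": "john",   "class": 1, "grade": "b", "hair": "grey"},
    {"id": 4, "name": "sandra", "class": 1, "grade": "a", "hair": "green"},
    {"id": 5, "name": "sarah",  "class": 1, "grade": "a", "hair": "green"},
]

def first_per_combination(docs, condition=lambda d: True):
    """Return the name of the first matching doc per (class, grade) pair."""
    seen, result = set(), []
    for doc in docs:
        key = (doc["class"], doc["grade"])
        if key not in seen and condition(doc):
            seen.add(key)
            result.append(doc["name"])
    return result

print(first_per_combination(students))
# -> ['peter', 'john', 'sandra']
print(first_per_combination(students, lambda d: d["hair"] == "green"))
# -> ['sandra']  (sarah shares sandra's [class, grade], so she is dropped)
```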

On Thursday, January 9, 2014 7:57:09 AM UTC, David Pilato wrote:
>
> May be you could find a way to do that with a single query if you design 
> your documents in another way?
> Or using facets for the first query and Ids filter for the second?
> It's hard to tell without a concrete example of JSON documents.
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 9 janvier 2014 at 01:28:06, hea...@hodgetastic.com  (
> hea...@hodgetastic.com ) a écrit:
>
>  Hello
>
> I am currently trying to migrate an sql application to Elasticsearch. 
>
> I need to be able to select a collection of results from an index 
> which, for given search conditions, have distinct pairings of two certain 
> columns. In sql I do the following two queries:
>
> Query 1:
>  SELECT column_A, column_B, GROUP_CONCAT (table_name..id) id FROM 
> `table_name` WHERE `column_?` = '' GROUP BY column_A, column_B, 
> column_?
>  
> Query 2:
>   SELECT `table_name`.* FROM `table_name `  WHERE  `column_?` = 
> ''  AND (`table_name.id` IN ())
>  
> The first query returns me a list of ids from table_name such that each id 
> satisfies the condition `column_?` = '' and the record with that 
> id has a distinct [column_A,column_B]
>
> The second query then returns me all the records satisfying  `column_?` = 
> '' but only from that range of ids (I realise I probably do not 
> need to do `column_?` = ' again in the second query.)
>
> The result is that each record returned by the second query satisfies 
> the condition  `column_?` = ''  and I am only returned one 
> record for each [column_A,column_B] pairing.
>
> Since there is not really a 'distinct' option yet, I am having trouble 
> finding a way to replicate this output with ES and wondered if anyone might 
> have any thoughts on how I might go about it?
>
> At the moment I am open to any mapping / query combinations that will 
> achieve what I need.



Re: Pls help me: i insert log to elasticsearch, but it use too much memory, how to solve it?thanks

2014-01-09 Thread David Pilato
I see. You probably have to merge mappings with very big mappings!

What is your application searching for? Logs? Users?


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 9 janv. 2014 à 10:06, xjj210...@gmail.com a écrit :

> Thanks David .
>Yes , I test it with curl. If the json data is not too big, There is 
> no problem. The test json format is following:
> {
> "name":["user1","user2","user3",],
>  "product":{},
>   "price":{}
> }
> 
> The difference is the two json data is :
> The last json data include too many key/value, like the following:
> 
> {
> "name":["user1","user2","user3",],
>  "product":{},
>   "price":{},
> "attr":{
> "user1":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}],
> "user2":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}],
> "user3":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}]
> ..
> }
> }
> 
> There are more than 3000 items in attr key. So it used too many memory.
> Thanks again.
>  
> 
> On Thursday, January 9, 2014 3:15:59 PM UTC+8, David Pilato wrote:
>> 
>> Just wondering if you are hitting the same RAM usage when inserting without 
>> thrift?
>> Could you test it?
>> 
>> Could you gist as well what gives: 
>> 
>> curl -XGET 'http://localhost:9200/_nodes?all=true&pretty=true'
>> 
>> 
>> -- 
>> David Pilato | Technical Advocate | Elasticsearch.com
>> @dadoonet | @elasticsearchfr
>> 
>> 
>> Le 9 janvier 2014 at 07:11:33, xjj2...@gmail.com (xjj2...@gmail.com) a écrit:
>> 
>>>  The env is following:
>>>  -- elasticsearch v0.90 (I use 0.90.9; the problem still exists).
>>>  -- java version is 1.7.0_45
>>> 
>>> On Wednesday, January 8, 2014 6:58:02 PM UTC+8, xjj2...@gmail.com wrote:
 
 Dear all:
I insert 1 logs to elasticsearch, each log is about 2M, and 
 there are about 3000 keys and values.
  when i insert about 2, it used about 30G memory, and then 
 elasticsearch is very slow, and it's hard to insert log.
  Could someone help me how to solve it? Thanks very much.
> 



Re: No hit using scan/scroll with has_parent filter

2014-01-09 Thread Jean-Baptiste Lièvremont
Hi Martijn,

Thanks for your answer. You can find in the gist below some HTTP 
conversations made on my ES 0.90.6 node, as well as a link to the Java code 
responsible for the calls:
https://gist.github.com/jblievremont/8331460

Please note that the issue appears only when combining scan/scroll with 
has_parent filter, as it seems to work using a has_parent query instead.

Best regards,
-- Jean-Baptiste Lièvremont

Le jeudi 9 janvier 2014 00:18:14 UTC+1, Martijn v Groningen a écrit :
>
> Hi Jean,
>
> Can you share how you execute the scan request with the has_parent filter? 
> (via a gist or something like that)
>
> Martijn
>
>
> On 8 January 2014 15:17, Jean-Baptiste Lièvremont <
> jean-baptist...@sonarsource.com > wrote:
>
>> Hi folks,
>>
>> I use a parent/child mapping configuration which works flawlessly with 
>> "classic" search requests, e.g using has_parent to find child documents 
>> with criteria on the parent documents.
>>
>> I am trying to get all child document IDs that match a given set of 
>> criteria using scan and scroll, which also works well - until I introduce 
>> the has_parent filter, in which case the scroll request returns no hit 
>> (although total_hits is correct).
>>
>> Is it a known issue?
>>
>> I can provide sample mapping files and queries with associated/expected 
>> results. Please note that this behavior has been noticed on 0.90.6 but is 
>> still present in 0.90.9.
>>
>> Thanks, best regards,
>> -- Jean-Baptiste Lièvremont
>>
>>
>
>
>
> -- 
> Met vriendelijke groet,
>
> Martijn van Groningen 
>



Best way to match URLs

2014-01-09 Thread Johan Rask
Hi!

I am using ES together with logstash and we are indexing simple access log 
files.

Our problem is that we want to know the number of image views for a resource, 
which is determined by a specific REST url:

"GET /resource//image" => i.e "GET /resource/abcde/image"

This results in millions of different URLs that all mean the same thing: an 
image view.

Another problem is that there are other "unpredictable" resources under 
/image => "GET /resource/abcde/image/", so a search like

url:get AND url:resource AND url:image AND - url:

does not work, since I do not know what to exclude.

I was thinking about using a regexp for this; performance is not really a 
problem (at least not at the moment), since this is mainly for reporting. 
However, I have not been able to solve it.

If using a regex, should the field be analyzed or not_analyzed? I have tried 
both, using a template, but I am still unable to get it working:

"url" : {
    "type" : "multi_field",
    "fields" : {
        "name" :  {"type" : "string", "index" : "analyzed"},
        "facet" : {"type" : "string", "index" : "not_analyzed"}
    }
}
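For reference, here is a small Python sketch (not ES regexp syntax) of the distinction being described: an image view is exactly GET /resource/<one path segment>/image, with nothing after it. The [^/]+ segment pattern is an assumption about what the resource IDs look like. Note that Elasticsearch's regexp filter runs against the whole not_analyzed term, so a single-term "facet" field plus an equivalent pattern (without explicit anchors, which Lucene regex does not use) would be the natural place to apply it:

```python
import re

# An "image view" is GET /resource/<one path segment>/image and nothing more.
# The optional trailing slash and the [^/]+ ID pattern are assumptions.
IMAGE_VIEW = re.compile(r"^GET /resource/[^/]+/image/?$")

assert IMAGE_VIEW.match("GET /resource/abcde/image")
# Deeper paths under /image are *not* image views:
assert not IMAGE_VIEW.match("GET /resource/abcde/image/thumbnail")
# Other resources do not match either:
assert not IMAGE_VIEW.match("GET /resource/abcde/comments")
```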


Anyway, any suggestions about how to solve this would be highly appreciated.

Kind regards, Johan



Re: Filter and Query same taking some time

2014-01-09 Thread Arjit Gupta
I have 13 million documents, and with the same query I see the filter 
performing worse than the query: the filter takes 400 ms whereas the query 
takes 300 ms.

1. Filter

{
  "size" : 100,
  "query" : {
    "match_all" : { }
  },
  "filter" : {
    "bool" : {
      "must" : {
        "term" : {
          "color" : "red"
        }
      }
    }
  },
  "version" : true
}


2. Query

{
  "size" : 100,
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "color" : {
            "query" : "red",
            "type" : "boolean",
            "operator" : "AND"
          }
        }
      }
    }
  },
  "version" : true
}

Thanks ,
Arjit


On Thu, Jan 9, 2014 at 1:15 PM, David Pilato  wrote:

> Yeah 10 documents is not that much!
> Not sure if you can notice a difference here as probably everything could
> be loaded in file system cache.
>
> --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 9 janvier 2014 at 08:43:13, Arjit Gupta 
> (arjit...@gmail.com)
> a écrit:
>
> I have 100,000 documents  which are similar. In response I am getting the
> whole document not just Id.
> I am executing the query multiple times.
>
> Thanks ,
> Arjit
>
>
> On Thu, Jan 9, 2014 at 1:06 PM, David Pilato  wrote:
>
>>  You probably won't see any difference the first time you execute it
>> unless you are using warmers.
>>  With a second query, you should see the difference.
>>
>>  How many documents you have in your dataset?
>>
>>  --
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>> @dadoonet  | 
>> @elasticsearchfr
>>
>>
>> Le 9 janvier 2014 at 06:14:06, Arjit Gupta 
>> (arjit...@gmail.com)
>> a écrit:
>>
>>   Hi,
>>
>> I had implemented ES search query  for all our use cases but when i came
>> to know that some of our use cases can be solved by filters I implemented
>> that but I dont see any gain (in response time) in filters. My search
>> queries  are
>>
>> 1. Filter
>>
>> {
>>   "size" : 100,
>>   "query" : {
>> "match_all" : { }
>>   },
>>   "filter" : {
>> "bool" : {
>>   "must" : {
>> "term" : {
>>   "color" : "red"
>> }
>>   }
>> }
>>   },
>>   "version" : true
>> }
>>
>>
>> 2. Query
>>
>> {
>>   "size" : 100,
>>   "query" : {
>> "bool" : {
>>   "must" : {
>> "match" : {
>>   "color" : {
>> "query" : "red",
>> "type" : "boolean",
>> "operator" : "AND"
>>   }
>> }
>>   }
>> }
>>   },
>>   "version" : true
>> }
>>
>> By default the term query should be cached but I dont see a performance
>> gain.
>> Do i need to change some parameter also  ?
>> I am using ES  0.90.1 and with 16Gb of heap space given to ES.
>>
>> Thanks,
>> Arjit
>

How's the encoding handling power of ES?

2014-01-09 Thread HongXuan Ji
Hi all, 

I am wondering how Elasticsearch deals with documents in different encodings, 
such as documents in different languages. 
Could you point me to a tutorial about this? Do I need to manually specify 
the encoding of a document when posting it?
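For what it's worth, the usual client-side approach is to normalize everything to UTF-8 before posting, since the HTTP body is just bytes and JSON is conventionally UTF-8; a minimal Python sketch:

```python
import json

# Sketch: documents in any language can be sent with a single encoding by
# serializing to JSON and encoding the payload as UTF-8 bytes on the client.
doc = {"title": "日本語のタイトル", "body": "Texte en français: éèê"}
payload = json.dumps(doc, ensure_ascii=False).encode("utf-8")

# Round-trip check: decoding the UTF-8 bytes restores the original characters.
restored = json.loads(payload.decode("utf-8"))
```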

Best,

Ivan



Re: Pls help me: i insert log to elasticsearch, but it use too much memory, how to solve it?thanks

2014-01-09 Thread xjj210130
Thanks David.
Yes, I test it with curl. If the json data is not too big, there is no 
problem. The test json format is the following:

{
"name":["user1","user2","user3",],
"product":{},
"price":{}
}

The difference between the two json data is this: the last json data 
includes too many keys/values, like the following:

{
"name":["user1","user2","user3",],
"product":{},
"price":{},
"attr":{
"user1":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}],
"user2":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}],
"user3":[{"costprice":"122"},{"sellprice":"124"},{"stock":"12"},{"sell":"122"},{},{}],
..
}
}

There are more than 3000 items in the attr key, so it used too much memory.
Thanks again.
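One common way around this kind of memory growth (a sketch with hypothetical field names, not the poster's actual schema): having 3000+ distinct user names as JSON keys means 3000+ distinct fields in the mapping, which keeps growing heap and cluster state. Re-shaping attr so the user name becomes a value keeps the mapping at a handful of fixed fields no matter how many users there are:

```python
# Sketch: turn per-user object keys into an array of objects with a fixed
# "user" field, so the mapping stays small regardless of the user count.
attr = {
    "user1": {"costprice": "122", "sellprice": "124", "stock": "12", "sell": "122"},
    "user2": {"costprice": "122", "sellprice": "124", "stock": "12", "sell": "122"},
}

# Each entry now has the same four fixed fields plus "user" as a value.
reshaped = [{"user": name, **values} for name, values in sorted(attr.items())]
```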

 
On Thursday, January 9, 2014 3:15:59 PM UTC+8, David Pilato wrote:
>
> Just wondering if you are hitting the same RAM usage when inserting 
> without thrift?
> Could you test it?
>
> Could you gist as well what gives: 
>
> curl -XGET 'http://localhost:9200/_nodes?all=true&pretty=true'
>
>
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> Le 9 janvier 2014 at 07:11:33, xjj2...@gmail.com  (
> xjj2...@gmail.com ) a écrit:
>
>  The env is following: 
>  -- elasticsearch v0.90 (I use 0.90.9; the problem still exists).
>  -- java version is 1.7.0_45
>
> On Wednesday, January 8, 2014 6:58:02 PM UTC+8, xjj2...@gmail.com wrote: 
>>
>> Dear all: 
>>I insert 1 logs to elasticsearch, each log is about 2M, and 
>> there are about 3000 keys and values.
>>  when i insert about 2, it used about 30G memory, and then 
>> elasticsearch is very slow, and it's hard to insert log.
>>  Could someone help me how to solve it? Thanks very much.
>>  



Re: How to configure and implement Synonyms with multi words.

2014-01-09 Thread Jayesh Bhoyar
Also, I have another scenario where my index has designations like:

software engineer, se ---> these should be found when I search for Software 
Engineer
team lead, lead, tl ---> these should be found when I search for Team Lead


Following are the queries to create the records:

curl -XPUT 'http://localhost:9200/employee/test/11?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/12?pretty' -d '{"designation": "se"}'
curl -XPUT 'http://localhost:9200/employee/test/13?pretty' -d '{"designation": "sse"}'
curl -XPUT 'http://localhost:9200/employee/test/14?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/15?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/16?pretty&refresh=true' -d '{"designation": "tl"}'
curl -XPUT 'http://localhost:9200/employee/test/17?pretty&refresh=true' -d '{"designation": "lead"}'



On Thursday, January 9, 2014 2:12:05 PM UTC+5:30, Jayesh Bhoyar wrote:
>
> Hi,
>
> I have following Synonyms that I want to configure.
>
> software engineer => software engineer, se
> senior software engineer => senior software engineer, sse
> team lead => team lead, lead, tl
>
> So that if I search for se or Software Engineer it should return me the 
> records having software engineer.
>
> What mapping should I apply on the Designation field, and what query should 
> I fire to get the result? Is it possible to use a multi_match query?
>
> Following are the query to create the records.
>
> curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d 
> '{"designation": "software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d 
> '{"designation": "software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d 
> '{"designation": "senior software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d 
> '{"designation": "senior software engineer"}'
> curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d 
> '{"designation": "team lead"}'
> curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true' -d 
> '{"designation": "team lead"}'
>



Re: Searching indexed fields without analysing

2014-01-09 Thread Jun Ohtani
Hi Chris,

Could you try escaping "-" in the query against the "not_analyzed" field?

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters
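A minimal Python sketch of such escaping (the character list follows the reserved-characters page above; * and ? are deliberately left unescaped here so wildcard searches still work — an assumption about what you want):

```python
import re

# Escape query_string reserved characters with a backslash, except the
# wildcards * and ?, which are left usable on purpose.
RESERVED = re.compile(r'([+\-=&|><!(){}\[\]^"~:\\/])')

def escape_query_string(text):
    return RESERVED.sub(r"\\\1", text)

print(escape_query_string("w-dc-01$"))  # -> w\-dc\-01$  ($ is not reserved)
print(escape_query_string("john.*"))    # -> john.*      (wildcard preserved)
```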

I hope this helps.
Regards,


Jun Ohtani
joht...@gmail.com
blog : http://blog.johtani.info
twitter : http://twitter.com/johtani




2014/01/09 17:20、Chris H  のメール:

> Hi, a bit more information.
> 
> I tried adding a custom analyzer based off a recommendation I saw online 
> somewhere.  This partly works in that it's not tokenising.  But I can't do 
> wildcard searches in Kibana on the fields, and they're now case sensitive :(
> 
> curl localhost:9200/_template/logstash-username -XPUT -d '{
> "template": "logstash-*",
> "settings" : {
> "analysis": {
> "analyzer": {
> "lc_analyzer": {
> "type": "custom",
> "tokenizer": "keyword",
> "filters": ["lowercase"]
> }
> }
> }
> },
> "mappings": {
> "_default_": {
>  "properties" : {
> "User_Name" : { "type" : "string", "analyzer" : "lc_analyzer" 
> }
> }
> }
> }
> }'
> 
> Thanks
> 
> On Wednesday, January 8, 2014 3:26:03 PM UTC, Chris H wrote:
> Hi.  I've deployed elasticsearch with logstash and kibana to take in Windows 
> logs from my OSSEC log server, following this guide: 
> http://vichargrave.com/ossec-log-management-with-elasticsearch/
> I've tweaked the logstash config to extract some specific fields from the 
> logs, such as User_Name.  I'm having some issues searching on these fields 
> though.
> 
> These searches work as expected:
>   • User_Name: * 
>   • User_Name: john.smith
>   • User_Name: john.*
>   • NOT User_Name: john.*
> But I'm having problems with Computer accounts, which take the format 
> "w-dc-01$" - they're being split on the "-" and the "$" is ignored.  So a 
> search for "w-dc-01" returns all the servers named "w-".  Also I 
> can't do "NOT User_Name: *$" to exclude computer accounts.
> 
> The mappings are created automatically by logstash, and GET 
> /logstash-2014.01.08/_mapping shows:
> 
> "User_Name": {
> 
>"type": "multi_field",
>"fields": {
>   "User_Name": {
>  "type": "string",
>  "omit_norms": true
>   },
>   "raw": {
>  "type": "string",
>  "index": "not_analyzed",
>  "omit_norms": true,
>  "index_options": "docs",
>  "include_in_all": false,
>  "ignore_above": 256
>   }
>}
> },
> My (limited) understanding is that the "not_analyzed" should stop the field 
> being split, so that my searching matches the full name, but it doesn't.  I'm 
> trying both kibana and curl to get results.
> 
> Hope this makes sense.  I really like the look of elasticsearch, but being 
> able to search on extracted fields like this is pretty key to me using it.
> 
> Thanks.
> 
> 
> 





Re: Spring elastic search and configuration for Mappings and _settings files

2014-01-09 Thread David Pilato
Your configuration looks good to me.

I modified your spring file to add a node and change server location:

    

    

I started you main() and the factory starts as expected.
No error seen.

Not sure where your issue came from.

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


Le 6 janvier 2014 at 15:18:42, Ramdev Wudali (agasty...@gmail.com) a écrit:

Hi David :
   Sorry for the delay in my response (the weekend chores took over). Here is 
my project (a tgz archive file); it's a Maven project, so you should be able 
to import it into your IDE of choice (I have used IntelliJ, so you may find 
some of those artifacts as well). 

I have not included any data (the data format is just strings (titles), one 
per line). The path is specified in the Spring config file that is included 
in the resources folder.


Please do let me know if you find something…


Thanks

Ramdev 




See my vizify bio!



On Fri, Jan 3, 2014 at 3:20 PM, David Pilato  wrote:
Could you share your project or gist your files and source code?


--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 3 janv. 2014 à 22:08, Ramdev Wudali  a écrit :

Hi David:
   I setup the config to run on port 8200 and 8300 (instead of default 9200 and 
9300 as they were taken up by tomcat)




On Fri, Jan 3, 2014 at 2:38 PM, David Pilato  wrote:
Is it a typo?

esNodes="elasticsearch.server:8300"

Should be 9300, right?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 3 janv. 2014 à 21:35, Ramdev Wudali  a écrit :

Hi David:
   Thanks for the speedy response. Here is an update to my problem. I was 
trying to create a different type within the same index. (Index: experiment, 
type : Titles and I was trying to create Type : NewTitles ) I am not sure if 
this has any bearing on the problem. 

After posting the question on the group, I  went ahead and created a separate 
index  (experiment2) and within this new index, I created the Type: NewTitles. 

When I ran my application, there were no problems during the Spring 
Elasticsearch client initialization. 

This basically tells me there is a conflict in creation of a new Type under an 
existing index. (I am not able to figure out why there is a conflict). 

And I am not mixing versions of ElasticSearch between client and node. (both 
using 0.90.5)


hope this helps

Thanks

Ramdev







On Fri, Jan 3, 2014 at 2:29 PM, David Pilato  wrote:
Any chance you are mixing elasticsearch versions between node and client?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 3 janv. 2014 à 20:16, Ramdev Wudali  a écrit :

Hi All:
    I am trying to index a set of documents with the following mapping :
{
    "NewTitles": {
        "properties": {
            "DOC_ID": {
                "type":"string"
            },
            "TITLE": {
                "type": "multi_field",
                "fields" : {
                    "TITLE" : {
                        "type" : "string"
                    },
                    "sortable" : {
                        "type" : "string",
                        "index" : "not_analyzed"
                    },
                    "autocomplete" : {
                        "type" : "string",
                        "index_analyzer" : "shingle_analyzer"
                    }
                }
            }
        }
    }
}

(which resides in the src/main/es/experiment folder in my project)

and there is a _settings.json file which defines the shingle_analyzer  like so :

{
    "index" : {
        "analysis": {
            "filter": {
                "shingle_filter": {
                    "type": "shingle",
                    "min_shingle_size": 2,
                    "max_shingle_size": 5
                }
            },
            "analyzer": {
                "shingle_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "shingle_filter"
                    ]
                }
            }
        }
    }
}

I am initializing the Elasticsearch client using the spring elastic search  
like so :

    
        
    

    

The elastic Search instance already has the index : experiment and Type : Titles

When I run my app to index some new content, I get the an error during Spring 
initialization,  :

12:47:03.044 CST INFO  [main      ] 
f.p.s.e.ElasticsearchTransportClientFactoryBean - Starting ElasticSearch client
12:47:03.753 CST INFO  [main      ] org.elasticsearch.plugins - [Ringleader] 
loaded [], sites []
Exception in thread "main" 
org.springframework.beans.factory.BeanCreationException: Error creating bean 
with name 'esClient2': Invocation of init method failed; nested exception is 
org.elasticsearch.transport.TransportSerializationException: Failed to 
deserialize exception response from stream

Upon c

How to configure and implement Synonyms with multi words.

2014-01-09 Thread Jayesh Bhoyar
Hi,

I have following Synonyms that I want to configure.

software engineer => software engineer, se
senior software engineer => senior software engineer, sse
team lead => team lead, lead, tl

So that if I search for se or Software Engineer it should return me the 
records having software engineer.

What mapping should I apply on the Designation field, and what query should 
I fire to get the result? Is it possible to use a multi_match query?

Following are the queries to create the records:

curl -XPUT 'http://localhost:9200/employee/test/1?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/2?pretty' -d '{"designation": "software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/3?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/4?pretty' -d '{"designation": "senior software engineer"}'
curl -XPUT 'http://localhost:9200/employee/test/5?pretty' -d '{"designation": "team lead"}'
curl -XPUT 'http://localhost:9200/employee/test/6?pretty&refresh=true' -d '{"designation": "team lead"}'
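One common approach here is a synonym token filter with explicit contraction rules, applied to the designation field at index and search time. The following Python sketch only builds the settings/mappings body; the filter and analyzer names (designation_synonyms, designation_analyzer) are made up for illustration, and the synonym rule syntax is Elasticsearch's:

```python
import json

# Sketch of index settings wiring a multi-word synonym filter onto the
# designation field. Contraction rules ("se => software engineer") map the
# short forms onto one canonical phrase during analysis.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "designation_synonyms": {
                    "type": "synonym",
                    "synonyms": [
                        "se => software engineer",
                        "sse => senior software engineer",
                        "tl, lead => team lead",
                    ],
                }
            },
            "analyzer": {
                "designation_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "designation_synonyms"],
                }
            },
        }
    },
    "mappings": {
        "test": {
            "properties": {
                "designation": {
                    "type": "string",
                    "analyzer": "designation_analyzer",
                }
            }
        }
    },
}

body = json.dumps(settings)  # would be sent when creating the index
```

With both index-time and search-time analysis using this analyzer, a match query for "se" and for "Software Engineer" should analyze to the same tokens.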



Re: allow_explicit_index and _bulk

2014-01-09 Thread Alexander Reelsen
Hey,

after having a very quick look, it looks like a bug (or wrong documentation;
I need to check further). Can you create a GitHub issue?

Thanks!
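For reference, this is the configuration and the request form the documentation
implies should work (a sketch assuming the standard elasticsearch.yml location;
the reported failure suggests the actual behaviour may not match):

```shell
# In elasticsearch.yml, disallow explicit index names in request bodies:
#   rest.action.multi.allow_explicit_index: false

# Index and type then come from the URL only; the bulk action lines stay empty:
curl -XPOST 'http://localhost:9200/foo/bar/_bulk' -d '
{ "index": {} }
{ "_id": 1234, "baz": "foobar" }
'
```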


--Alex


On Wed, Jan 8, 2014 at 11:08 PM, Gabe Gorelick-Feldman <
gabegorel...@gmail.com> wrote:

> The documentation on URL-based access control implies
> that _bulk still works if you set rest.action.multi.allow_explicit_index:
> false, as long as you specify the index in the URL. However, I can't get
> it to work.
>
> POST /foo/bar/_bulk
> { "index": {} }
> { "_id": 1234, "baz": "foobar" }
>
> returns
>
> explicit index in bulk is not allowed
>
> Should this work?
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a0d1fa2f-0c28-4142-9f6d-4b28a1695bb3%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGCwEM990rKacVv7DQ6eeJRciwLwGRiA8OezUYs8xqE17vrGgA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Kibana Static Dashboard ?

2014-01-09 Thread Mark Walkom
Then you can put it in $KIBANA_ROOT/app/dashboards and load it from there.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
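Concretely, the flow might look like this (the paths and the file-loading URL
are assumptions based on Kibana 3's layout, not stated in this thread):

```shell
# Export the dashboard (Save -> Advanced -> "Export as schema"), then copy
# the resulting JSON into Kibana's static dashboards directory:
cp guided-custom.json $KIBANA_ROOT/app/dashboards/

# It can then be loaded directly from file rather than from elasticsearch:
#   http://<kibana-host>/index.html#/dashboard/file/guided-custom.json
```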


On 9 January 2014 19:32, vineeth mohan  wrote:

> Hello Jay ,
>
> Some advice here.
> You can save Kibana dashboard as a static file too.
> Follow these steps
>
> Save -> Advanced -> "Export as schema"
>
> Thanks
>   Vineeth
>
>
> On Thu, Jan 9, 2014 at 3:16 AM, Jay Wilson  wrote:
>
>> As I understand Kibana when a dashboard is saved, it is placed into
>> elasticsearch. I don't want it in elasticsearch. I want it in a static file.
>>
>>
>>
>>
>> On Wednesday, January 8, 2014 2:32:50 PM UTC-7, vineeth mohan wrote:
>>
>>> Hello Jay ,
>>>
>>> Can't you do the same from the Kibana side by adding a must_not filter?
>>> Once you save that dashboard, you can always go back to the same
>>> link to see the same static dashboard.
>>>
>>> Thanks
>>>  Vineeth
>>>
>>>
>>> On Thu, Jan 9, 2014 at 2:42 AM, Jay Wilson  wrote:
>>>
 I am modifying the guided.json dashboard. Down in the Events panel I would
 like to tell Kibana to statically filter out specific records. I tried
 adding this to the file:

   "query": {
     "filtered": {
       "query": {
         "bool": {
           "should": [
             {
               "query_string": {
                 "query": "record-type: traffic-stats"
               }
             }
           ]
         }
       }
     }
   },

 Doesn't appear to work.


  --
 You received this message because you are subscribed to the Google
 Groups "elasticsearch" group.
 To unsubscribe from this group and stop receiving emails from it, send
 an email to elasticsearc...@googlegroups.com.

 To view this discussion on the web visit https://groups.google.com/d/
 msgid/elasticsearch/83bd80b2-5a61-4a15-b359-125fd600f3cd%
 40googlegroups.com.
 For more options, visit https://groups.google.com/groups/opt_out.

>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/f207abdd-9fce-4379-aa9a-dd1dd35aa398%40googlegroups.com
>> .
>>
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAGdPd5k0zxt4G%2BWquQt8-iwY%2Bk_kwWgcOrGpB3Yd0%2BOkfr6fvg%40mail.gmail.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAEM624Z4jBX6Yi5GLpLMBH0zCJh6RsRqWXSSb9-4m7%3DrUdOqTQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Kibana Static Dashboard ?

2014-01-09 Thread vineeth mohan
Hello Jay ,

Some advice here.
You can save Kibana dashboard as a static file too.
Follow these steps

Save -> Advanced -> "Export as schema"

Thanks
  Vineeth


On Thu, Jan 9, 2014 at 3:16 AM, Jay Wilson  wrote:

> As I understand Kibana when a dashboard is saved, it is placed into
> elasticsearch. I don't want it in elasticsearch. I want it in a static file.
>
>
>
>
> On Wednesday, January 8, 2014 2:32:50 PM UTC-7, vineeth mohan wrote:
>
>> Hello Jay ,
>>
>> Can't you do the same from the Kibana side by adding a must_not filter?
>> Once you save that dashboard, you can always go back to the same
>> link to see the same static dashboard.
>>
>> Thanks
>>  Vineeth
>>
>>
>> On Thu, Jan 9, 2014 at 2:42 AM, Jay Wilson  wrote:
>>
>>>  I am modifying the guided.json dashboard. Down in the Events panel I would
>>> like to tell Kibana to statically filter out specific records. I tried
>>> adding this to the file:
>>>
>>>   "query": {
>>>     "filtered": {
>>>       "query": {
>>>         "bool": {
>>>           "should": [
>>>             {
>>>               "query_string": {
>>>                 "query": "record-type: traffic-stats"
>>>               }
>>>             }
>>>           ]
>>>         }
>>>       }
>>>     }
>>>   },
>>>
>>> Doesn't appear to work.
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>>
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/elasticsearch/83bd80b2-5a61-4a15-b359-125fd600f3cd%
>>> 40googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f207abdd-9fce-4379-aa9a-dd1dd35aa398%40googlegroups.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGdPd5k0zxt4G%2BWquQt8-iwY%2Bk_kwWgcOrGpB3Yd0%2BOkfr6fvg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Elasticsearch Missing Data

2014-01-09 Thread Alexander Reelsen
Hey,

a couple of things:

1. Did you check the log files? Most likely in /var/log/elasticsearch if
you use the packages. Is there anything suspicious at the time of your
outage? Please check your master node as well, if you have one (I can't tell
from the cluster health output whether it is a master or a client node).
2. Why should elasticsearch pull your data? Is there any special configuration
you didn't mention? Or what exactly do you mean here?
3. Happy to debug your issue with the init script. The elasticsearch.yml
file should be in /etc/elasticsearch/ and not in /etc - anything manually
moved around? Can you still reproduce it?
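A few diagnostics that may help here (standard endpoints; the log path assumes
the package layout mentioned above):

```shell
# Recent server log output around the time of the outage:
tail -n 200 /var/log/elasticsearch/*.log

# Cluster health, as already posted:
curl 'http://localhost:9200/_cluster/health?pretty=true'

# Per-node info, to confirm which settings each node actually loaded
# (useful for the elasticsearch2 configuration problem):
curl 'http://localhost:9200/_nodes?pretty=true'
```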


--Alex




On Wed, Jan 8, 2014 at 8:10 PM, Eric Luellen  wrote:

> Hello,
>
> I've had my elasticsearch instance running for about a week with no
> issues, but last night it stopped working. When I went to look in Kibana,
> it stops logging around 20:45 on 1/7/14. I then restarted the service on
> both both elasticsearch servers and it started logging again and back
> pulled some logs from 07:10 that morning, even though I restarted the
> service around 10:00. So my questions are:
>
> 1. Why did it stop working? I don't see any obvious errors.
> 2. When I restarted it, why didn't it go back and pull all of the data and
> not just some of it? I see that there are no unassigned shards.
>
> curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
> {
>   "cluster_name" : "my-elasticsearch",
>   "status" : "green",
>   "timed_out" : false,
>   "number_of_nodes" : 3,
>   "number_of_data_nodes" : 2,
>   "active_primary_shards" : 40,
>   "active_shards" : 80,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 0
>
> Are there any additional queries or logs I can look at to see what is
> going on?
>
> On a slight side note, when I restarted my 2nd elasticsearch server it
> isn't reading from the /etc/elasticsearch.yml file like it should. It isn't
> creating the node name correctly or putting the data files in the spot I
> have configured. I'm using CentOS and doing everything via
> /etc/init.d/elasticsearch on both servers and the elasticsearch1 server
> reads everything correctly but elasticsearch2 does not.
>
> Thanks for your help.
> Eric
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/fc191ee4-b312-4c52-89d9-de04c4309b65%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGCwEM8EOWdC5esVkfZ5hogocQkgreJBQUbF2zE7s-gGCt4NdQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: logstash vs rivers for reading data from SQL Server

2014-01-09 Thread Alexander Reelsen
Hey,

maybe you should ask your developers why they recommended logstash for
this; I can't follow here (perhaps there is some export functionality in
your SQL Server that a logstash input can use?). I would be interested
in the reasons in this case.
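For completeness, the river-based approach was commonly done at the time with
the community JDBC river plugin (a hedged sketch: the river name, connection
string, credentials, and SQL below are all placeholder assumptions; check the
plugin's own documentation for the exact install and options):

```shell
# Register a JDBC river that polls a SQL Server table into an index.
curl -XPUT 'http://localhost:9200/_river/my_jdbc_river/_meta' -d '{
  "type": "jdbc",
  "jdbc": {
    "url": "jdbc:sqlserver://localhost:1433;databaseName=mydb",
    "user": "es_reader",
    "password": "secret",
    "sql": "select * from employees"
  }
}'
```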


--Alex


On Wed, Jan 8, 2014 at 5:26 PM, jsp  wrote:

> Hi,
> I am looking at implementing ES to index & query data that I get from my
> SQL Server databases/tables.  I was initially using river to read data from
> Sql server tables but one of the developers in my team recommended looking
> at using logstash. Can anyone comment to any benefits of using one over
> another?   I have not been able to find any documentation regarding reading
> data from SQL server using logstash .
>  Can someone point me to a guide on how to get started with logstash & sql
> server.
> Thanks
> J
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1173b62c-afd3-4d2d-9a3f-ba423ed7ede4%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAGCwEM-X070J9GKmx7ysce1KCnX5yinMoeDhkkwHfDV%3D_BqDwA%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Searching indexed fields without analysing

2014-01-09 Thread Chris H
Hi, a bit more information.

I tried adding a custom analyzer based on a recommendation I saw online
somewhere. This partly works, in that it's not tokenising. But I can't do
wildcard searches in Kibana on the fields, and they're now case sensitive :(

curl localhost:9200/_template/logstash-username -XPUT -d '{
"template": "logstash-*",
"settings" : {
"analysis": {
"analyzer": {
"lc_analyzer": {
"type": "custom",
"tokenizer": "keyword",
"filters": ["lowercase"]
}
}
}
},
"mappings": {
"_default_": {
 "properties" : {
"User_Name" : { "type" : "string", "analyzer" : 
"lc_analyzer" }
}
}
}
}'
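One likely culprit (an assumption on my part, not confirmed in this thread):
in a custom analyzer the token-filter list key is "filter", not "filters", so
the lowercase filter above is probably being silently ignored, which would
explain the case sensitivity. A corrected sketch of the same template:

```shell
curl localhost:9200/_template/logstash-username -XPUT -d '{
  "template": "logstash-*",
  "settings": {
    "analysis": {
      "analyzer": {
        "lc_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "_default_": {
      "properties": {
        "User_Name": { "type": "string", "analyzer": "lc_analyzer" }
      }
    }
  }
}'
```

Separately, the auto-created multi_field mapping quoted below already has a
not_analyzed "raw" subfield, so querying User_Name.raw may match the full,
unsplit value without any template changes at all.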

Thanks

On Wednesday, January 8, 2014 3:26:03 PM UTC, Chris H wrote:
>
> Hi.  I've deployed elasticsearch with logstash and kibana to take in 
> Windows logs from my OSSEC log server, following this guide: 
> http://vichargrave.com/ossec-log-management-with-elasticsearch/
> I've tweaked the logstash config to extract some specific fields from the 
> logs, such as User_Name.  I'm having some issues searching on these fields 
> though.
>
> These searches work as expected:
>
>- User_Name: * 
>- User_Name: john.smith
>- User_Name: john.*
>- NOT User_Name: john.*
>
> But I'm having problems with Computer accounts, which take the format 
> "w-dc-01$" - they're being split on the "-" and the "$" is ignored.  So a 
> search for "w-dc-01" returns all the servers named "w-".  Also I 
> can't do "NOT User_Name: *$" to exclude computer accounts.
>
> The mappings are created automatically by logstash, and GET 
> /logstash-2014.01.08/_mapping shows:
>
> "User_Name": {
>
>"type": "multi_field",
>"fields": {
>   "User_Name": {
>  "type": "string",
>  "omit_norms": true
>   },
>   "raw": {
>  "type": "string",
>  "index": "not_analyzed",
>  "omit_norms": true,
>  "index_options": "docs",
>  "include_in_all": false,
>  "ignore_above": 256
>   }
>}
> },
>
> My (limited) understanding is that the "not_analyzed" should stop the 
> field being split, so that my searching matches the full name, but it 
> doesn't.  I'm trying both kibana and curl to get results.
>
> Hope this makes sense.  I really like the look of elasticsearch, but being 
> able to search on extracted fields like this is pretty key to me using it.
>
> Thanks.
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/96e74e53-54f9-48ec-9e5c-8f1354b264be%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.