geo_shape filter in post_filter

2014-02-11 Thread HeonKyu Park
I have two types of documents: - *poi*: 4 million documents, each with a point (lon, lat) field - *region*: each has a polygon field. I'd like to search POI documents in a certain region by using the geo_shape filter, so I wrote the query below. { "query": { "filtered": { "query": {

Re: ES sorting not working when use min_score in query

2014-02-11 Thread Vallabh Bothre
Thanks for replying, Alex. Here is my code so that you can see what I am using. First, creating the index: curl -X PUT 'http://localhost:9200/adminvenue/?pretty=true' -d ' { "settings" : { "analysis" : { "analyzer" : { "venue_analyzer" :

Re: Computing idf in elasticsearch

2014-02-11 Thread sunayana choudhary
Thanks Brin, your answer solved my problem. Thanks to you too, Ivan. I have 5 shards, and idf is calculated from the maxDocs present in each shard. Doesn't that lead to a misleading idf? On Tuesday, 11 February 2014 20:10:52 UTC+5:30, sunayana choudhary wrote: > > Hi all, > > I have been an

Re: fieldNorm & queryNorm in explain api

2014-02-11 Thread Ivan Brusic
The norms are calculated by Lucene's TFIDF algorithm: http://lucene.apache.org/core/4_6_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html The field norm is encoded into a single byte, so there is a loss of precision. -- Ivan On Tue, Feb 11, 2014 at 9:32 PM, wrote: > Hi, > > I
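
The loss of precision mentioned above can be illustrated with a small sketch. It follows, as far as I understand it, the single-byte float encoding that Lucene 4.x applies to the length norm 1/sqrt(numTerms) in its default similarity. The Python function names are made up for this illustration and are not part of any Lucene or Elasticsearch API.

```python
import struct

# Offset used by the encoding (assumption: mirrors Lucene 4.x SmallFloat "315" variant).
FZERO = (63 - 15) << 3


def float_to_byte315(value):
    """Squeeze a 32-bit float into one unsigned byte (0..255), dropping low bits."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> 21
    if small <= FZERO:
        # Zero or negative maps to 0; tiny positive values map to the smallest code.
        return 0 if bits <= 0 else 1
    if small >= FZERO + 0x100:
        return 255  # overflow: clamp to the largest code
    return small - FZERO


def byte315_to_float(code):
    """Expand the one-byte code back into a float."""
    if code == 0:
        return 0.0
    bits = ((code & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Length norms for fields of 3 and 4 terms collapse to the same stored value.
for num_terms in (1, 3, 4):
    norm = num_terms ** -0.5
    print(num_terms, norm, byte315_to_float(float_to_byte315(norm)))
```

Under these assumptions a 3-term field and a 4-term field end up with the same fieldNorm in explain output, which is the kind of rounding the reply refers to.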

Re: elastic result search by text with nearest distance using latitude and longitude

2014-02-11 Thread Vallabh Bothre
Thanks for replying, Binh. Yes, the first 25 matches have closer distances (lat, lon values), but they are not relevant matches. For example, when I search for "palexpo" from San Francisco (37.77519600,-122.41920400) it gives me: 1. PLUGZ -- 37.80664300 -122.41628300 -- 2.181 miles from San Francisco 2.

fieldNorm & queryNorm in explain api

2014-02-11 Thread usha2626
Hi, I want to know what are fieldNorm and queryNorm in the output displayed by the explain api..how are they calculated? Thanks Usha -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails fro

query related to explain api

2014-02-11 Thread Navneet Mathpal
Hi, what are the queryWeight and fieldWeight values that we get in the output of the explain API, and how are they calculated? Thanks Navneet Mathpal

query related to explain api

2014-02-11 Thread Navneet Mathpal
Hi, what are queryWeight and fieldWeight, and how are they calculated? I am getting the following results. Thanks Navneet Mathpal

Re: Data Loss

2014-02-11 Thread Mark Walkom
Split brain would be one of the main ones I can think of. I know some people have had issues with primary shards not initialising, though I am not sure what would cause that. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmon

Re: Optimized Rest requests in case of embedded elastic search

2014-02-11 Thread Jay Modi
Look at the Java API. From my look through elasticsearch code, it does not generate HTTP requests but routes them through services. How many logs do you generate on average? Do your servers have enough resources to

Re: ElasticSearch server lock up

2014-02-11 Thread Jay Modi
I'll definitely post back with what we find in the logs with the updated version. The bulk indexes do not contain a delete by query; we're just converting some Java objects to a XContentBuilder and sending them to be indexed like so: BulkRequestBuilder bulkRequest = client.prepareBulk(); for (1

Re: Newbies and SUSE/openSUSE Users

2014-02-11 Thread Mark Walkom
Nice work! Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 12 February 2014 11:40, Tony Su wrote: > On my openSUSE wiki, I have created and updated a couple pages... > > > *For Beginners learning Elasticsearch, Log

Newbies and SUSE/openSUSE Users

2014-02-11 Thread Tony Su
On my openSUSE wiki, I have created and updated a couple pages... *For Beginners learning Elasticsearch, Logstash and Kibana* *http://en.opensuse.org/User:Tsu2/elasticsearch_1.0* Previews and describes some fundamental concepts when running the

Re: Trying to optimize configuration for better cluster restart/recovery

2014-02-11 Thread Tony Su
Cool. Thx all. Tony On Tuesday, February 11, 2014 10:38:18 AM UTC-8, Binh Ly wrote: > Tony, > > What you are seeing with the shard recovery is normal - but doesn't mean > it couldn't use more improvement in the future. For now you can throttle > the recovery using a combination of settings (bu

Re: Elasticsearch with Hadoop HDFS.

2014-02-11 Thread Jong Min Kim
Thanks. Every time I post a question, I get knowledgeable answers. :) On Tuesday, February 11, 2014 at 8:04:08 PM UTC+9, Costin Leau wrote: > > Hi, > > > On 11/02/2014 6:40 AM, Jong Min Kim wrote: > > I was searching for info about ES with HDFS. What I see is, using ES with > Hadoop does not mean using HDFS as main

Correctly indexing data into one place with multiple analyzers

2014-02-11 Thread Kevin Claggett
I have some documents with ~30 fields, most of which I just want to analyze with the defaults, and a couple I want to use snowballing or other custom analyzers on. The recommended way to do this seems to be using the index_name property to alias a custom _all field, such as: curl -XPOST $elasticl

how to search on number of nested terms matches?

2014-02-11 Thread Colin Surprenant
I am trying to figure if/how it is possible to craft a specific query using nested objects: For example, given a simple author with nested books mapping: { "author":{ "properties" : { "name" : { "type" : "string" }, "books" : { "type" : "nested", "properties" :

Re: Elasticsearch issue.

2014-02-11 Thread Binh Ly
Forgot to mention, Marvel only works with ES 0.90.9 and later. Just FYI.

Re: Elasticsearch issue.

2014-02-11 Thread Binh Ly
Chris, you're right, I'm doing it on a newer version. For your case, try: curl "localhost:9200/_cluster/state?pretty" You'll get a lot more info but just look under the routing_table and routing_nodes sections for the details I mentioned before.

Re: Elasticsearch issue.

2014-02-11 Thread Chris
Hi Binh, That command did not seem to work. I am running version 0.90.6; is that supported in this version? $ curl http://server:9200/_cluster/state/routing_table?pretty { "error" : "IndexMissingException[[_cluster] missing]", "status" : 404 } Thanks,

Re: Elasticsearch issue.

2014-02-11 Thread Chris
I am using bigdesk and Marvel, which I just installed today. I am running version 0.90.6 of Elasticsearch and I am not getting data back from Marvel. I want to upgrade to the most recent version; however, I want to resolve this issue first. Do you know how to assign primary shards? Thanks, Chris. O

Re: common terms query with cutoff_frequency

2014-02-11 Thread vinamar
Hi Alex, I used the multi_match query to query with stop words, but the returned results don't contain the stop words. I used exact match with double quotes on the query string. Also, the highlighted results don't contain the stop words. "query": { "multi_match":

Re: Elasticsearch issue.

2014-02-11 Thread Binh Ly
Chris, You'll probably need to find out which node contains whichever shards you think are bad. If you do something like this, you can get a detailed breakdown of which indexes have which shards on which nodes and their corresponding shard states: curl "localhost:9200/_cluster/state/routin
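
As a rough illustration of reading that breakdown, the sketch below walks an already-parsed cluster state and lists shard copies that are not started. It assumes the routing_table layout indices -> index name -> shards -> shard number -> list of copies with "state" and "primary" keys; the helper name and the sample data are hypothetical, so check them against what your own cluster returns.

```python
def shards_not_started(cluster_state):
    """Return (index, shard, is_primary, state) for every shard copy not in STARTED state."""
    problems = []
    indices = cluster_state.get("routing_table", {}).get("indices", {})
    for index_name, index_entry in sorted(indices.items()):
        for shard_id, copies in sorted(index_entry.get("shards", {}).items()):
            for shard_copy in copies:
                if shard_copy.get("state") != "STARTED":
                    problems.append((index_name, int(shard_id),
                                     bool(shard_copy.get("primary")),
                                     shard_copy.get("state")))
    return problems


# Hypothetical, heavily trimmed cluster state used only to exercise the helper.
sample_state = {
    "routing_table": {"indices": {"indexdata": {"shards": {
        "0": [{"state": "STARTED", "primary": True, "node": "node-a"},
              {"state": "STARTED", "primary": False, "node": "node-b"}],
        "1": [{"state": "UNASSIGNED", "primary": True, "node": None}],
    }}}}
}

for entry in shards_not_started(sample_state):
    print(entry)
```

An unassigned primary in this listing is what puts the cluster into the red state discussed elsewhere in the thread.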

Re: Elasticsearch issue.

2014-02-11 Thread Mark Walkom
Not if your cluster is in a red state, that means you have unassigned primary shards. What are you using to monitor things? If you're only using the API then look at plugins like elastichq, kopf, bigdesk or marvel. They will give you better insight into what is happening. Regards, Mark Walkom In

Re: Elasticsearch issue.

2014-02-11 Thread Chris
Thanks for the feedback Mark / Binh, I am not sure if it is a single node that is causing the problem. Querying _cluster/health/indexdata?level=shards gives me this response below. Is deleting the data from the bad node, consistent when the shards are in the state as below? { "cluster_name

Re: Elasticsearch issue.

2014-02-11 Thread Mark Walkom
The bad node is the one that ran out of space. If you have installed ES on linux using a package (deb/rpm) then the data is usually under /var/lib/elasticsearch. Just manually delete it and then rejoin the node. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmo

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread InquiringMind
Raffaele, Thanks! Also, note the range (that is, maximum distance traveled with appropriate fuel minimums) of the following aircraft. All are in nm(nautical miles): http://www.gulfstream.com/careers/our_products.html Cool stuff! (Now back to work!) Brian *On Monday, February 10, 2014 6:54:44

Re: Plugin development guidance

2014-02-11 Thread Josh Harrison
Great, thanks Jörg! I'll start fiddling around with the langdetect plugin to see if I can get it going with our library. On Tue, Feb 11, 2014 at 1:18 PM, joergpra...@gmail.com < joergpra...@gmail.com> wrote: > An analyzer plugin is the right thing. Adding the recognized/extracted > terms needs a

Re: How to create custom categories for text files indexed in Elasticsearch using the attachment plugin?

2014-02-11 Thread Sagar Singh
Thanks, I guess I should be asking how I would set that metadata during indexing; is there a feature that lets ES do this for me? I saw a way to have custom analyzers and was trying those out to see if I can use them to set the metadata when there is something in the document. On Tuesday, Fe

Re: Plugin development guidance

2014-02-11 Thread joergpra...@gmail.com
An analyzer plugin is the right thing. Adding the recognized/extracted terms needs access to ES mapping service. There are a few plugins out there which work in this manner, for example, the attachment mapper plugin. Or the lang-detect plugin, it adds the recognized language(s) as a keyword code i

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread InquiringMind
Ivan, Thanks so much for the advice. I created the nm-branch (nautical miles branch) off of master, then cloned it to my laptop. Found the two files (DistanceUnit, and DistanceUnitTests). Made the (very simple) changes. Then built ES and ran only the tests in DistanceUnitTests. The build was s

Data Loss

2014-02-11 Thread Mohit Anchlia
I've read some blogs and some email groups where users have indicated they have had data loss. In some cases user is able to recover using the source. I am wondering what are the common reasons this could happen due to ES software issue assuming there are 2+ replicas and multiple nodes available?

Re: Elasticsearch issue.

2014-02-11 Thread Chris
On Tuesday, February 11, 2014 12:44:02 PM UTC-8, Binh Ly wrote: > > If all your other nodes contain enough replicas of all your indexes (i.e. > you have lost no data), then you can safely take down the bad node, wipe > out whatever data is in the data directory (assuming it is local to the > n

Re: How to create custom categories for text files indexed in Elasticsearch using the attachment plugin?

2014-02-11 Thread Binh Ly
Ooops, forgot to mention "store": "yes" is optional. You don't have to store anything if you don't need stored metadata fields. On Tuesday, February 11, 2014 3:57:52 PM UTC-5, Binh Ly wrote: > > You can simply add more fields on the same level as file. For example: > > "mappings": { > "doc"

Re: How to create custom categories for text files indexed in Elasticsearch using the attachment plugin?

2014-02-11 Thread Binh Ly
You can simply add more fields on the same level as file. For example: "mappings": { "doc": { "properties" : { "file" : { "type" : "attachment", "fields": { "file": { "store": "yes" } } }, "meta1": { "type":

Re: Elasticsearch issue.

2014-02-11 Thread Binh Ly
If all your other nodes contain enough replicas of all your indexes (i.e. you have lost no data), then you can safely take down the bad node, wipe out whatever data is in the data directory (assuming it is local to the node) and then join it back to the cluster. If the bad node actually contain

Custom Aggregations

2014-02-11 Thread Justin Uang
Is there any way we can define our own aggregation functions beyond the provided metric and bucket aggregations? Thanks! Justin

Re: Computing idf in elasticsearch

2014-02-11 Thread Ivan Brusic
Oops, obvious answer. :) I see questions about incorrect TFIDF scores and my mind automatically goes to DFS scoring (which is actually about TF, not IDF). -- Ivan On Tue, Feb 11, 2014 at 10:22 AM, Binh Ly wrote: > Also be aware that the log should be a natural log, i.e. the base is e > instea

Re: get MapperParsingException failed to parse in 0.90.10

2014-02-11 Thread Ivan Brusic
That is your template. Use the Get Mapping API to find out what actually is in effect. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-get-mapping.html On Tue, Feb 11, 2014 at 12:17 PM, Stefan Sabolowitsch < sabolowitsc...@in-trier.de> wrote: > Hi Ivan, > thanks f

Re: get MapperParsingException failed to parse in 0.90.10

2014-02-11 Thread Stefan Sabolowitsch
Hi Ivan, thanks for your answer. I use Logstash as the indexer. This is my current mapping: { "template" : "logstash-*", "settings" : { "index.refresh_interval" : "5s", "analysis" : { "analyzer" : { "default" : { "type" : "standard", "stopwords" : "_none_"

Re: completion suggester caching

2014-02-11 Thread Jorge Sanchez
Hi, Twitter typeahead is an autocomplete library which is widely used to implement such features on websites. It can fetch remote suggestions via AJAX, which is how I use it. The AJAX query carries the value the user searched for; in the case below I typed "j" in the search bar. The typeahead

Re: get MapperParsingException failed to parse in 0.90.10

2014-02-11 Thread Ivan Brusic
What is your current mapping? Use the GetMapping API. The file field is an inner object, but you do not have one defined in your mapping. Very likely you already have indexed a document with the file field as another type. -- Ivan On Tue, Feb 11, 2014 at 7:12 AM, Stefan Sabolowitsch < sabolowi

How to create custom categories for text files indexed in Elasticsearch using the attachment plugin?

2014-02-11 Thread Sagar Singh
I have my mappings as such { "attachment": { "properties": { "file": { "type": "attachment", "fields": { "title": { "store": "yes" }, "author": {

Elasticsearch issue.

2014-02-11 Thread Chris
Hi all, I am having an issue with the Cluster. After loading 30+ million records into the system, over the weekend one of the servers (out of 6) ran out of disk space and ever since, I cannot seem to get it back online. Any help would be appreciated. If anyone has any suggestions at all, any he

Re: Trying to optimize configuration for better cluster restart/recovery

2014-02-11 Thread Binh Ly
Tony, What you are seeing with the shard recovery is normal - but doesn't mean it couldn't use more improvement in the future. For now you can throttle the recovery using a combination of settings (but cannot 100% avoid it). Just FYI, there is a reason hashing cannot be done (for now) and this

Re: Computing idf in elasticsearch

2014-02-11 Thread Binh Ly
Also be aware that the log should be a natural log, i.e. the base is e instead of 10. So for example, pulling the first IDF from your results: value: 5.88784 description: idf(docFreq=2, maxDocs=398) idf = 1 + ln(398 / (2 + 1)) = 5.8878397166163280134321081764042
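
The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the formula quoted in the message, with a made-up function name; it is not Elasticsearch or Lucene code.

```python
import math


def idf(doc_freq, max_docs):
    """idf = 1 + ln(maxDocs / (docFreq + 1)), using the natural logarithm."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Values taken from the explain output quoted above: docFreq=2, maxDocs=398.
print(round(idf(2, 398), 5))  # 5.88784

# The same inputs with a base-10 log give a visibly different number,
# which is the mismatch the original question ran into.
print(1.0 + math.log10(398 / (2 + 1)))
```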

Re: Redesigning ES Cluster, questions about optimization

2014-02-11 Thread joergpra...@gmail.com
You write "ES will usually crash" - but how does it crash? Are there messages in the log? Do not use Java 7u51, it may cause trouble, 7u25 is known to be stable. Why do you only use 12G heap if you have 64G RAM on a node? Why do you limit your resources with ES_DIRECT_SIZE? Why do you use 5 shard

Plugin development guidance

2014-02-11 Thread Josh Harrison
Hi all, We've got an internal Java library that allows us to do keyword extraction that seems like a great thing to turn into an integrated elasticsearch function. Ultimately, I want to be able to access the result of this library from search results/etc, but I wanted to do a sanity check to ma

Re: Redesigning ES Cluster, questions about optimization

2014-02-11 Thread Harry Truman
As of the time of this posting: elasticsearch-0.90.9-1 jdk-1.7.0_51 ES_HEAP_SIZE=12g ES_DIRECT_SIZE=20g index.number_of_replicas: 1 Shards: "number_of_nodes" : 2, "number_of_data_nodes" : 2, "active_primary_shards" : 30, "active_shards" : 60, And rather than a block of text, here are th

Re: Discrepancy on processed nodes

2014-02-11 Thread Binh Ly
Yup, When you look closely at the _node results, one node has these characteristics: "attributes" : { "client" : "true", "data" : "false" }, This means the node is a client node and is not a data node. On Tuesday, February 11, 2014 10:06:47 AM UTC-5, David Patiashvi
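
A small sketch of the same check done programmatically. It assumes each node entry in the nodes info carries an "attributes" map with string values, as in the excerpt above; the helper name and the sample nodes are hypothetical.

```python
def is_client_only(node):
    """True when a node is flagged as a client and explicitly holds no data."""
    attributes = node.get("attributes", {})
    return attributes.get("client") == "true" and attributes.get("data") == "false"


# Hypothetical nodes-info excerpt: one Logstash-style client node, one data node.
sample_nodes = {
    "node-1": {"name": "logstash-client", "attributes": {"client": "true", "data": "false"}},
    "node-2": {"name": "es-data-1", "attributes": {}},
}

data_nodes = [n["name"] for n in sample_nodes.values() if not is_client_only(n)]
client_nodes = [n["name"] for n in sample_nodes.values() if is_client_only(n)]
print("data nodes:", data_nodes)
print("client nodes:", client_nodes)
```

With data like this, a cluster reporting two nodes has only one that actually stores shards, which would explain a discrepancy in the processed-node count.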

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread Ivan Brusic
Create a branch for your changes. Submit a PR from the branch and not master. Make sure to update DistanceUnitTests.java as well. The trickiest part is getting the Elasticsearch team to notice your PR. :) They must be super busy with the 1.0 release. Lots of tutorials online: http://gun.io/blog/ho

Re: Trying to optimize configuration for better cluster restart/recovery

2014-02-11 Thread Tony Su
Update: Whereas my previous tries to optimize for recovery failed miserably, the "gateway.recover_after_nodes" setting in elasticsearch.yml worked... To a point. I noticed - No ES node was responsive at all after nodes were brought online until the quorum was met. - It can take a long time for

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread InquiringMind
Working on a pull request... I've created a fork off of master and cloned it to my laptop. (First time using git and GitHub in this way...) Brian

Re: scoring on a multi_field

2014-02-11 Thread Binh Ly
This will be addressed better in the future. For now, you can split/rewrite your multi_match query into a bool query with multiple should clauses where each clause is targeting each field in your multi_match (or query_string) query. This still won't be quite a "sum", but at least it will "combin
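
The rewrite described above could look roughly like the sketch below, which builds a bool query with one should clause per field. The helper name and the field names "title" and "title.raw" are assumptions for illustration; only the bool/should/match structure is taken from the query DSL.

```python
import json


def multi_match_as_bool(text, fields):
    """Build a bool query with one 'match' should-clause per target field."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: text}} for field in fields]
            }
        }
    }


# Hypothetical multi_field with a main field and a ".raw" sub-field.
body = multi_match_as_bool("quick brown fox", ["title", "title.raw"])
print(json.dumps(body, indent=2))
```

Each should clause contributes its own score, so a document matching in several fields ranks higher than one matching in a single field.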

Re: scoring on a multi_field

2014-02-11 Thread Ivan Brusic
Try setting use_dis_max to false in your query. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_multi_field_2 Cheers, Ivan On Tue, Feb 11, 2014 at 1:15 AM, Alexander Ott wrote: > Hallo, > > i have the same problem if i send a query like

Re: One particular value in a field isn't indexed

2014-02-11 Thread Ivan Brusic
Actually in your case, your search terms probably do not need to be analyzed at all since you are not executing full-text searches on that field. Try setting the field as not_analyzed and use a term query (which does not analyze search terms). Better yet, using a term filter since filters are faste

Re: One particular value in a field isn't indexed

2014-02-11 Thread Ivan Brusic
Very rookie problem. :) The default (aka standard) analyzer uses a stopword filter and "it" is a stopword. Try configuring your field with a custom analyzer which does not use stopwords or a custom set of stopwords. Cheers, Ivan On Tue, Feb 11, 2014 at 7:57 AM, wrote: > Hi, > > this might be
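
To see why the value "it" vanishes, here is a toy model of an analyzer with a stopword filter. It is not the real Lucene analyzer, only a lowercase-and-filter sketch; the stopword set is the default English list as I recall it, and the function name is invented.

```python
import re

# Default English stopwords (assumption: matches the list the standard analyzer used here).
ENGLISH_STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
    "their", "then", "there", "these", "they", "this", "to", "was", "will", "with",
}


def toy_analyze(text, stopwords=ENGLISH_STOPWORDS):
    """Lowercase, split on non-word characters, and drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stopwords]


print(toy_analyze("it"))                   # the language code disappears entirely
print(toy_analyze("de"))                   # other codes survive
print(toy_analyze("it", stopwords=set()))  # no stopwords: the value is kept
```

With no tokens left, nothing is indexed for the field, so the document looks as if "lang" were missing.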

Re: Odd hot MVEL

2014-02-11 Thread Nikolas Everett
On Tue, Feb 11, 2014 at 11:26 AM, Ivan Brusic wrote: > Great catch. Which Elasticsearch version and which JDK? > > Thankfully my documents are uniform, so I have been able to skip isEmpty > checks. > I believe I started seeing it on 0.90.6. I'm running 0.90.10 in production now and see it. I v

Re: Odd hot MVEL

2014-02-11 Thread Ivan Brusic
Great catch. Which Elasticsearch version and which JDK? Thankfully my documents are uniform, so I have been able to skip isEmpty checks. -- Ivan On Tue, Feb 11, 2014 at 7:52 AM, Nikolas Everett wrote: > Sorry to resurrect a dead thread, but I figured it out: > https://github.com/elasticsearc

Re: Setting up elasticsearch to scale: shards per index

2014-02-11 Thread joergpra...@gmail.com
Three master nodes are enough, for as many data nodes as you wish to add. You can search this mailing list for discussions where kimchy explained the "dedicated master nodes", and how it fits for split-brain situations For example https://groups.google.com/forum/#!topic/elasticsearch/dxjpMd4vNXQ

One particular value in a field isn't indexed

2014-02-11 Thread felix . kofink
Hi, this might be a rookie problem since I'm very new to elasticsearch. I'm trying to put JSON documents into elasticsearch with a field "lang". However if "lang" is set to "it" elasticsearch doesn't seem to recognize the field since it's only returned when I filter for missing fields. The prob

[ANN] elasticsearch-transport-thrift plugin 1.8.0 and 2.0.0.RC2

2014-02-11 Thread David Pilato
Heya, We just released elasticsearch-transport-thrift plugin 1.8.0 for elasticsearch 0.90.10 (and >) and 2.0.0.RC2 for elasticsearch 1.0.0.RC1 (and >): https://github.com/elasticsearch/elasticsearch-transport-thrift Issue fixed in both branches:  https://github.com/elasticsearch/elasticsearch

Re: Odd hot MVEL

2014-02-11 Thread Nikolas Everett
Sorry to resurrect a dead thread, but I figured it out: https://github.com/elasticsearch/elasticsearch/issues/5086 High level: 1. Hit ~1 million documents with a script score. 2. Do something like (doc['foo'].empty ? 0 : doc['foo'].value) * doc['bar']. The .empty is the key here. 3. If most of the

Re: elastic result search by text with nearest distance using latitude and longitude

2014-02-11 Thread Binh Ly
I'm curious, can you show the results (lon, lat values) of the first 25 matches?

Re: Elasticsearch swapping

2014-02-11 Thread Vahid
Hi Alex, thank you. I've run the command and also it shows that mlockall is set to true ! On Tuesday, February 11, 2014 2:33:49 PM UTC+1, Alexander Reelsen wrote: > > Hey, > > with recent elasticsearch versions (including newer 0.90), you can see if > bootstrap.mlockall setting is really applie

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread InquiringMind
Alex, I created issue https://github.com/elasticsearch/elasticsearch/issues/5085 I don't use GitHub that much, and I kinda muffed the issue, so I'll let someone else add the one enumeration to wherever it should best go: NAUTICALMILES(1852.0, "nm", "nmi"), Thanks! Brian Brian On Tuesday, Fe

Re: ElasticSearch server lock up

2014-02-11 Thread simonw
Those commits look good to me! I'd be super curious what you see in the logs, especially coming from this: logger.warn("Searcher was released twice", new ElasticSearchIllegalStateException("Double release")); Does the bulk index contain a delete by query or similar? simon On Tuesday, February 11,

Re: Faceted search using RDF-triple like related documents

2014-02-11 Thread joergpra...@gmail.com
It's the semantic web. For inference, see http://www.w3.org/standards/semanticweb/inference Materialization is the pre-computation and storage of inferred triples http://www.w3.org/wiki/LargeTripleStores In fact, I use JSON-LD, which is convenient for both storing triples and loading them for se

get MapperParsingException failed to parse in 0.90.10

2014-02-11 Thread Stefan Sabolowitsch
Hi all, get MapperParsingException failed to parse in 0.90.10 [2014-02-11 16:05:09,402][DEBUG][action.bulk ] [Thunderbolt] [logstash-2014.02.11][4] failed to execute bulk item (index) index {[logstash-2014.02.11][suricata][deuCC2bkRvehNSA62tuuHw], source[{"tags":["suricata"],"@vers

Re: Computing idf in elasticsearch

2014-02-11 Thread Ivan Brusic
TF and IDF are calculated by shard, not per index, so the aggregated explanation might not have the exact numbers. Try changing your search type to a distributed one for more accurate results: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#dfs-
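
The effect of per-shard statistics can be sketched with the idf formula discussed in this thread. The shard sizes and document frequencies below are invented for illustration, and the function name is not an Elasticsearch API.

```python
import math


def idf(doc_freq, max_docs):
    """idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Hypothetical (docFreq, maxDocs) pairs for the same term on two shards.
shard_stats = [(2, 100), (40, 120)]

# Without DFS, each shard scores with its own local statistics.
per_shard = [idf(doc_freq, max_docs) for doc_freq, max_docs in shard_stats]

# With a DFS search type, statistics are gathered across shards first.
global_idf = idf(sum(df for df, _ in shard_stats), sum(n for _, n in shard_stats))

print("per-shard idf:", per_shard)
print("global idf:", global_idf)
```

The same term gets a different idf on each shard, so scores for equally good matches can differ depending on where the documents live; the distributed search type trades an extra round trip for consistent numbers.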

Re: Faceted search using RDF-triple like related documents

2014-02-11 Thread Peter Oostwoud-Sibiryak
Hi Jörg, The assumptions you've made on my use case are correct. The nightly update could definitely work, but I think even live updates could work as the data is quite static in nature. A few more questions: * You're talking about recreation of the index, with this you mean update I presume?;

Re: Discrepancy on processed nodes

2014-02-11 Thread David Patiashvili
I just realized: the second node is just Logstash. I have just updated, and now there is a single node ... On Wednesday, 5 February 2014 17:20:05 UTC+1, David Patiashvili wrote: > > Where can I find the other node? > > On Wednesday, 5 February 2014 17:17:51 UTC+1, Tony Su wrote: >> >> Hi David, >> Your lat

Re: Faceted search using RDF-triple like related documents

2014-02-11 Thread joergpra...@gmail.com
I'm not sure, and I try hard to understand your use case. I assume you want a single query that can filter attributes of both the entity "1" and for attributes of related entities "2" and "3". As you have noticed, in a single query, this is not possible unless you had "bubbled up" the relevant at

Re: Setting up elasticsearch to scale: shards per index

2014-02-11 Thread Matt Fulton
Hey thanks for clarifying! I actually ended up setting it up as 1x master-only, 2x master-eligible data-nodes, realizing that I would need 3 eligible masters while putting it all together. On the heap problems, could you be more specific about what you are referring to, or maybe point me towards a

Computing idf in elasticsearch

2014-02-11 Thread sunayana choudhary
Hi all, I have been analysing Elasticsearch results with explain:true, and I am not able to understand what technique has been applied to calculate idf. I went through the Lucene scoring formula, i.e. idf(t) = 1 + log(numDocs / (docFreq + 1)), but it does not match my results. Following is

Check new node module.

2014-02-11 Thread Max Max
Hi, I've just published a new library for working with Elasticsearch in Node. Check it out: https://www.npmjs.org/package/baio-es Suggestions and bug reports are appreciated. Thanks.

Re: Stuck with searching on nested documents

2014-02-11 Thread Binh Ly
I'd probably just simplify and eliminate all the nested stuff. So for example, if your document is like this: { "name": "Shirt1", "color": ["Red"], "size": ["XL", "S", "M"] } It's easy to execute queries like this: { "query": { "filtered": { "filter": { "bool": {

Re: Date Histogram Facet with custom bucketing

2014-02-11 Thread Alexander Reelsen
Hey, you can use the range facet, as it also supports dates and provide your own custom date ranges there maybe? See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-range-facet.html --Alex On Tue, Feb 11, 2014 at 2:36 PM, mooky wrote: > Bump? > > > On Mond

Re: ES sorting not working when use min_score in query

2014-02-11 Thread Alexander Reelsen
Hey, can you provide a minimal example using curl, so people can reproduce it using commandline tools and do not need a programming language and its environment? See http://www.elasticsearch.org/help A quick test was not revealing any suspicious behaviour to me, but maybe you are doing something

Re: Marvel and basic_auth

2014-02-11 Thread Al Smith
Works great - thanks for the quick turnaround! Regards, Al. Original message From: Boaz Leskes To: elasticsearch@googlegroups.com Cc: Date: 04/02/2014 16:59:00 Subject: Re: Marvel and basic_auth Hey Al, We just release marvel 1.0.2, which contains support for basic auth for the da

Re: discription about explain api

2014-02-11 Thread Alexander Reelsen
Hey, the description about tf/idf similarity in the lucene javadocs might help here: https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html --Alex On Tue, Feb 11, 2014 at 11:51 AM, Navneet Mathpal < navneetmathpa...@gmail.com> wrote: > hi , > > I

Re: Date Histogram Facet with custom bucketing

2014-02-11 Thread mooky
Bump? On Monday, 3 February 2014 16:33:02 UTC, mooky wrote: > > Hi All, > > I need a facet that does exactly what Date Histogram Facet does, but I > need a different (complicated) "monthly" bucketing. > > Rather than the start/end of every month, I need a monthly period that > begins/ends on "th

Re: Faceted search using RDF-triple like related documents

2014-02-11 Thread Peter Oostwoud-Sibiryak
Hi Jörg, Thank you for your answer. Lots of new stuff in there though which will require some studying to understand :) ! JSON-LD seems like an excellent addition to JSON which could actually mean some competition for graph databases?! I've tried to setup the following simple 2 dispenser, 2 disp

Re: Elasticsearch swapping

2014-02-11 Thread Alexander Reelsen
Hey, with recent elasticsearch versions (including newer 0.90), you can see if the bootstrap.mlockall setting is really applied in the nodes info, so make sure that setting it was really successful. curl -XGET 'http://localhost:9200/_nodes' and search for mlockall, which must be set to true. --Alex O
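
Instead of searching the output by eye, the check can be scripted. The sketch below assumes the parsed nodes info has the shape nodes -> node id -> "process" -> "mlockall"; that layout, the helper name, and the sample data are assumptions to verify against your own _nodes response.

```python
def nodes_missing_mlockall(nodes_info):
    """Return the names of nodes whose process info does not report mlockall as true."""
    missing = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        if node.get("process", {}).get("mlockall") is not True:
            missing.append(node.get("name", node_id))
    return sorted(missing)


# Hypothetical, trimmed nodes-info response.
sample_info = {
    "nodes": {
        "abc": {"name": "es-1", "process": {"mlockall": True}},
        "def": {"name": "es-2", "process": {"mlockall": False}},
    }
}

print(nodes_missing_mlockall(sample_info))
```

An empty list would mean every node reports the memory lock as applied.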

Re: Highlight the matched word in elasticsearch

2014-02-11 Thread Alexander Reelsen
Hey, please see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-highlighting.html You can just add a 'highlight' field to your JSON document, which specifies the fields to highlight on. Also, don't use explain in production, only for debugging purposes. --Ale

Re: How to search an email.

2014-02-11 Thread Binh Ly
Alternatively, if you want to preserve email address and web urls, you can use the uax_url_email tokenizer and then term and match queries should work without any problems.

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-11 Thread Alexander Reelsen
Hey, never thought about such a use-case, but it sounds useful. Feel free to create an issue, and even better, a pull request to add that functionality to DistanceUnit --Alex On Tue, Feb 11, 2014 at 12:54 AM, Raffaele Sena wrote: > One nautical mile is one minute of arc along the meridian li

Re: Threadpool "caller" reject policy

2014-02-11 Thread Alexander Reelsen
Hey, you can use the nodes statistics to find out which thread pool contains all those rejected tasks, before trying to tune. If it is the search thread pool, you can try to increase the size or the queue, alternatively add another node to scale out your search load or try to improve your queries.
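To make the first step concrete, here is a small sketch that totals rejected tasks per thread pool from an already parsed nodes statistics response. It assumes each node entry carries a thread_pool section whose pools expose a rejected counter; the function name and the sample numbers are invented for illustration.

```python
def rejected_by_pool(nodes_stats):
    """Sum the 'rejected' counters per thread pool across all nodes."""
    totals = {}
    for node in nodes_stats.get("nodes", {}).values():
        # Assumed layout: nodes -> <node id> -> thread_pool -> <pool> -> rejected
        for pool, stats in node.get("thread_pool", {}).items():
            totals[pool] = totals.get(pool, 0) + stats.get("rejected", 0)
    # Only pools that actually rejected something are interesting here.
    return {pool: count for pool, count in totals.items() if count > 0}


# Hypothetical, trimmed-down nodes statistics response (not real output).
sample_nodes_stats = {
    "nodes": {
        "id-1": {"thread_pool": {"search": {"rejected": 12}, "index": {"rejected": 0}}},
        "id-2": {"thread_pool": {"search": {"rejected": 3}, "bulk": {"rejected": 0}}},
    }
}

print(rejected_by_pool(sample_nodes_stats))
```

Whichever pool shows up in the result is the one worth tuning or scaling first.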

Re: How to search an email.

2014-02-11 Thread Tiago Rodrigues
I have found the answer and I wish to share it. I just need to pass 'type' : 'phrase' on the match query, or just use the 'match_phrase' query.
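The two variants mentioned in this message might look like the following request bodies. This is a sketch only: the field name (email) and the address are placeholders, not values taken from the thread.

```python
import json


def phrase_queries(field, text):
    """Return two search bodies: match with type 'phrase', and match_phrase."""
    match_with_type = {
        "query": {"match": {field: {"query": text, "type": "phrase"}}}
    }
    match_phrase = {"query": {"match_phrase": {field: text}}}
    return match_with_type, match_phrase


# Placeholder field name and address, used only for illustration.
with_type, phrase = phrase_queries("email", "someone@example.com")
print(json.dumps(with_type))
print(json.dumps(phrase))
```

Both bodies ask for the analyzed terms to appear next to each other in order, which is why the fragments of the address no longer match on their own.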

Re: Import CSV file

2014-02-11 Thread joergpra...@gmail.com
Read the curl man page for sending a file with @ Jörg

Re: Import CSV file

2014-02-11 Thread Bert Cauwelier
Is the @ important? When I use the command you sent, I get: # curl -XPOST http://localhost:9200/_river/my_csv_river/_meta --data-binary @insertrivercsv.json Warning: Couldn't read data from file "insertrivercsv.json", this makes an Warning: empty POST. {"error":"MapperParsingException[failed to

Re: completion suggester caching

2014-02-11 Thread Alexander Reelsen
Hey, most likely I didn't get your use-case. But something like twitter typeahead would return a twitter handle as output, which is unique? Or something like "@spinscale | Alexander Reelsen", which is unique as well. Not sure what kind of output you want to return which is not unique. If you are se

Re: Experiencing very high CPU usage on 1.0.0 RC2 - preventing _nodes/stats returning in sensible time

2014-02-11 Thread Alexander Reelsen
Hey, maybe it is possible to exclude the segment statistics (if you do not need them) and not run into that performance problem as a quick hack... --Alex On Mon, Feb 10, 2014 at 6:54 PM, joergpra...@gmail.com < joergpra...@gmail.com> wrote: > Interesting fact is that your stacktraces point to

Re: Import CSV file

2014-02-11 Thread joergpra...@gmail.com
If you use curl, then check the parameters for sending valid input. You certainly want something like this curl -XPOST http://localhost:9200/_river/my_csv_river/_meta --data-binary @insertrivercsv.json Jörg
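With --data-binary @file, curl reads the named file and sends its bytes as the request body. A rough Python equivalent is sketched below, assuming the same URL and file name as in the command above. It only builds the request object and never sends it; the helper name and the temporary file contents are made up for the example.

```python
import os
import tempfile
import urllib.request

RIVER_META_URL = "http://localhost:9200/_river/my_csv_river/_meta"


def build_river_request(path, url=RIVER_META_URL):
    """Read the file as raw bytes and wrap it in a POST request (not sent)."""
    # open() raises FileNotFoundError if the path is wrong, so a missing
    # file cannot silently turn into an empty POST body.
    with open(path, "rb") as fh:
        body = fh.read()
    return urllib.request.Request(url, data=body, method="POST")


# Write a stand-in file with placeholder content, then build the request.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "insertrivercsv.json")
    with open(path, "w") as fh:
        fh.write('{"type": "csv"}')
    req = build_river_request(path)

print(req.get_method(), req.full_url, len(req.data))
```

Passing the request to urllib.request.urlopen would perform the actual POST against a running node.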

How to search an email.

2014-02-11 Thread Tiago Rodrigues
Hello to all, I need a query to search for an email. I tried using the term query and the match query, but I did not get the results I expected. For example, I tried to search for 'ti...@email.com', match query gives me the results that have 'tiago' ('ti...@hotmail.com', 'ti...@gmail.com') or ha

Re: ElasticSearch server lock up

2014-02-11 Thread Jay Modi
Simon, prior to this post I decided to try a custom build from the 0.90.11 tag with the following commits cherry picked onto my branch: ad1097f1ba109d6cb235ba541251ba63abb27c16 b4ec18814b3eeb35d948c01abec3e04745f57458 93e9d2146e77f6c0523875b93c768ab7f81cfe04 6056d2cb5553e6277b023e6860739847fbd95bb

Import CSV file

2014-02-11 Thread Bert Cauwelier
Hello, I am new to Elasticsearch and I am trying to import a CSV file. Can anyone help? An example CSV file: name comment me hello JSON file: { "type" : "csv", "csv_file" : { "folder" : "~/test", "filename_mask" : ".*\\.csv$", "poll":"5m", "f

Corrupted ElasticSearch index ?

2014-02-11 Thread bizzorama
Hi, I've noticed a very disturbing ElasticSearch behaviour ... my environment is: 1 logstash (1.3.2) (+ redis to store some data) + 1 elasticsearch (0.90.10) + kibana, which processes about 7 000 000 records per day. Everything worked fine on our test environment, until we ran some tests for a l

ES sorting not working when use min_score in query

2014-02-11 Thread Vallabh Bothre
Hello Friends, When I use "min_score" in my query, sorting stops working. Also, min_score displays only 10 results? Below is my code, $result = $es->search(array( "query" => array( "dis_max" => array( "queries" => array( 0 => array( "field" => array( "title" => $search ) )
