Re: 3 Node Cluster With Nodes Out of Sync

2014-11-25 Thread Itamar Syn-Hershko
minimum_master_nodes still doesn't protect you from all possible failure scenarios, see http://aphyr.com/posts/317-call-me-maybe-elasticsearch What version are you running? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer &
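For context, the value usually recommended for minimum_master_nodes is a strict majority of the master-eligible nodes. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and, as the post says, a correct value still does not cover every failure scenario.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Strict majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# For a 3-node cluster where every node is master-eligible:
print("discovery.zen.minimum_master_nodes:", quorum(3))
```

The printed value is what would go into elasticsearch.yml on each node.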

Re: Performance issue while indexing lot of documents

2014-11-06 Thread Itamar Syn-Hershko
It may be worth looking at 2 things: 1. Using the latest Elasticsearch version (1.4). A lot of work went into optimizing those types of scenarios on the server side. 2. Disabling refresh / flush - I assume this is an ETL process and as such this could greatly help. -- Itamar Syn-Hershko http://code972
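As a rough illustration of the refresh part of point 2, the sketch below only builds the request bodies for the index settings API; it does not send anything. The index name is hypothetical, and the flush side is not covered here.

```python
import json


def refresh_settings(interval: str) -> str:
    """Body for PUT /<index>/_settings that sets index.refresh_interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


index_name = "my_index"  # hypothetical index name
path = "/%s/_settings" % index_name

before_load = refresh_settings("-1")  # -1 turns periodic refresh off
after_load = refresh_settings("1s")   # back to a 1 second interval afterwards

print("PUT", path, before_load)
print("PUT", path, after_load)
```

The idea is to send the first body before the bulk load starts and the second once it has finished.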

Re: ES-to-ES river?

2014-10-21 Thread Itamar Syn-Hershko
I personally recommend https://github.com/elasticsearch/stream2es -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Tue, Oct 21, 2014 at 3

Re: Enabling doc_values for _timestamp and _parent fields

2014-10-21 Thread Itamar Syn-Hershko
Yes -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Tue, Oct 21, 2014 at 10:47 AM, Costya Regev wrote: > Hi, > > It's not cle

Re: Hot backup strategy for Elasticsearch

2014-10-15 Thread Itamar Syn-Hershko
Incremental. See http://www.elasticsearch.org/blog/introducing-snapshot-restore/ -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Wed, Oct 15,

Re: NotFilter dude

2014-10-15 Thread Itamar Syn-Hershko
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-not-filter.html You should probably switch to a bool filter with a should clause instead of an and filter -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Fre

Re: Hot backup strategy for Elasticsearch

2014-10-15 Thread Itamar Syn-Hershko
No - you should definitely use snapshot and restore, as it's the most stable and efficient way to do backups there is. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http:

Re: running on EC2 S3 vs EBS

2014-10-13 Thread Itamar Syn-Hershko
Yes, you don't want to use anything other than local storage for Elasticsearch. Not EBS and definitely not S3. You can use the snapshot/restore API to continuously back up to S3 and get all the data protection you need. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twi
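A minimal sketch of the two snapshot/restore requests involved, assuming the AWS cloud plugin is installed so that the "s3" repository type is available. The repository, bucket and snapshot names are hypothetical, and the code only builds the requests rather than sending them.

```python
import json

REPO = "s3_backup"        # hypothetical repository name
BUCKET = "my-es-backups"  # hypothetical S3 bucket name


def register_repo_request(repo, bucket):
    """Request that registers an S3-backed snapshot repository."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return "PUT", "/_snapshot/%s" % repo, json.dumps(body)


def snapshot_request(repo, snapshot):
    """Request that takes a snapshot and waits until it completes."""
    path = "/_snapshot/%s/%s?wait_for_completion=true" % (repo, snapshot)
    return "PUT", path, ""


for method, path, body in (
    register_repo_request(REPO, BUCKET),
    snapshot_request(REPO, "snapshot_1"),
):
    print(method, path, body)
```

Snapshots are incremental, so running the second request on a schedule with a new snapshot name each time is one way to get continuous backups.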

Re: checking update results

2014-10-01 Thread Itamar Syn-Hershko
HTTP status codes are used to communicate errors, for example a runtime error would return HTTP status 500 -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhers

Re: Percolator Requests

2014-09-30 Thread Itamar Syn-Hershko
Yes, they are. Pretty much like any search request with Elasticsearch. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Tue, Sep 30, 2014 at 1

Re: Elasticsearch.Net, strange deserialization after .Index

2014-09-29 Thread Itamar Syn-Hershko
This is probably a bug of the .NET client API, and you should log it on github where they monitor issues for it You might find this alternative library useful: https://github.com/synhershko/NElasticsearch available from nuget as well https://www.nuget.org/packages/NElasticsearch/ -- Itamar Syn

Re: Percolator and Couchbase

2014-09-29 Thread Itamar Syn-Hershko
cess, it seems to me that would be the solution for you anyway. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Sep 29, 2014 at 8:13 AM, CB

Re: Logstash vs Elasticsearch

2014-09-24 Thread Itamar Syn-Hershko
Mainly the ability to parse log texts easily and restructure them as JSON ("grok"). There's no "better" way really; if your app knows what it's doing, uses batching and has proper error recovery, it's probably better to have it that way -- Itamar Syn-Hershko ht

Re: Sorting a random set of documents.

2014-09-22 Thread Itamar Syn-Hershko
No, it is just that sort is overriding the random_score. I would say just do the sorting on the client side - while it is possible to do still on ES (using scripted fields for example) it will just get too complicated. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.
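Client-side sorting could look roughly like the sketch below. The hits list is a hand-written stand-in for what a random_score search would return, and the "price" field is hypothetical.

```python
# Hand-written sample standing in for the "hits" of a randomly scored search.
hits = [
    {"_id": str(i), "_source": {"price": price}}
    for i, price in enumerate([30, 10, 20, 50, 40])
]


def sort_hits(hits, field, reverse=False):
    """Sort an already-fetched random sample by a field of _source."""
    return sorted(hits, key=lambda hit: hit["_source"][field], reverse=reverse)


for hit in sort_hits(hits, "price"):
    print(hit["_id"], hit["_source"]["price"])
```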

Re: How elasticsearch.yml are coordinated between cluster nodes?

2014-09-21 Thread Itamar Syn-Hershko
) can only be changed via manually editing elasticsearch.yml. My advice to you would be to use Puppet, Chef or any other configuration and deployment management tools to avoid unmoderated changes to elasticsearch.yml -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/syn

Re: Limiting pagination?

2014-09-18 Thread Itamar Syn-Hershko
No, but since you should never expose your clusters to end users directly you could always impose this limit on the software facade that connects between your application and the cluster -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Dev
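A sketch of what such a limit in the facade might look like. The limits and the function name are assumptions chosen for illustration, not Elasticsearch settings.

```python
MAX_SIZE = 100      # assumed largest page size the facade allows
MAX_OFFSET = 10000  # assumed deepest offset the facade allows


def clamp_paging(offset, size):
    """Clamp user-supplied paging values before they reach the cluster."""
    offset = max(0, min(int(offset), MAX_OFFSET))
    size = max(1, min(int(size), MAX_SIZE))
    return {"from": offset, "size": size}


# A request asking for a very deep, very large page gets cut down:
print(clamp_paging(5000000, 5000))
```

The returned dict can be merged into the search body the facade sends on.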

Re: how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Itamar Syn-Hershko
This will still mean less overhead than having those distinct fields in discrete indexes. I wouldn't worry about that. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <

Re: how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Itamar Syn-Hershko
? Maybe look at your data model and try to re-arrange it. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> -- You received this message because yo

Re: COUCHBASE + ELASTIC Parent/child mapping

2014-09-14 Thread Itamar Syn-Hershko
Seems like it will in the next version (looking at the couchbase elastic transport plugin commits) -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/>

Re: Elasticsearch bad indexing timing

2014-09-14 Thread Itamar Syn-Hershko
Sure thing -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Sun, Sep 14, 2014 at 7:19 PM, Niv Penso wrote: > Amazing answer helped me so

Re: Elasticsearch.net client, endpoint strategy?

2014-09-14 Thread Itamar Syn-Hershko
ent manner using the cluster meta-data it has stored locally. This is by design and has many optimizations in place as well, and also allows you as a client to use round-robin or make requests to a load balancer. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.c

Re: Elasticsearch bad indexing timing

2014-09-14 Thread Itamar Syn-Hershko
servers instead of having them on one server ("virtual shards"). This will help fan out the indexing load. 5. If you don't specify the document IDs yourself, make sure you use the latest ES, there's a significant improvement there in the ID generation mechanism which could help

Re: Do I need a plugin to search geographically

2014-09-08 Thread Itamar Syn-Hershko
yes, that'd be wise to do -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Sep 8, 2014 at 4:22 PM, James wrote: > Ah cool ok.

Re: Do I need a plugin to search geographically

2014-09-08 Thread Itamar Syn-Hershko
1000 entries is a very small set. If you can have this local to your code that would be best. Otherwise yes a 2-phase query is probably your best bet - note you could also use the suggesters to improve speed -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhers

Re: Do I need a plugin to search geographically

2014-09-08 Thread Itamar Syn-Hershko
This is just a key/value lookup. If it's not too big I'd just hold it all in memory. Otherwise of course you can use Elasticsearch for that. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of Rav

Re: Changing tokenizer

2014-09-03 Thread Itamar Syn-Hershko
You will need to reindex to another index (as this change of mappings isn't backwards compatible) or to a field with a different name. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB

Available for contracts & consultancy in CA / WA during Oct-Nov

2014-09-03 Thread Itamar Syn-Hershko
rom you as I'll be available for short time contracts and consultancy gigs. More details: http://code972.com/elasticsearch-consulting-and-development-services Thanks, -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & C

Re: Do I need a plugin to search geographically

2014-09-03 Thread Itamar Syn-Hershko
Yes -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Thu, Sep 4, 2014 at 12:49 AM, Employ wrote: > Thank you. And no plugin is re

Re: Looking for Elasticsearch projects

2014-09-03 Thread Itamar Syn-Hershko
Well, this response is also public :) I'll ping you sometime next week with more details, juggling with too many things currently. Would definitely love to have an extra set of eyes. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelanc

Re: moustache search templates are too limited

2014-09-03 Thread Itamar Syn-Hershko
summarizes all this very well: http://stackoverflow.com/a/15041966/135701 -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Wed, Sep 3, 2014 at 12:33

Re: Do I need a plugin to search geographically

2014-09-03 Thread Itamar Syn-Hershko
You don't need any external geo-data, see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geo-distance-filter.html You only need external data sources if you want to give some coordinates / polygons names (like countries, neighborhoods etc) -- Itamar Syn-He
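For illustration, a search body using the geo distance filter from the linked page might be built as below. The field name and coordinates are hypothetical, and the field is assumed to be mapped as geo_point.

```python
import json


def geo_distance_search(field, lat, lon, distance):
    """Search body: match everything within `distance` of (lat, lon)."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


body = geo_distance_search("location", 51.5, -0.12, "10km")
print(json.dumps(body, indent=2))
```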

Re: Looking for Elasticsearch projects

2014-09-03 Thread Itamar Syn-Hershko
't think Jorg actually meant that..). I was involved with https://lucene.apache.org/openrelevance/ but it's now discontinued, and in what spare time I have I'm trying to take that initiative forward. Ping me privately if that sounds interesting and we can continue discussing. -- Itamar Syn-Hershko http:/

Re: Stop words and Keyword tokenizer

2014-08-28 Thread Itamar Syn-Hershko
- Yumbo <-- synonyms to Mulaló (multiple tokens at the same position) etc And then you would use a tokenizer normally (and tokenize on commas, for example) Then you still lose the full-text search capabilities but in exchange for more precision (and more setup work on your part) -- Itamar Syn-Hers

Re: Stop words and Keyword tokenizer

2014-08-28 Thread Itamar Syn-Hershko
Take a look at suggesters - they are meant for that plus they are more performant! http://www.elasticsearch.org/blog/you-complete-me/ -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Ac

Re: Stop words and Keyword tokenizer

2014-08-28 Thread Itamar Syn-Hershko
What would be the use case for such a process (removing stop words without tokenization)? This may be a good read btw: http://www.elasticsearch.org/blog/stop-stopping-stop-words-a-look-at-common-terms-query/ -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhers

Re: What the heck is this search?? :)

2014-08-21 Thread Itamar Syn-Hershko
I'm going to bet on Head. Disable it and see what happens. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Thu, Aug 21, 2014 at 7:22

Re: What the heck is this search?? :)

2014-08-20 Thread Itamar Syn-Hershko
to an ES index? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Wed, Aug 20, 2014 at 11:53 PM, Ivan Brusic wrote: > Very strange query ind

Re: What the heck is this search?? :)

2014-08-20 Thread Itamar Syn-Hershko
un on a decent sized installation. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Wed, Aug 20, 2014 at 10:14 PM, Chris Neal wrote: > Hi guys, &

Re: Ability to filter the results with a configurable amount of lines around an event

2014-08-11 Thread Itamar Syn-Hershko
You can do this using the timestamp (with a range, and grow it if necessary) or if you have a serial ID of some sort on the log message you can do a range query on that -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consult
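A rough sketch of the timestamp variant: build a range query for a window around the event and widen the window if it returns too few lines. The field name "@timestamp" is an assumption borrowed from common Logstash setups.

```python
from datetime import datetime, timedelta


def window_query(event_time, seconds, field="@timestamp"):
    """Range query covering `seconds` before and after an event."""
    delta = timedelta(seconds=seconds)
    return {
        "query": {
            "range": {
                field: {
                    "gte": (event_time - delta).isoformat(),
                    "lte": (event_time + delta).isoformat(),
                }
            }
        },
        "sort": [{field: {"order": "asc"}}],
    }


event = datetime(2014, 8, 11, 12, 0, 0)
print(window_query(event, 30))
```

If the first window does not contain enough surrounding lines, calling window_query again with a larger `seconds` value grows the range.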

Re: 3rd party scoring service

2014-07-31 Thread Itamar Syn-Hershko
You should bring the price over to Elasticsearch and not the other way around. Scoring against an external service is an added friction with huge performance costs. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant

Re: Hyper-Threading

2014-07-20 Thread Itamar Syn-Hershko
It would also depend on the caching abilities and on the currently parallelized threads not sharing a resource. In the end, like with all Computer Science stuff, the answer is "it depends" :) -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhersh

Re: Delete by a Filed value

2014-07-20 Thread Itamar Syn-Hershko
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/>
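As a sketch of what the linked delete-by-query API takes in the 1.x line, the snippet below builds a request that deletes every document whose field holds a given exact term. The index, type, field and value are hypothetical, and nothing is sent.

```python
import json


def delete_by_query_request(index, doc_type, field, value):
    """Request deleting all docs in index/type where `field` has the term `value`."""
    body = {"query": {"term": {field: value}}}
    return "DELETE", "/%s/%s/_query" % (index, doc_type), json.dumps(body)


method, path, body = delete_by_query_request("library", "book", "author_id", "42")
print(method, path, body)
```

A term query matches indexed terms exactly, so this works best on not_analyzed or numeric fields.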

Re: Getting Totals from a Terms Aggregation

2014-07-16 Thread Itamar Syn-Hershko
Aggregations operate on the results of a search query, so you can definitely use that total also when you have sub-aggregations. As for filter aggregations, you can have a subtract which acts as a sink for all unused docs and subtract its count from the total count -- Itamar Syn-Hershko

Re: Getting Totals from a Terms Aggregation

2014-07-16 Thread Itamar Syn-Hershko
You get it from the search request that wraps the terms aggregation, under total hits in the root of the response -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.c
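To make that concrete, here is a sketch that reads the total from a hand-written sample response in the 1.x shape. The aggregation name and numbers are invented for illustration.

```python
# Hand-written sample response; only the parts used below are included.
response = {
    "hits": {"total": 10, "hits": []},
    "aggregations": {
        "by_author": {
            "buckets": [
                {"key": "alice", "doc_count": 4},
                {"key": "bob", "doc_count": 3},
            ]
        }
    },
}

total = response["hits"]["total"]
buckets = response["aggregations"]["by_author"]["buckets"]
in_buckets = sum(bucket["doc_count"] for bucket in buckets)

# Docs matched by the query but not counted in any returned bucket.
remainder = total - in_buckets
print(total, in_buckets, remainder)
```

The subtraction only gives a meaningful remainder when each document falls into at most one bucket, which is not the case for multi-valued fields.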

Re: Best practice architecture using ES

2014-07-14 Thread Itamar Syn-Hershko
your server side facade -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Jul 14, 2014 at 6:21 PM, Danny Lieberman wrote: > Itamar >

Re: Best practice architecture using ES

2014-07-14 Thread Itamar Syn-Hershko
I would strongly suggest against that. Never expose ES to the public, always put it behind a server facade. To get a glimpse of what you are exposing yourself to, see this recent blog post http://www.elasticsearch.org/blog/scripting-security/ -- Itamar Syn-Hershko http://code972.com | @synhershko

Re: Suspected bug with Panel.parameters.length when there # samples than length

2014-07-14 Thread Itamar Syn-Hershko
You don't. You either pull a small page of data to display (match_all query or any filtering query), or ask ES to aggregate the data for you and get back the metrics or buckets. You can also do both at the same time. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twi

Re: Keep the number of segments to 5

2014-07-14 Thread Itamar Syn-Hershko
ly so, but definitely not to 230GB on a 16GB server even when there's no aggregation involved. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon,

Re: Best practice architecture using ES

2014-07-14 Thread Itamar Syn-Hershko
uery types. One piece of advice I can give you is to try and avoid introducing too much friction, like duplicating the model too many times (DTO, DAO etc). If you can use the same structure for display as you use for indexing in ES, use that. HTH -- Itamar Syn-Hershko http://code972.com | @synhers

Re: longs being shown as strings

2014-07-13 Thread Itamar Syn-Hershko
What you are seeing is the Lucene numeric field terms. Try having the applicationid field as a not_analyzed string field for the purpose of faceting, instead of having it as a numeric field (which is usable for range queries or sorts) -- Itamar Syn-Hershko http://code972.com | @synhershko <ht
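A sketch of such a mapping in the 1.x syntax. The type name is hypothetical; an existing numeric field cannot simply be switched to a string, so this assumes a new index or a differently named field.

```python
import json


def not_analyzed_string_mapping(doc_type, field):
    """Put-mapping body declaring `field` as a not_analyzed string."""
    return {
        doc_type: {
            "properties": {
                field: {"type": "string", "index": "not_analyzed"}
            }
        }
    }


body = not_analyzed_string_mapping("event", "applicationid")
print("PUT /my_index/_mapping/event", json.dumps(body))
```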

Re: Keep the number of segments to 5

2014-07-13 Thread Itamar Syn-Hershko
ot of aggregation operations (aka faceting) You probably can find ways to fine tune and squeeze more performance out of what you currently have (again - using filters, codecs and other advanced configs) but it's probably just wiser to scale out -- Itamar Syn-Hershko http://code972.com | @synhers

Re: Keep the number of segments to 5

2014-07-13 Thread Itamar Syn-Hershko
erver isn't cool, especially if you use aggregations), look into codecs and much more. There's no need for you to look into segments, especially since if this is a live index which is being written to there's a large cost (CPU, IO and GC) associated with merging segments -- Itamar Syn-He

Re: Keep the number of segments to 5

2014-07-13 Thread Itamar Syn-Hershko
How did you arrive at this number of 5? To begin with, what sizes are your shards? What are the specs of your servers? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http:

Re: Inter-document Queries

2014-07-07 Thread Itamar Syn-Hershko
onse=401. But this will not distinguish between A->B->C and >> B->A->C. Perhaps I could use the script filter for the "last mile" and from >> the term filtered results throw out B-A-C and it will run more quickly >> because of the reduced docu

Re: Elasticsearch with azure cloud plugin

2014-07-07 Thread Itamar Syn-Hershko
t, see http://code972.com/blog/2014/07/74-the-definitive-guide-for-elasticsearch-on-windows-azure HTH, -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhersh

Re: Failed to merge - There is not enough space on the disk

2014-06-29 Thread Itamar Syn-Hershko
If it was corrupted you would have seen other errors, not 503. Check your network settings. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On S

Re: Failed to merge - There is not enough space on the disk

2014-06-29 Thread Itamar Syn-Hershko
This error means indexing has stopped at one point, up to that point everything is preserved. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html#disk for how to avoid this from now on -- Itamar Syn-Hershko http://code972.com | @synhershko

Re: Exact duplicate results (same _id) for a search query. Is this a bug?

2014-06-27 Thread Itamar Syn-Hershko
Is this 1 Elasticsearch instance running locally or do multiple servers / nodes participate? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/>

Re: Jespen article reaction

2014-06-24 Thread Itamar Syn-Hershko
ree/feature/improve_zen and there may be other related tickets as well -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Tue, Jun 24, 2014 at 7:04 PM,

Re: Splunk vs. Elastic search performance?

2014-06-21 Thread Itamar Syn-Hershko
ures, but what Aphyr showed is that when failure conditions happen the chances you will are pretty high. Thanks to the Fallacies of Distributed Computing, that basically means those are bound to happen every now and then. If and how much data you lose will vary based on volumes, setups etc. HTH --

Re: How to find the number of authors who have written between 2-3 books?

2014-06-19 Thread Itamar Syn-Hershko
bucketing is one example. So if you need exact values, I'd go for a model that does it. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/>

Re: searching on nested docs - geting back the nested docs as a response

2014-06-19 Thread Itamar Syn-Hershko
ok ID. Unless you use data from the book level along with full-text searches on the texts, which even then in some scenarios I would consider denormalization. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Autho

Re: searching on nested docs - geting back the nested docs as a response

2014-06-19 Thread Itamar Syn-Hershko
This is usually something that's being solved using parent-child, but the question here really is what do you mean by needing to retrieve both books & pages. Can you describe the actual scenario and what you are trying to achieve? -- Itamar Syn-Hershko http://code972.com | @synhersh

Re: How do people typically handle shard failures in their results?

2014-06-18 Thread Itamar Syn-Hershko
FWIW, this is usually why you use replicas, so even if a shard goes down there's a back-up shard (ideally more than one) that you can fall back to. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of Rav

Re: Type Ahead feature for contact list

2014-06-17 Thread Itamar Syn-Hershko
Take a look here: http://www.elasticsearch.org/blog/you-complete-me/ -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Tue, Jun 17, 2014

Re: ES 1.2.1 sort by _timestamp

2014-06-13 Thread Itamar Syn-Hershko
This is just to debug this, to make sure results are indeed not sorted by _timestamp, as you claim. Probably easier to just set _timestamp to stored. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of

Re: ES 1.2.1 sort by _timestamp

2014-06-13 Thread Itamar Syn-Hershko
Possibly, because it's not provided in the _source, or just use this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html#_path_2 -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelanc

Re: ES 1.2.1 sort by _timestamp

2014-06-12 Thread Itamar Syn-Hershko
This is weird. Are you sure what you are seeing is not overridden documents (can happen if you specify the ID yourself)? Can you add the _timestamp field to the results and verify the documents are indeed not sorted by _timestamp? -- Itamar Syn-Hershko http://code972.com | @synhershko <ht

Re: How to use elasticsearch graphical reports in asp.net

2014-06-11 Thread Itamar Syn-Hershko
If you mean Kibana dashboards, take a look here, this might be of help: https://github.com/synhershko/RavenDB.ElasticsearchReplication/tree/master/Kibana.Host -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author

Re: Speculative deletes

2014-06-07 Thread Itamar Syn-Hershko
I'm pretty sure you're right On Jun 6, 2014 8:03 PM, "Nikolas Everett" wrote: > I'm in the position where I need to make _sure_ a document is deleted from > the index when something occurs in my source system. I want to just hit it > with a DELETE every time. Is that a good idea? > > It looks t

Re: Kibana 3: display the number of items in a Text panel?

2014-06-04 Thread Itamar Syn-Hershko
number of lines where? you can always show a Count facet that will count the number of results of a query -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhersh

Re: Inter-document Queries

2014-06-04 Thread Itamar Syn-Hershko
"path" and buckets on the user. To check the condition of the previous path you should be able to bucket again using a script, or maybe even with a query on a nested type. This is just from the top of my head but should definitely work if you can get to that model -- Itamar Syn-Hershko http:

Re: [ANN] Elasticsearch Simple Action Plugin

2014-06-04 Thread Itamar Syn-Hershko
You should have released this before my talk last week, I could have mentioned it :\ https://www.youtube.com/watch?v=FbAO2k57bdg -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Ac

Re: Cluster gets stuck after full re-index

2014-06-04 Thread Itamar Syn-Hershko
emory requirements etc as a data node. Finally, there has been (and still is) a lot of work put into this so I strongly recommend upgrading to the latest (currently it is 1.2.1). -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Co

Re: Inter-document Queries

2014-06-04 Thread Itamar Syn-Hershko
you can use ES properly using queries or the aggregations framework? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Thu, Jun 5, 2014 at

Re: Identify word as dominant word in search

2014-06-03 Thread Itamar Syn-Hershko
Depending on your corpus, this should happen automatically. That's what TF/IDF is about. What you can do further is use NLP methods to tag those items in search and indexing. Look up POS tagging and entity extraction. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twi

Re: Configuring cross-cloud cluster via REST API

2014-06-02 Thread Itamar Syn-Hershko
Just enable multicast using the plugin for your cloud provider... -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Jun 2, 2014 at 2:08 P

Re: Configuring cross-cloud cluster via REST API

2014-06-02 Thread Itamar Syn-Hershko
th as one -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Jun 2, 2014 at 1:49 PM, Martin Harris < martin.har...@cloudsoftcorp.com>

Re: Document found with _search but not with GET

2014-05-26 Thread Itamar Syn-Hershko
When you search, what does the _id field of the result indicate? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, May 26, 2014 at 11:

Re: Document found with _search but not with GET

2014-05-26 Thread Itamar Syn-Hershko
What are the exact URLs you're approaching? Are you specifying the index name and type name as well in your GET ? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://mann

Re: Trigram-accelerated regex searches

2014-05-22 Thread Itamar Syn-Hershko
Aye, and then you can use edit distance on single words (fuzzy query) to cope with fast typers On May 22, 2014 8:22 PM, "Robert Muir" wrote: > On Wed, May 21, 2014 at 6:01 PM, Erik Rose wrote: > > I'm trying to move Mozilla's source code search engine (dxr.mozilla.org) > > from a custom-written

Using Elasticsearch as a storage for git repositories

2014-05-20 Thread Itamar Syn-Hershko
most out of your document store. More details here: http://code972.com/blog/2014/05/71-using-elasticsearch-as-a-storage-for-git-repositories I'll have some concrete use-cases / demos to share soon Cheers, -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhersh

Re: missing documents after bulk indexing

2014-05-19 Thread Itamar Syn-Hershko
What are the details of that exception? Can it be that ES has issues parsing the docs? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon,

Re: Fastes way to import 100m rows

2014-05-19 Thread Itamar Syn-Hershko
That doesn't seem right, try making larger bulk sizes. Also, what size are your docs? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On M
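One way to experiment with bulk sizes is to batch documents into bulk request bodies of a configurable size, as in the sketch below. The index and type names are hypothetical, and the code only builds the newline-delimited bodies for the _bulk endpoint.

```python
import json


def bulk_bodies(docs, index, doc_type, batch_size):
    """Yield _bulk request bodies holding up to `batch_size` docs each."""
    lines = []
    for doc in docs:
        # Each document takes two lines: the action line, then the source.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
        if len(lines) == 2 * batch_size:
            yield "\n".join(lines) + "\n"
            lines = []
    if lines:
        yield "\n".join(lines) + "\n"


docs = [{"n": i} for i in range(5)]
bodies = list(bulk_bodies(docs, "rows", "row", 2))
print(len(bodies), "bulk requests")
```

Raising batch_size step by step while measuring throughput is a simple way to find where larger batches stop helping.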

Re: Fastes way to import 100m rows

2014-05-19 Thread Itamar Syn-Hershko
That's a very low rate. Are you importing locally or via remote connection? -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Mon, Ma

Re: Disk-based shard allocation per node settings

2014-05-16 Thread Itamar Syn-Hershko
use percentages for the watermark values. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Fri, May 16, 2014 at 3:26 PM, Michel Con
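For illustration, percentage-based watermarks could be set through the cluster update settings API with a body like the one built below. The 85% and 90% figures are example values, not a recommendation from the thread.

```python
import json


def watermark_settings(low, high):
    """Transient cluster settings body for the disk allocation watermarks."""
    return {
        "transient": {
            "cluster.routing.allocation.disk.watermark.low": low,
            "cluster.routing.allocation.disk.watermark.high": high,
        }
    }


body = watermark_settings("85%", "90%")
print("PUT /_cluster/settings", json.dumps(body))
```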

Re: NEST: How can I get the raw JSON that comprises a doc just before it's indexed?

2014-05-11 Thread Itamar Syn-Hershko
This is effective only when you run with the debugger attached, but yes. This is effective for all response types. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.c

Re: NEST: How can I get the raw JSON that comprises a doc just before it's indexed?

2014-05-11 Thread Itamar Syn-Hershko
What version are you using? The latest one (v 1.0.0 beta1, a pre-release on nuget) should have this feature -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synh

Re: Mult-language searchable in one field

2014-05-09 Thread Itamar Syn-Hershko
ten times multi-lingual search demands a lot of attention. Not long ago I gave a talk about this topic; you might find it helpful: https://skillsmatter.com/skillscasts/4968-approaches-to-multi-lingual-text-search-with-elasticsearch-and-lucene -- Itamar Syn-Hershko http://code972.com | @sy

Re: Mult-language searchable in one field

2014-05-09 Thread Itamar Syn-Hershko
And then what analyzer will you use for that? It is doable, but I'd strongly advise against it unless you know what you are doing: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_multi_field_2 -- Itamar Syn-Hershko http://code97

Re: Nest Custom analyser

2014-05-05 Thread Itamar Syn-Hershko
while your code is a client written in .NET, so no. Otherwise, you can define the analyzer via index settings from client code as well -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Ac

Re: Can I create a histogram of ages based on birthdates?

2014-05-05 Thread Itamar Syn-Hershko
This should be possible to do using script fields: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_script_field However, you will need to figure out how to do date manipulation there -- Itamar Syn-Hershko http://code972.com | @synhershko
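
The date manipulation itself is simple arithmetic. The sketch below shows it in plain Python on the client side rather than as an Elasticsearch script, so it is only an illustration of the logic a script field would need; the sample birthdates, the reference date and the 10-year bucket width are assumptions.

```python
from collections import Counter
from datetime import date


def age_on(birthdate, today):
    """Whole years elapsed between birthdate and today."""
    years = today.year - birthdate.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def age_histogram(birthdates, today, interval=10):
    """Count ages per bucket, keyed by the bucket's lower bound."""
    counts = Counter(
        (age_on(b, today) // interval) * interval for b in birthdates
    )
    return dict(sorted(counts.items()))


# Assumed sample data for demonstration.
hist = age_histogram(
    [date(1980, 5, 5), date(1984, 12, 31), date(2000, 5, 6)],
    today=date(2014, 5, 5),
)
```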

Re: Nest Custom analyser

2014-05-05 Thread Itamar Syn-Hershko
It would have to be defined on ES, the analyzer name has to be the one that ES recognizes (as the plugin defines it) -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.c

Re: Nest Custom analyser

2014-05-05 Thread Itamar Syn-Hershko
This should work: [ElasticProperty(Analyzer = "my_analyzer")] public string Content { get; set;} You can also specify Index/SearchAnalyzer this way (for fine-grained control) -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Fre

Re: Recommended way to reduce overload on ES

2014-04-24 Thread Itamar Syn-Hershko
u may want to upgrade gradually (0.90 and then 1.x) just to be safe, but you don't have to reindex. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/&

Re: Recommended way to reduce overload on ES

2014-04-24 Thread Itamar Syn-Hershko
There's no need to reindex; it is enough to do a full cluster restart after upgrading the binaries, and ES/Lucene will take care of the rest -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB

Re: Recommended way to reduce overload on ES

2014-04-24 Thread Itamar Syn-Hershko
. There's no rule of thumb for that one, I'm afraid. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On Thu, Apr 24, 2014 at 4:37 PM, wr

Re: Recommended way to reduce overload on ES

2014-04-24 Thread Itamar Syn-Hershko
or delete an entire index and not use TTLs or delete-by-query processes. Deciding on the optimal size of an index in that scenario highly depends on your data, usage patterns and a lot of experimenting. That's to answer 1 & 2 3. Definitely, 0.20 is a very old version -- Itamar Syn-Her

Re: Improved stemming for Arabic

2014-04-24 Thread Itamar Syn-Hershko
even 4-5 grams perform much better than the above two, you should try those. They do not require a dictionary. I can't seem to find that paper now, the link I had to it seems to be broken. -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Dev
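
For readers unfamiliar with the idea, here is a minimal sketch of dictionary-free character n-grams over a single token. It is not the method from the paper mentioned above (which could not be located); the overlapping-window scheme, the handling of short tokens and the sample word are assumptions for illustration.

```python
def char_ngrams(token, n):
    """Return all overlapping character n-grams of a token.

    Tokens shorter than n are returned whole, so no term is dropped.
    """
    if len(token) < n:
        return [token]
    return [token[i:i + n] for i in range(len(token) - n + 1)]


# Assumed sample word (6 letters); 4-grams and 5-grams as mentioned above.
word = "كتابات"
grams_4 = char_ngrams(word, 4)
grams_5 = char_ngrams(word, 5)
```

In Elasticsearch this kind of tokenization would normally be configured in the index analysis settings rather than done in client code.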

Re: Facetting by first letter

2014-04-23 Thread Itamar Syn-Hershko
search.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_script_field -- Itamar Syn-Hershko http://code972.com | @synhershko <https://twitter.com/synhershko> Freelance Developer & Consultant Author of RavenDB in Action <http://manning.com/synhershko/> On
