The data loss scenarios in Aphyr's post are easy to reproduce because his
tools stress the database systems under test to the limit. He is
practically provoking the DBs to fail (though they really shouldn't).

In normal operation you should not see data loss, but what Aphyr showed is
that when failure conditions (such as network partitions) occur, the chances
that you will lose data are pretty high. Per the Fallacies of Distributed
Computing, such failure conditions are bound to happen every now and then.
Whether you lose data, and how much, will vary with volumes, setup, etc.

HTH

--

Itamar Syn-Hershko
http://code972.com | @synhershko <https://twitter.com/synhershko>
Freelance Developer & Consultant
Author of RavenDB in Action <http://manning.com/synhershko/>


On Sat, Jun 21, 2014 at 2:56 AM, Brian <brian.from...@gmail.com> wrote:

> Mark,
>
> I've read one post (can't remember where) that the Node client was
> preferred, but have also read that the HTTP interface has minimal overhead.
> So yes, I am currently using logstash with the HTTP interface and it works
> fine.
>
> I also performed some experiments with clustering (not much, due to
> resource and time constraints) and used unicast discovery. Then I read
> someone who strongly recommended multicast discovery, and I started to feel
> like I'd gone down the wrong path. Then I watched the ELK webinar and heard
> that unicast discovery was preferred. I think it's not a big deal either
> way; it's what works best for your particular networking infrastructure.
>
> In addition, I was recently given this link:
> http://aphyr.com/posts/317-call-me-maybe-elasticsearch. It hasn't
> dissuaded me at all, but it is a thought-provoking read. I am a little
> confused by some things, though. In all of my high-performance banging on
> ES, even with my time-to-live test feature enabled, I never lost any
> documents at all. But I wasn't using auto-id; I was specifying my own
> unique ID. And when run in my 3-node cluster (slow due to being hosted by 3
> VMs running on a dual-core machine), I still didn't lose any data. So I am
> not sure of the high data loss scenarios he describes in his missive; I
> have seen no evidence of any data loss due to false insert positives at all.
>
> Brian
>
>
> On Friday, June 20, 2014 6:30:27 PM UTC-4, Mark Walkom wrote:
>>
>> I wasn't aware that the elasticsearch_http output wasn't recommended?
>> When I spoke to a few of the ELK devs a few months ago, they indicated
>> that there was minimal performance difference, at the greater benefit of
>> not being locked to specific LS+ES versioning.
>>
>> Regards,
>> Mark Walkom
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f7621a17-9366-4166-9612-61415938013f%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/f7621a17-9366-4166-9612-61415938013f%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

