Awesome! Great to know that. So, to conclude, the steps will be:
1) Stream tweets from Twitter
2) Use the bulk API to send batches of 1000 (or more) tweets
3) Once the batch size is reached, spawn a new thread to index the batch into ES, while my original thread continues streaming tweets
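The steps above can be sketched roughly as follows. This is only a minimal sketch under a few assumptions: `BulkBatcher`, `send_bulk` and `fake_send_bulk` are hypothetical names made up for illustration, and the function that actually talks to ES is left as a stand-in (in a real setup it could, for example, wrap the bulk helper of the official Python client). It also uses a small thread pool instead of spawning a brand-new thread per batch, which caps the number of concurrent bulk requests.

```python
from concurrent.futures import ThreadPoolExecutor


class BulkBatcher:
    """Collect documents and hand full batches to background workers."""

    def __init__(self, send_bulk, batch_size=1000, max_workers=2):
        self._send_bulk = send_bulk    # hypothetical callable: takes a list of docs
        self._batch_size = batch_size
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._batch = []               # documents waiting for the next bulk request
        self._pending = []             # futures of bulk requests still in flight

    def add(self, doc):
        """Called from the streaming thread for every incoming tweet."""
        self._batch.append(doc)
        if len(self._batch) >= self._batch_size:
            self.flush()

    def flush(self):
        """Submit the current batch to the pool and start a fresh one."""
        self._reap()
        if self._batch:
            batch, self._batch = self._batch, []
            self._pending.append(self._pool.submit(self._send_bulk, batch))

    def _reap(self):
        """Drop finished futures; result() re-raises any indexing error."""
        still_running = []
        for future in self._pending:
            if future.done():
                future.result()
            else:
                still_running.append(future)
        self._pending = still_running

    def close(self):
        """Send the last partial batch and wait for all workers to finish."""
        try:
            self.flush()
            for future in self._pending:
                future.result()
            self._pending = []
        finally:
            self._pool.shutdown(wait=True)


if __name__ == "__main__":
    sent_sizes = []

    def fake_send_bulk(docs):
        # Stand-in for a real bulk request to ES (assumption, not a real API).
        sent_sizes.append(len(docs))

    batcher = BulkBatcher(fake_send_bulk, batch_size=1000)
    for i in range(2500):              # stand-in for the tweet stream
        batcher.add({"id": i, "text": "tweet %d" % i})
    batcher.close()
    print(sorted(sent_sizes))          # [500, 1000, 1000]
```

One caveat with this design: the pool's work queue is unbounded, so if ES indexes more slowly than tweets arrive, batches will pile up in memory. Some form of back-pressure (blocking the stream, or dropping or spooling batches) would be needed for sustained load.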
Do these steps sound alright to you, or did I miss something?

On Thursday, January 15, 2015 at 7:58:19 PM UTC+5:30, David Pilato wrote:
>
> I can index 10000-12000 docs per second on my laptop. SSD drives, of course.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 15 Jan 2015, at 13:43, Chinch Pokli <cpo...@gmail.com> wrote:
>
> No, the whole point was: will elasticsearch be able to index, say,
> 10,000 documents per second? If yes, I can simply hook up my twitter code
> to ES. If not, I would need to think of how to make that happen.
> Typically I've seen ES index only around 30 docs per second, which is
> pretty low.
>
> I am hoping Redis/Kafka/Logstash/etc. might give elasticsearch
> some breathing room and enable it to index up to 10K docs per second.
>
> On Thursday, January 15, 2015 at 10:47:31 AM UTC+5:30, David Pilato wrote:
>>
>> You have a Twitter input so you can extract content from Twitter and send
>> it to elasticsearch. No need to have Redis here.
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 15 Jan 2015, at 00:02, Chinch Pokli <cpo...@gmail.com> wrote:
>>
>> Thanks. I'll have a look at the raw option.
>> Regarding logstash, I don't fully understand its utility. It says that
>> it can take messages from a Redis server. But if I have to set up Redis, I
>> could simply use the Redis river to index into Elasticsearch. Is there any
>> additional benefit that Logstash would give me?
>>
>> On Thursday, January 15, 2015 at 4:06:12 AM UTC+5:30, David Pilato wrote:
>>>
>>> You should look at the raw option or, better, look at Logstash.
>>>
>>> My 2 cents.
>>>
>>> David
>>>
>>> On 14 Jan 2015, at 23:29, Chinch Pokli <cpo...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am using elasticsearch to index the twitter stream. Until recently I was
>>> using the official river, which was working great, but I realized that it
>>> was throwing out much of the data (e.g. it is not storing the number of
>>> followers and similar data).
>>>
>>> Is there a way to make the river store all the data? If not, I am
>>> fine with writing streaming code which will stream and index. But I have a
>>> concern. How many documents can elasticsearch index per second? I might
>>> eventually need to index almost 10,000 documents (each document = 2 KB) per
>>> second (the current requirement is 100 documents per second). Is this even
>>> feasible? If yes, do I need to make any special modifications?
>>>
>>> Thanks in advance!!
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/da547692-903b-4793-a77e-fd5f0b5a01b7%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/11bf4f30-d7f6-41ac-886a-c5281dac31bd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.