Awesome! Great to know that. So, to conclude, the steps will be:
1) Stream tweets from Twitter
2) Use the bulk API to index in batches of 1000 (or more) tweets
3) Once the batch size is reached, spawn a new thread which will index the 
batch into ES, while my original thread continues streaming tweets

Do these steps sound alright to you or did I miss something?
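
To make the plan concrete, here is a rough Python sketch of steps 2 and 3. The names BulkBatcher and index_batch are made up for illustration. index_batch stands for whatever actually sends one batch to ES; I am assuming it could be a thin wrapper around elasticsearch.helpers.bulk from the Python client, but the sketch itself does not depend on that. It also uses a small thread pool instead of spawning a brand-new thread per batch, which is my own substitution.

```python
from concurrent.futures import ThreadPoolExecutor


class BulkBatcher:
    """Collect documents and hand full batches to background workers.

    Hypothetical helper, not part of any Elasticsearch client.
    """

    def __init__(self, index_batch, batch_size=1000, workers=2):
        # index_batch: callable taking a list of docs and sending it to ES
        # (assumption: e.g. a wrapper around elasticsearch.helpers.bulk).
        self.index_batch = index_batch
        self.batch_size = batch_size
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = []
        self.futures = []

    def add(self, doc):
        """Called from the streaming thread for every tweet."""
        self.pending.append(doc)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Submit the current batch to the pool and start a fresh one."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.futures.append(self.pool.submit(self.index_batch, batch))

    def close(self):
        """Send the last partial batch and wait for all workers."""
        self.flush()
        for future in self.futures:
            future.result()  # re-raises any error from index_batch
        self.pool.shutdown()


if __name__ == "__main__":
    sent = []  # stand-in for ES: just remember each batch
    batcher = BulkBatcher(sent.append, batch_size=3)
    for i in range(7):
        batcher.add({"id": i, "text": "tweet %d" % i})
    batcher.close()
    print(len(sent), "batches sent")
```

The streaming thread only ever calls add(), so it never blocks on indexing. A long-running stream would also need to prune finished futures and bound the number of queued batches, which this sketch leaves out.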

On Thursday, January 15, 2015 at 7:58:19 PM UTC+5:30, David Pilato wrote:
>
> I can index 10000-12000 docs per second on my laptop. With SSD drives, of course.
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On Jan 15, 2015, at 13:43, Chinch Pokli <cpo...@gmail.com> wrote:
>
> No, the whole point was: will Elasticsearch be able to index, say, 
> 10,000 documents per second? If yes, I can simply hook up my Twitter code 
> to ES. If not, I would need to think about how to make that happen.
> Typically I've seen ES index only around 30 docs per second, which is 
> pretty low.
>
> I am hoping Redis, Kafka, Logstash, etc. might give Elasticsearch some 
> breathing room and enable it to index up to 10K docs per second.
>
> On Thursday, January 15, 2015 at 10:47:31 AM UTC+5:30, David Pilato wrote:
>>
>> Logstash has a Twitter input, so you can extract content from Twitter and 
>> send it to Elasticsearch. No need to have Redis here. 
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On Jan 15, 2015, at 00:02, Chinch Pokli <cpo...@gmail.com> wrote:
>>
>> Thanks. I'll have a look at the raw option.
>> Regarding Logstash, I don't fully understand its utility. It says that 
>> it can take messages from a Redis server. But if I have to set up Redis, I 
>> could simply use the Redis river to index into Elasticsearch. Is there any 
>> additional benefit that Logstash would give me?
>>
>> On Thursday, January 15, 2015 at 4:06:12 AM UTC+5:30, David Pilato wrote:
>>>
>>> You should look at the raw option or, better, look at Logstash.
>>>
>>> My 2 cents.
>>>
>>> David
>>>
>>> On Jan 14, 2015, at 23:29, Chinch Pokli <cpo...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am using Elasticsearch to index the Twitter stream. Until recently I was 
>>> using the official river, which was working great, but I realized that it 
>>> was throwing out much of the data (e.g. it does not store the number of 
>>> followers).
>>>
>>> Is there a way to make the river store all the data? If not, I am 
>>> fine with writing my own code to stream and index the tweets. But I have a 
>>> concern. How many documents can Elasticsearch index per second? I might 
>>> eventually need to index almost 10,000 documents (each document = 2 KB) per 
>>> second (the current requirement is 100 documents per second). Is this even 
>>> feasible? If yes, do I need to make any special modifications?
>>>
>>> Thanks in advance!
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/da547692-903b-4793-a77e-fd5f0b5a01b7%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elasticsearch/da547692-903b-4793-a77e-fd5f0b5a01b7%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>

