Streaming sessions from BulkOutputFormat job being listed long after they were killed

2012-02-17 Thread Erik Forsberg

Hi!

If I run a Hadoop job that uses BulkOutputFormat to write data to 
Cassandra, and that job is aborted, i.e. its streaming sessions are 
not completed, the streaming sessions seem to hang around for a very 
long time (I've observed at least 12-15h) in the output of 'nodetool 
netstats'.


To me it seems like they go away only after a restart of Cassandra.

Is this known behaviour? Does it cause any problems, e.g. consuming 
memory, or should I just ignore it?


Regards,
\EF



Re: Streaming sessions from BulkOutputFormat job being listed long after they were killed

2012-02-17 Thread Yuki Morishita
Erik,

Currently, streaming failure handling does not work well. 
There are several discussions and bug reports about streaming failures in 
JIRA.

A hung streaming session will stay in memory until you restart C*, but I 
believe it does not cause any problems. 

-- 
Yuki Morishita

