Re: failover of nimbus server

2014-09-10 Thread 潘臻轩
Sure. For now I merge each official release into my branch.

2014-09-10 13:54 GMT+08:00 Jiang Jacky jiang0...@gmail.com:

 Normally I do not modify libraries directly: it makes merging with future
 official releases complex and can cause unpredictable bugs. A better way
 is to inherit from the official release's classes in a separate place (of
 course, that can still cause bugs, but there should be fewer), or to send
 feedback directly to the author.
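
 A minimal sketch of the subclassing approach described above, written in
 Python for brevity (Storm itself is JVM code). UpstreamCodeStore stands in
 for a class shipped by an official release and HdfsCodeStore for a local
 extension; both class names and the save method are hypothetical and are
 not part of Storm. No I/O is performed; save only returns the location it
 would write to.

```python
class UpstreamCodeStore:
    """Stand-in for a class shipped in an official release (hypothetical)."""

    def save(self, name: str, data: bytes) -> str:
        # Upstream behaviour in this sketch: report a local-disk location.
        return f"file:///var/storm/{name}"


class HdfsCodeStore(UpstreamCodeStore):
    """Local extension kept in a separate module; upstream stays untouched."""

    def __init__(self, namenode: str):
        self.namenode = namenode

    def save(self, name: str, data: bytes) -> str:
        # Override only the method that must change, so a new upstream
        # release can be dropped in without re-applying a patch.
        return f"hdfs://{self.namenode}/storm/{name}"
```

 The trade-off is the one noted above: the subclass can still break when
 upstream changes the overridden method, but the breakage is confined to
 one file instead of a diverging fork.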

 2014-09-10 0:59 GMT-04:00 潘臻轩 zhenxuan...@gmail.com:

 I have modified a lot of the Storm code, and I maintain my own branch for
 development. I changed it to write the jar/conf/topology files to HDFS
 instead of the local file system.

 2014-09-10 12:30 GMT+08:00 Jiang Jacky jiang0...@gmail.com:

 I am also interested in how you connect Storm with HDFS. Have you modified
 the Storm library? Can you roughly describe the steps?
 Thanks

 2014-09-10 0:16 GMT-04:00 Jiang Jacky jiang0...@gmail.com:

 The best solution would be to let us list multiple Nimbus servers in
 storm.yaml for failover; that would also be easy to configure.
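
 As a rough illustration of what such a list could enable on the client
 side, the sketch below walks an ordered list of Nimbus hosts and returns
 the first one that answers. The host names, the pick_nimbus and tcp_probe
 helpers, and the use of port 6627 as the Thrift port are assumptions made
 for this example; nothing in the thread says Storm offered such a setting
 at the time.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int = 6627, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_nimbus(hosts: Iterable[str],
                is_alive: Callable[[str], bool] = tcp_probe) -> Optional[str]:
    """Return the first host for which is_alive(host) is true, else None."""
    for host in hosts:
        if is_alive(host):
            return host
    return None
```

 A caller would pass the configured list, for example
 pick_nimbus(["nimbus1.example.com", "nimbus2.example.com"]), and fall back
 to an error when None comes back. Choosing a live host is only half of
 failover; the standby also needs the topology state, which is the HDFS
 point raised elsewhere in this thread.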

 2014-09-09 22:19 GMT-04:00 潘臻轩 zhenxuan...@gmail.com:

 Yes, I have implemented it this way, and in fact it works well.
 I implemented a complete HA solution for Nimbus,
 and our team wrote a complete scheduler for Storm (similar to YARN, to
 support a 700+ cluster).

 2014-09-10 10:02 GMT+08:00 Ankit Toshniwal ankitoshni...@gmail.com:

 Yes, that's a problem area, and we have been discussing internally how we
 can handle it better. We are considering moving to an HDFS-based solution
 where Nimbus will upload the jars into HDFS instead of local disk (as that
 is a single point of failure), and supervisors will download the jars from
 HDFS as well.

 The other problem we ran into was NIC saturation on the Nimbus host, since
 too many machines were copying the jars (180 MB in size) to worker
 machines, which increased the total time. By moving to an HDFS-based
 solution we can do this more effectively and faster, and it scales better.

 We do not have a working prototype for it, but it is something we are
 actively pursuing.

 Ankit
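
 A back-of-the-envelope sketch of the NIC saturation described above. Only
 the 180 MB jar size comes from the message; the host count, the 1 Gbps
 link speed, and the number of machines sharing the load are assumptions
 chosen for illustration. The function gives a lower bound that treats the
 serving NICs as the only bottleneck.

```python
def copy_seconds(jar_mb: float, hosts: int, nic_gbps: float,
                 sources: int = 1) -> float:
    """Lower bound on the time to deliver one jar to every host.

    Assumes the serving NICs are the only bottleneck and that the load is
    spread evenly over `sources` machines.
    """
    total_megabits = jar_mb * 8 * hosts
    return total_megabits / (nic_gbps * 1000 * sources)


# One Nimbus host serving 500 workers over a single 1 Gbps NIC (assumed).
single = copy_seconds(180, 500, 1.0)              # 720.0 seconds
# The same jar served from three machines, e.g. three HDFS replicas (assumed).
spread = copy_seconds(180, 500, 1.0, sources=3)   # 240.0 seconds
```

 Under these assumptions the copy time falls in proportion to the number of
 machines serving the jar, which is the scaling benefit the HDFS-based
 approach is after.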

 On Tue, Sep 9, 2014 at 6:43 PM, 潘臻轩 zhenxuan...@gmail.com wrote:

 I do not agree with Nathan. If only the Nimbus process goes down, it is
 fail-fast, but if the machine itself has an error (such as a disk error),
 this may cause the topologies to be cleared.

 2014-09-10 9:39 GMT+08:00 潘臻轩 zhenxuan...@gmail.com:

 *As far as I know, that is not the case. You should check it with a script
 or in some other way.*
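
 The message does not say what kind of script is meant. One possible
 reading is a small watchdog that restarts the fail-fast daemon whenever it
 exits; the sketch below shows that idea. The run_supervised helper and its
 parameters are hypothetical, and the command to run is left to the caller.

```python
import subprocess
import sys
import time


def run_supervised(cmd, max_restarts=3, delay=0.0):
    """Run cmd, restarting it after each non-zero exit.

    Stops on a clean exit or after max_restarts restarts, and returns the
    list of exit codes that were observed.
    """
    codes = []
    while True:
        codes.append(subprocess.call(cmd))
        if codes[-1] == 0 or len(codes) > max_restarts:
            return codes
        time.sleep(delay)  # back off before the next restart


# Usage: a stand-in command that always fails is started three times
# (one initial run plus two restarts).
codes = run_supervised([sys.executable, "-c", "import sys; sys.exit(1)"],
                       max_restarts=2)
```

 A watchdog like this covers only the case where the process dies; it does
 not help when the machine or its disk is lost.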

 2014-09-10 0:49 GMT+08:00 Jiang Jacky jiang0...@gmail.com:

 Hi, I read the articles about Nimbus; they say the Nimbus daemon is
 fail-fast. But I am not sure whether, like Hadoop, there is a secondary
 server for failover, so that if the Nimbus server goes down completely the
 secondary server can take over. Thanks











