[GitHub] storm pull request: STORM-513 check heartbeat from multilang subpr...

2014-10-24 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/storm/pull/286#discussion_r19326747
  
--- Diff: storm-core/src/jvm/backtype/storm/task/ShellBolt.java ---
@@ -305,4 +283,95 @@ private void die(Throwable exception) {
 System.exit(11);
 }
 }
+
+    private class BoltHeartbeatTimerTask extends TimerTask {
+        private ShellBolt bolt;
+
+        public BoltHeartbeatTimerTask(ShellBolt bolt) {
+            this.bolt = bolt;
+        }
+
+        @Override
+        public void run() {
+            long currentTimeMillis = System.currentTimeMillis();
+            long lastHeartbeat = getLastHeartbeat();
+
+            LOG.debug("BOLT - current time : {}, last heartbeat : {}, worker timeout (ms) : {}",
+                    currentTimeMillis, lastHeartbeat, workerTimeoutMills);
+
+            if (currentTimeMillis - lastHeartbeat > workerTimeoutMills) {
+                bolt.die(new RuntimeException("subprocess heartbeat timeout"));
+            }
+
+            String genId = Long.toString(_rand.nextLong());
+            try {
+                _pendingWrites.put(createHeartbeatBoltMessage(genId));
--- End diff --

@itaifrenkel Oh, I see. I didn't know that option exists. Thanks for 
letting me know!
Then we can flip a heartbeat flag, which means it's time to send a heartbeat 
as you state, and let the BoltWriter.run() loop take care of it first.
What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (STORM-513) ShellBolt keeps sending heartbeats even when child process is hung

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182577#comment-14182577
 ] 

ASF GitHub Bot commented on STORM-513:
--

Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/storm/pull/286#discussion_r19326747
  


 ShellBolt keeps sending heartbeats even when child process is hung
 --

 Key: STORM-513
 URL: https://issues.apache.org/jira/browse/STORM-513
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.2-incubating
 Environment: Linux: 2.6.32-431.11.2.el6.x86_64 (RHEL 6.5)
Reporter: Dan Blanchard
Priority: Blocker
 Fix For: 0.9.3-rc2


 If I'm understanding everything correctly with how ShellBolts work, the Java 
 ShellBolt executor is the part of the topology that sends heartbeats back to 
 Nimbus to let it know that a particular multilang bolt is still alive.  The 
 problem with this is that if the multilang subprocess/bolt severely hangs 
 (i.e., it will not even respond to {{SIGALRM}} and the like), the Java 
 ShellBolt does not seem to notice or care. Simply having the tuple get 
 replayed when it times out will not suffice either, because the subprocess 
 will still be stuck.
 The most obvious way to handle this seems to be to add heartbeating to the 
 multilang protocol itself, so that the ShellBolt expects a message of some 
 kind every {{timeout}} seconds.
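
For illustration only, the heartbeat idea described above could look roughly like the sketch below on the Java side. The class name SubprocessHeartbeatMonitor and its methods are hypothetical and are not part of Storm or the multilang protocol; the sketch only assumes that the subprocess sends some message periodically and that the caller supplies the current time.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Storm: remembers when the multilang
// subprocess was last heard from and reports when it has been silent too long.
class SubprocessHeartbeatMonitor {
    private final long timeoutMillis;
    private final AtomicLong lastHeartbeatMillis;

    SubprocessHeartbeatMonitor(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = new AtomicLong(nowMillis);
    }

    // Called by the reader thread whenever the subprocess sends a message.
    void recordHeartbeat(long nowMillis) {
        lastHeartbeatMillis.set(nowMillis);
    }

    // Called periodically (for example from a timer) on the Java side.
    boolean isTimedOut(long nowMillis) {
        return nowMillis - lastHeartbeatMillis.get() > timeoutMillis;
    }
}
```

Passing the clock value in explicitly keeps the check easy to test; a real caller would presumably pass System.currentTimeMillis() and decide for itself what to do on a timeout.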



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] storm pull request: [STORM-533] Added in client and server IConnec...

2014-10-24 Thread revans2
GitHub user revans2 opened a pull request:

https://github.com/apache/storm/pull/302

[STORM-533] Added in client and server IConnection metrics.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/revans2/incubator-storm iconn-metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/302.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #302


commit b6142189005a6741b6996fc08cec759c1082bc17
Author: Robert (Bobby) Evans ev...@yahoo-inc.com
Date:   2014-10-16T17:54:35Z

[STORM-533] Added in client and server IConnection metrics.






[jira] [Commented] (STORM-533) Add metrics collection for IConnection

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182932#comment-14182932
 ] 

ASF GitHub Bot commented on STORM-533:
--

GitHub user revans2 opened a pull request:

https://github.com/apache/storm/pull/302





 Add metrics collection for IConnection
 --

 Key: STORM-533
 URL: https://issues.apache.org/jira/browse/STORM-533
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans

 It would really be great to get some metrics from an IConnection that are 
 then sent to the metrics consumer. 
 We have seen issues in the past where a firewall rule is misconfigured and 
 one host is unable to talk to another host.  If we had some metrics about how 
 many reconnection attempts are being made by the client to a given host we 
 could easily diagnose this.
 There are other metrics that would be nice to know too, like how many 
 bytes/tuples are being sent between different hosts. 
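
As a rough illustration of the reconnection metric mentioned above, a per-host counter might be kept as in the sketch below. ReconnectCounter is a hypothetical name and is not Storm's IConnection or metrics API; how the counts would reach a metrics consumer is left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter, not part of Storm: counts reconnection attempts
// per remote host so that a host that is never reachable stands out.
class ReconnectCounter {
    private final ConcurrentMap<String, AtomicLong> attemptsByHost =
            new ConcurrentHashMap<String, AtomicLong>();

    // Called by the client each time it tries to reconnect to a host.
    void recordAttempt(String host) {
        attemptsByHost.computeIfAbsent(host, h -> new AtomicLong()).incrementAndGet();
    }

    // Number of attempts recorded so far; 0 for a host never seen.
    long attemptsFor(String host) {
        AtomicLong counter = attemptsByHost.get(host);
        return counter == null ? 0L : counter.get();
    }
}
```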





[GitHub] storm pull request: Update README.md

2014-10-24 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/292#issuecomment-60419239
  
+1

Thanks for fixing this!




[jira] [Commented] (STORM-532) Supervisor should restart worker immediately, if the worker process does not exist any more

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183059#comment-14183059
 ] 

ASF GitHub Bot commented on STORM-532:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/293#issuecomment-60419007
  
@caofangkun If this pull request has been superseded by #296 would you mind 
closing this?


 Supervisor should restart worker immediately, if the worker process does not 
 exist any more 
 

 Key: STORM-532
 URL: https://issues.apache.org/jira/browse/STORM-532
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: caofangkun
Priority: Minor

 For now, if the worker process does not exist any more, the Supervisor has to 
 wait a few seconds for the worker heartbeat to time out before it restarts 
 the worker.
 If the supervisor knows the worker process id and checks in the 
 sync-processes thread whether the process exists, it may need less time to 
 restart the worker:
 1: record the worker process id in the worker local heartbeat 
 2: in the supervisor's sync-processes, get the process id from the worker 
 local heartbeat and check whether the process exists 
 3: if not, restart it immediately
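
Step 2 above could be sketched in Java as follows. This assumes Java 9 or later for ProcessHandle, which was not available to Storm at the time of this issue, and WorkerProcessCheck is a hypothetical name; it is not how the supervisor actually performs the check.

```java
// Hypothetical helper, not part of Storm. Assumes Java 9+ (ProcessHandle).
class WorkerProcessCheck {
    // Returns true if a process with the given id exists and is alive.
    static boolean processExists(long pid) {
        return ProcessHandle.of(pid)
                .map(ProcessHandle::isAlive)
                .orElse(false);
    }
}
```

The supervisor would read the pid from the worker's local heartbeat and restart the worker as soon as this returns false, rather than waiting for the heartbeat timeout.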





[jira] [Commented] (STORM-540) Change default time format in logs to ISO8601 in order to include timezone

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183191#comment-14183191
 ] 

ASF GitHub Bot commented on STORM-540:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/301#issuecomment-60427539
  
+1


 Change default time format in logs to ISO8601 in order to include timezone
 --

 Key: STORM-540
 URL: https://issues.apache.org/jira/browse/STORM-540
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Michael Pershyn
Assignee: Michael Pershyn
Priority: Trivial

 The default time format in logback/cluster.xml is currently:
 {code}
 <pattern>%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n</pattern>
 {code}
 It is human-readable but does not include the timezone, so the cluster log 
 configuration has to be changed for error-free machine processing, for 
 example with a Logstash-ElasticSearch-Kibana pipeline.
 If the time on some cluster nodes is not configured correctly, it can take 
 a lot of time to track down an issue with a topology, especially if there is 
 such a pipeline and times in the logs (in host timezones) are converted to 
 timestamps (in UTC).
 I have looked at time formats, and the most reasonable seems to be 
 [ISO8601|http://en.wikipedia.org/wiki/ISO_8601]. It has all the fields, 
 including the timezone, is similar to the format that is currently used, 
 but is a standard and is widely supported.
 With regard to logging pipelines, logstash's grok [has an out-of-the-box 
 rule|http://grokdebug.herokuapp.com/patterns#] for TIMESTAMP_ISO8601.
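
To make the difference between the two formats concrete, here is a small Java sketch. It uses java.time (Java 8 or later) purely for illustration and says nothing about the logback pattern syntax itself; LogTimestampFormats is a hypothetical name.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration, not Storm or logback code.
class LogTimestampFormats {
    // Same shape as the current log pattern: no zone information at all.
    static final DateTimeFormatter CURRENT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // ISO 8601 date-time that carries the UTC offset.
    static final DateTimeFormatter ISO = DateTimeFormatter.ISO_OFFSET_DATE_TIME;

    static String current(OffsetDateTime t) {
        return t.format(CURRENT);
    }

    static String iso8601(OffsetDateTime t) {
        return t.format(ISO);
    }
}
```

For a timestamp of 18:30:05 on 2014-10-24 at UTC+02:00, the first formatter gives "2014-10-24 18:30:05" and the second gives "2014-10-24T18:30:05+02:00", so only the second can be converted to UTC without knowing the host's timezone.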





[GitHub] storm pull request: STORM-307: reset LocalState if files are corru...

2014-10-24 Thread Parth-Brahmbhatt
Github user Parth-Brahmbhatt commented on the pull request:

https://github.com/apache/storm/pull/282#issuecomment-60429586
  
+1




[jira] [Commented] (STORM-307) After host crash, supervisor is unable to restart itself

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183234#comment-14183234
 ] 

ASF GitHub Bot commented on STORM-307:
--

Github user Parth-Brahmbhatt commented on the pull request:

https://github.com/apache/storm/pull/282#issuecomment-60429586
  
+1


 After host crash, supervisor is unable to restart itself
 

 Key: STORM-307
 URL: https://issues.apache.org/jira/browse/STORM-307
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.1-incubating
 Environment: Debian Linux Wheezy
 Zookeeper 3.3.3
 Java 1.7.0_25
Reporter: Damien Raude-Morvan
 Attachments: supeof.tar.bz2


 Hi,
 I've observed [multiple times|#links] that supervisor state de-serialisation 
 after a host crash or reboot can fail. The supervisor is then unable to come up 
 without manual intervention. AFAICT, it seems that the serialized supervisor 
 state is invalid and couldn't be read at the next start.
 Observed error in the supervisor log:
 {noformat}
 2014-04-29 19:38:35 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
 2014-04-29 19:38:35 o.a.z.ZooKeeper [INFO] Initiating client connection, 
 connectString=127.0.0.1:2181/storm sessionTimeout=2 
 watcher=com.netflix.curator.ConnectionState@18d055e0
 2014-04-29 19:38:35 o.a.z.ClientCnxn [INFO] Opening socket connection to 
 server /127.0.0.1:2181
 2014-04-29 19:38:35 o.a.z.ClientCnxn [INFO] Socket connection established to 
 localhost/127.0.0.1:2181, initiating session
 2014-04-29 19:38:35 o.a.z.ClientCnxn [INFO] Session establishment complete on 
 server localhost/127.0.0.1:2181, sessionid = 0x145a7cc1c7e48b1, negotiated 
 timeout = 2
 2014-04-29 19:38:35 b.s.d.supervisor [INFO] Starting supervisor with id 
 71b01216-9d00-4fb6-8538-6673058ab5ef at host storm
 2014-04-29 19:38:36 b.s.event [ERROR] Error when processing event
 java.lang.RuntimeException: java.io.EOFException
 at backtype.storm.utils.Utils.deserialize(Utils.java:86) 
 ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
 at backtype.storm.utils.LocalState.snapshot(LocalState.java:45) 
 ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
 at backtype.storm.utils.LocalState.get(LocalState.java:56) 
 ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
 at 
 backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:207) 
 ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
 at clojure.lang.AFn.applyToHelper(AFn.java:161) 
 ~[clojure-1.4.0.jar:na]
 at clojure.lang.AFn.applyTo(AFn.java:151) ~[clojure-1.4.0.jar:na]
 at clojure.core$apply.invoke(core.clj:603) ~[clojure-1.4.0.jar:na]
 at clojure.core$partial$fn__4070.doInvoke(core.clj:2343) 
 ~[clojure-1.4.0.jar:na]
 at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.4.0.jar:na]
 at backtype.storm.event$event_manager$fn__2593.invoke(event.clj:39) 
 ~[na:na]
 at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.4.0.jar:na]
 at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
 Caused by: java.io.EOFException: null
 at 
 java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
  ~[na:1.7.0_25]
 at 
 java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792)
  ~[na:1.7.0_25]
 at 
 java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:799) 
 ~[na:1.7.0_25]
 at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299) 
 ~[na:1.7.0_25]
 at backtype.storm.utils.Utils.deserialize(Utils.java:81) 
 ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
 ... 11 common frames omitted
 2014-04-29 19:38:36 b.s.util [INFO] Halting process: (Error when processing 
 an event)
 {noformat}
 Current workaround: fully stopping the supervisor daemon and deleting Storm's 
 entire data/supervisor directory helped; after restarting, the Supervisor is 
 now running smoothly. 
 {anchor:links} Here are some references to very similar issues:
 * 
 http://mail-archives.apache.org/mod_mbox/storm-user/201402.mbox/%3c23100d14e7ac4cef947f7236ef896...@by2pr08mb144.namprd08.prod.outlook.com%3E
 * https://groups.google.com/forum/#!topic/storm-user/SL9FK9XeoI8
 * https://groups.google.com/forum/#!topic/storm-user/2gapTYTRrX8
 Regards,
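
The stack trace above shows the EOFException being raised while the serialization stream header is read, which is what happens when the state file is empty or truncated. A minimal sketch of tolerating that case is shown below; TolerantState and readOrReset are hypothetical names, and this is not the code in Storm's LocalState or Utils nor necessarily what the related pull request does.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm.
class TolerantState {
    // Deserializes a saved state map; an empty or truncated input is
    // treated as "no saved state" instead of a fatal error.
    @SuppressWarnings("unchecked")
    static Map<Object, Object> readOrReset(byte[] serialized) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            return (Map<Object, Object>) in.readObject();
        } catch (EOFException e) {
            // Truncated stream: start over with an empty state.
            return new HashMap<Object, Object>();
        } catch (IOException | ClassNotFoundException e) {
            // Anything else is still reported to the caller.
            throw new RuntimeException(e);
        }
    }
}
```

Whether silently resetting the state is acceptable depends on what the supervisor can rebuild from elsewhere; the sketch only covers the truncated-file case seen in the log.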





[jira] [Created] (STORM-541) Build produces maven warnings

2014-10-24 Thread jez wain (JIRA)
jez wain created STORM-541:
--

 Summary: Build produces maven warnings
 Key: STORM-541
 URL: https://issues.apache.org/jira/browse/STORM-541
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3-rc2
 Environment: Ubuntu 14.04 on POWER
Reporter: jez wain
Priority: Minor


The maven compilation task produces the following warnings because version 
numbers are missing from some plugins in the pom.xml file. I have created a 
patch file.

[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:maven-shade-clojure-transformer:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:storm-core:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:storm-starter:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:storm-kafka:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:storm-hdfs:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
com.github.ptgoetz:storm-hbase:jar:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ 
org.apache.storm:storm:0.9.3-rc2-SNAPSHOT, /home/jez/storm/pom.xml, line 661, 
column 21
[WARNING]
[WARNING] Some problems were encountered while building the effective model for 
org.apache.storm:storm:pom:0.9.3-rc2-SNAPSHOT
[WARNING] 'reporting.plugins.plugin.version' for 
org.apache.maven.plugins:maven-surefire-report-plugin is missing. @ line 661, 
column 21
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten 
the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support 
building such malformed projects.
[WARNING]



PATCH:

diff --git a/pom.xml b/pom.xml
index b63d332..37a5acd 100644
--- a/pom.xml
+++ b/pom.xml
@@ -656,10 +656,12 @@
             <plugin>
                 <groupId>org.apache.maven.plugins</groupId>
                 <artifactId>maven-javadoc-plugin</artifactId>
+                <version>2.9</version>
             </plugin>
             <plugin>
                 <groupId>org.apache.maven.plugins</groupId>
                 <artifactId>maven-surefire-report-plugin</artifactId>
+                <version>2.16</version>
                 <configuration>
                     <reportsDirectories>
                         <file>${project.build.directory}/test-reports</file>
@@ -694,6 +696,7 @@
             <plugin>
                 <groupId>org.apache.maven.plugins</groupId>
                 <artifactId>maven-javadoc-plugin</artifactId>
+                <version>2.9</version>
             </plugin>
             <plugin>
                 <groupId>org.apache.rat</groupId>






[GitHub] storm pull request: STORM-513 check heartbeat from multilang subpr...

2014-10-24 Thread ptgoetz
Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/286#issuecomment-60438969
  
Since there were additional commits added to the pull request, we need to 
give it more time for others to review before merging, but I am still +1 for 
the patch.




[jira] [Commented] (STORM-513) ShellBolt keeps sending heartbeats even when child process is hung

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183361#comment-14183361
 ] 

ASF GitHub Bot commented on STORM-513:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/286#issuecomment-60438795
  
@itaifrenkel No, nothing that's available for use via the bolt API, but it's an 
interesting idea. You could effectively do the same by making the scheduler 
static (1 per worker/JVM), but that feels kind of hacky.
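
For readers unfamiliar with the "static scheduler" idea mentioned in the comment above, it could look roughly like this. SharedHeartbeatScheduler is a hypothetical name and is not part of Storm; the sketch only shows one scheduler being shared by everything loaded in the same JVM.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;

// Hypothetical holder, not part of Storm: one scheduler per JVM (worker),
// shared by every bolt instance that calls get().
class SharedHeartbeatScheduler {
    private static final ScheduledExecutorService SCHEDULER =
            Executors.newSingleThreadScheduledExecutor(new ThreadFactory() {
                @Override
                public Thread newThread(Runnable r) {
                    Thread t = new Thread(r, "shared-heartbeat-scheduler");
                    t.setDaemon(true); // do not keep the JVM alive on shutdown
                    return t;
                }
            });

    static ScheduledExecutorService get() {
        return SCHEDULER;
    }
}
```

The static field is what makes this "1 per worker/JVM"; it also means no bolt owns the scheduler's lifecycle, which is presumably part of why the approach feels hacky.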




[jira] [Commented] (STORM-513) ShellBolt keeps sending heartbeats even when child process is hung

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183363#comment-14183363
 ] 

ASF GitHub Bot commented on STORM-513:
--

Github user ptgoetz commented on the pull request:

https://github.com/apache/storm/pull/286#issuecomment-60438969
  
Since there were additional commits added to the pull request, we need to 
give it more time for others to review before merging, but I am still +1 for 
the patch.




Re: Could JStorm project collaborate with Storm?

2014-10-24 Thread 刘键(夏蒅)
Hi Andy,

 

Thanks for your quick response. Your concerns are reasonable and
understandable. 

But maybe there is some misunderstanding about our proposal. According to the
current development situation in China, java core is indeed to bring more
contributors here. 

That is the reason why we began to consider whether this is also a way for us
to contribute to the Storm project. It is true that switching the core in a
short period is risky. 

Actually, if possible, maybe we can consider making JStorm a subproject
of Storm (a branch, or some other way…). We would be responsible for
maintaining it, would try to port features from Storm to JStorm, and would
continue to develop our own features and improvements on top of it. You could
try the new core and take a long time to evaluate whether it is worth making
the switch in the future, or offering JStorm to users as an official optional
core of Storm.

 

Regards

Basti

 

From: Andrew Feng [mailto:af...@yahoo-inc.com] 
Sent: October 24, 2014 0:24
To: 封仲淹(纪君祥); nat...@nathanmarz.com; 徐明明(护城); ja...@cvk.ca;
f...@infochimps.com; david...@microsoft.com; ptgo...@gmail.com;
cutt...@apache.org; tdunn...@maprtech.com; arv...@apache.org;
d...@hortonworks.com; m.ben.frank...@gmail.com; benjamin.hind...@gmail.com
Cc: aloha-dev; scott.z...@vipshop.com; yannia...@tencent.com;
mlc...@iflytek.com; z...@tencent.com
Subject: Re: RE: Could JStorm project collaborate with Storm?

 

Zhongyan:

 

Yahoo almost took your path 2 years ago. After some discussion with Nathan,
we decided to work with the community on Storm, and some of us learned Clojure
quickly. We are very glad that we did that.

 

Under the assumption that Storm provides all features of JStorm, my personal
vote will be NO for your proposal. 

*   Storm has enjoyed its current success because Nathan built a very
solid core in Clojure. We should not replace that core until we are 100%
sure that the alternative implementation is at least as good as our Clojure
implementation.
*   Clojure has not prevented Storm from attracting contributors (currently
108). We have contributors who write code in Clojure and Java. It doesn’t take
much time for one to be able to understand Clojure code.
*   The convergence of stream processing and batch processing will
occur at a higher level of abstraction. The Trident API, for example, is very
similar to batch APIs such as Pig or Cascading.
*   Including JStorm under Storm will only create confusion for our user
community at this stage. 

 

Why don’t you have a discussion with 徐明明 to figure out an alternative path
for JStorm?

 

Thanks,

 

Andy

 

From: 封仲淹 (纪君祥) zhongyan.f...@alibaba-inc.com
Reply-To: 封仲淹(纪君祥) zhongyan.f...@alibaba-inc.com
Date: Wednesday, October 22, 2014 at 11:37 PM
To: Andy Feng af...@yahoo-inc.com, nat...@nathanmarz.com
nat...@nathanmarz.com, 徐明明(护城) mingming.x...@alibaba-inc.com,
ja...@cvk.ca ja...@cvk.ca, f...@infochimps.com f...@infochimps.com,
david...@microsoft.com david...@microsoft.com, ptgo...@gmail.com
ptgo...@gmail.com, Doug Cutting cutt...@apache.org,
tdunn...@maprtech.com tdunn...@maprtech.com, arv...@apache.org
arv...@apache.org, d...@hortonworks.com d...@hortonworks.com,
m.ben.frank...@gmail.com m.ben.frank...@gmail.com,
benjamin.hind...@gmail.com benjamin.hind...@gmail.com
Cc: aloha-dev aloha-...@list.alibaba-inc.com, scott.z...@vipshop.com
scott.z...@vipshop.com, yannia...@tencent.com yannia...@tencent.com,
mlc...@iflytek.com mlc...@iflytek.com, z...@tencent.com zeus@tencent.com
Subject: RE: Could JStorm project collaborate with Storm? 

 

Andrew, thanks for the suggestion.

 

The problem is that the core of Storm is implemented in Clojure. If it were
in Java, we would be glad to merge all our commits into the Storm trunk. 

If the core of Storm were implemented in Java, I think the number of Storm
contributors would double.

 

 

May I put forward two proposals:

1.  Could JStorm be a child project (subproject) of Storm?

2.  Take a long time to change Storm-core to JStorm-core.

a)   If the core of Storm were implemented in Java, I think the number of
Storm contributors would double. Once a user found a bug, he might be
able to fix it himself; in this way, the rate at which Storm resolves
problems would double.

b)   In the next one or two years, batch processing and stream processing
will merge into one solution; Spark and Flink are doing this work. If we
still use Clojure, I am a little afraid that we can’t keep up with the other
communities in this direction.

 

 

Thanks

Longda

 

From: Andrew Feng [mailto:af...@yahoo-inc.com] 
Sent: October 23, 2014 13:31
To: 封仲淹(纪君祥); nat...@nathanmarz.com; 徐明明(护城); ja...@cvk.ca;
f...@infochimps.com; david...@microsoft.com; ptgo...@gmail.com;
cutt...@apache.org; tdunn...@maprtech.com; arv...@apache.org;
d...@hortonworks.com; m.ben.frank...@gmail.com; benjamin.hind...@gmail.com
Cc: aloha-dev; scott.z...@vipshop.com; yannia...@tencent.com;
mlc...@iflytek.com; z...@tencent.com
Subject: Re: Could JStorm project collaborate with Storm? 

 

Zhongyan:

 

I would suggest you 

configuring kryo instances

2014-10-24 Thread Patrick Raeuber
Hey guys,

The Storm Config offers some convenience methods for registering
serializers (I'm talking about Storm 0.9.2-incubating):

registerSerialization(Class)
registerSerialization(Class, Class<? extends Serializer>)
registerSerialization(Map, Class, Class<? extends Serializer>)

If I understand correctly, the 3rd signature is for the case when you want
to add a new serializer to a given config.

But how do you configure serializer instances? There are decorators, but you
can only provide decorator classes, not instances.

What I'm looking for is a way to pass configuration parameters to the
serializers themselves in my Java code.

Any help is appreciated

Thanks,

Patrick


[jira] [Commented] (STORM-378) SleepSpoutWaitStrategy.emptyEmit should use the variable streak

2014-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183698#comment-14183698
 ] 

ASF GitHub Bot commented on STORM-378:
--

Github user HeartSaVioR commented on the pull request:

https://github.com/apache/storm/pull/295#issuecomment-60460916
  
@caofangkun 
It's mk-threads in executor.clj.

```
(if (and (= curr-count (.get emitted-count)) active?)
  (do (.increment empty-emit-streak)
      (.emptyEmit spout-wait-strategy (.get empty-emit-streak)))
  (.set empty-emit-streak 0)
  ))
```

You can see that streak is increased by 1 each time, so I think it's meant for 
alternative implementations of ISpoutWaitStrategy, not SleepSpoutWaitStrategy.
(@nathanmarz Could you please confirm it?)
If it is meant for SleepSpoutWaitStrategy, just adding it to sleepMillis barely 
affects the sleep time. Streak should be multiplied by 10 or something bigger, 
or maybe we can apply an exponential of the already multiplied streak.
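
To make the exponential suggestion above concrete, here is one possible sketch. It doubles the base sleep for each consecutive empty emit and caps the result; the class name StreakBackoff, the doubling factor, and the cap parameter are all assumptions for illustration and are not taken from the patch or from Storm.

```java
// Hypothetical helper, not part of Storm.
class StreakBackoff {
    // Returns baseMillis * 2^(streak - 1), capped at maxMillis.
    // A streak of 1 (first empty emit) sleeps for the base time.
    static long sleepMillisFor(long baseMillis, long streak, long maxMillis) {
        if (baseMillis <= 0 || maxMillis <= 0) {
            return 0L;
        }
        long sleep = Math.min(baseMillis, maxMillis);
        // Double once per empty emit after the first. Stopping at the cap
        // bounds the loop and keeps the multiplication from overflowing.
        for (long i = 1; i < streak && sleep < maxMillis; i++) {
            sleep = (sleep > maxMillis / 2) ? maxMillis : sleep * 2;
        }
        return sleep;
    }
}
```

With a base of 1 ms and a cap of 1000 ms, a streak of 5 gives 16 ms and any long streak settles at 1000 ms, so the streak has a visible effect, unlike adding it directly to sleepMillis.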


 SleepSpoutWaitStrategy.emptyEmit should use  the variable streak
 --

 Key: STORM-378
 URL: https://issues.apache.org/jira/browse/STORM-378
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.2-incubating
Reporter: caofangkun
Priority: Minor

 {code:java}
 Index: src/jvm/backtype/storm/spout/SleepSpoutWaitStrategy.java
 ===
 --- src/jvm/backtype/storm/spout/SleepSpoutWaitStrategy.java  (revision 2868)
 +++ src/jvm/backtype/storm/spout/SleepSpoutWaitStrategy.java  (working copy)
 @@ -18,6 +18,8 @@
  package backtype.storm.spout;
  
  import backtype.storm.Config;
 +import backtype.storm.utils.Utils;
 +
  import java.util.Map;
  
  
 @@ -27,13 +29,14 @@
  
  @Override
  public void prepare(Map conf) {
 -sleepMillis = ((Number) conf.get(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS)).longValue();
 +sleepMillis = Utils.getLong(
 +conf.get(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS), 500);
  }
  
  @Override
  public void emptyEmit(long streak) {
  try {
 -Thread.sleep(sleepMillis);
 +Thread.sleep(Math.abs(sleepMillis + streak));
  } catch (InterruptedException e) {
  throw new RuntimeException(e);
  }
 Index: src/jvm/backtype/storm/utils/Utils.java
 ===
 --- src/jvm/backtype/storm/utils/Utils.java   (revision 2888)
 +++ src/jvm/backtype/storm/utils/Utils.java   (working copy)
 @@ -325,6 +325,24 @@
       throw new IllegalArgumentException("Don't know how to convert " + o + " + to int");
}
  }
 +
 +public static Long getLong(Object o, long defaultValue) {
 +
 +  if (o == null) {
 +return defaultValue;
 +  }
 +
 +  if (o instanceof String) {
 +return Long.valueOf(String.valueOf(o));
 +  } else if (o instanceof Integer) {
 +Integer value = (Integer) o;
 +return Long.valueOf((Integer) value);
 +  } else if (o instanceof Long) {
 +return (Long) o;
 +  } else {
 +return defaultValue;
 +  }
 +}
  
  public static boolean getBoolean(Object o, boolean defaultValue) {
if (null == o) {
 {code}
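
The getLong helper in the patch above falls back to the default value for any type other than String, Integer, or Long, whereas the line it replaces accepted any Number. A minimal sketch of a variant that keeps the Number behaviour is shown here; it is not part of the proposed patch, and the class name is hypothetical.

```java
// Hypothetical sketch, not part of the patch: converts a config value to a
// long, accepting any Number and rejecting unsupported types explicitly.
class ConfigLongSketch {
    static long getLong(Object o, long defaultValue) {
        if (o == null) {
            return defaultValue;              // missing setting: use the default
        }
        if (o instanceof Number) {
            return ((Number) o).longValue();  // Integer, Long, Short, Double, ...
        }
        if (o instanceof String) {
            return Long.parseLong(((String) o).trim());
        }
        // Fail loudly rather than silently returning the default.
        throw new IllegalArgumentException(
            "Don't know how to convert " + o + " to long");
    }

    public static void main(String[] args) {
        System.out.println(getLong(null, 500L));
        System.out.println(getLong(7, 500L));
        System.out.println(getLong("42", 500L));
    }
}
```

Whether an unsupported type should throw or fall back to the default is a design choice; throwing is assumed here because a silent fallback can hide a misconfigured topology.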





RE: Could JStorm project collaborate with Storm?

2014-10-24 Thread Lian, Li
Dear all,

This seems like an interesting discussion, and I just want to give my 2 cents here.

1. I disagree with this statement 'According to the current development 
situation in China, java core is indeed to bring more contributors here'. At 
the recent QCon in Shanghai, I saw folks from local Chinese internet companies 
who had contributed quite a lot to the Storm project without requiring it to 
have a Java core. Also, technically, with a minimal core in Clojure and the 
right interfaces in place, it's easy to extend Storm's features and functions 
with Java.

2. If the proposal to replace Storm with JStorm is accepted, then the 
community will spend a lot of time migrating, and Storm users worldwide will 
need to do a lot of regression testing to verify that there are no large 
issues, no matter how good the unit testing is. The impact is too large for 
the Storm user community to afford.

3. I firmly believe that the combination of some Java with rising 
JVM-compatible languages like Clojure and Scala is the future for complex 
distributed application development. There are quite a few successful examples 
of new open source projects, like Kafka and Spark, both developed in Scala and 
exposing APIs and extensions in Java. If we are going to replace all non-Java 
projects with a Java version, will it be worth our precious time?

4. I understand that Alibaba has certain special requirements and some 
implementations that satisfy specific needs such as shared cluster resource 
management. As JStorm already implements these requirements, I suggest that 
folks from Alibaba open a good line of communication with the Storm committers 
and ask them to update just the extension points of Storm, to enable specific 
extensions without replacing the core entirely.

Lex
eBay

-Original Message-
From: 刘键(夏�G) [mailto:basti...@alibaba-inc.com] 
Sent: October 24, 2014 11:20
To: af...@yahoo-inc.com; nat...@nathanmarz.com; 徐明明(护城); ja...@cvk.ca; 
f...@infochimps.com; david...@microsoft.com; ptgo...@gmail.com; 
cutt...@apache.org; tdunn...@maprtech.com; arv...@apache.org; 
d...@hortonworks.com; m.ben.frank...@gmail.com; benjamin.hind...@gmail.com; 
封仲淹(纪君祥)
Cc: aloha-dev; scott.z...@vipshop.com; yannia...@tencent.com; 
mlc...@iflytek.com; z...@tencent.com; 陈昱(叔至); 姬风; 周小帆(承嗣); dev@storm.apache.org
Subject: Re: Could JStorm project collaborate with Storm?

Hi Andy,

 

Thanks for your quick response. Your concern is reasonable and 
understandable. 

 

But maybe there is some misunderstanding on our proposal. According to the 
current development situation in China, java core is indeed to bring more 
contributors here. 

That is the reason we began to consider whether this could also be a way for 
us to contribute to the Storm project. It is true that switching the core in a 
short period is risky. 

Actually, if possible, maybe we can consider making JStorm a subproject of 
Storm (a branch or some other arrangement). We will be responsible for 
maintaining it, will try to propagate features from Storm to JStorm, and will 
continue to develop our own features and improvements on it. You can try the 
new core and take as long as you need to evaluate whether it is worth 
switching in the future, or offering JStorm to users as an official optional 
core of Storm.

 

Regards

Basti

 

From: Andrew Feng [mailto:af...@yahoo-inc.com]
Sent: October 24, 2014 0:24
To: 封仲淹(纪君祥); nat...@nathanmarz.com; 徐明明(护城); ja...@cvk.ca; 
f...@infochimps.com; david...@microsoft.com; ptgo...@gmail.com; 
cutt...@apache.org; tdunn...@maprtech.com; arv...@apache.org; 
d...@hortonworks.com; m.ben.frank...@gmail.com; benjamin.hind...@gmail.com
Cc: aloha-dev; scott.z...@vipshop.com; yannia...@tencent.com; 
mlc...@iflytek.com; z...@tencent.com
Subject: Re: RE: Could JStorm project collaborate with Storm?

 

Zhongyan:

 

Yahoo almost took your path 2 years ago. After some discussion with Nathan, we 
decided to work with the community on Storm, and some of us learned Clojure 
quickly. We are very glad that we did that.

 

Under the assumption that Storm provides all features of JStorm, my personal 
vote will be NO for your proposal. 

*   Storm has enjoyed its current success because Nathan built a very solid
core in Clojure. We should not replace that core until we are 100% sure
that the alternative implementation is at least as good as our Clojure
implementation.
*   Clojure has not prevented Storm from attracting contributors (currently
108). We have contributors who write code in Clojure and Java. It doesn’t take
much time to become able to understand Clojure code.
*   The convergence of stream processing and batch processing will occur at
higher levels of abstraction. The Trident API, for example, is very
similar to batch APIs such as Pig or Cascading.
*   Including JStorm under Storm will only create confusion in our user
community at this stage.

 

Why don’t you have a discussion with 徐明明 to figure out an alternative path 
for JStorm?

 

Thanks,

 

Andy