[jira] [Updated] (STORM-2842) Fixed links for YARN&Kubernetes Integration

2017-12-04 Thread Xin Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin Wang updated STORM-2842:

Priority: Minor  (was: Major)

> Fixed links for YARN&Kubernetes Integration
> ---
>
> Key: STORM-2842
> URL: https://issues.apache.org/jira/browse/STORM-2842
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Xin Wang
>Assignee: Xin Wang
>Priority: Minor
>






[jira] [Created] (STORM-2842) Fixed links for YARN&Kubernetes Integration

2017-12-04 Thread Xin Wang (JIRA)
Xin Wang created STORM-2842:
---

 Summary: Fixed links for YARN&Kubernetes Integration
 Key: STORM-2842
 URL: https://issues.apache.org/jira/browse/STORM-2842
 Project: Apache Storm
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.0
Reporter: Xin Wang
Assignee: Xin Wang








[jira] [Commented] (STORM-2666) Storm-kafka-client spout can sometimes emit messages that were already committed.

2017-12-04 Thread Stig Rohde Døssing (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16278152#comment-16278152
 ] 

Stig Rohde Døssing commented on STORM-2666:
---

[~hmclouro] I think you are right and the example is wrong: partitions can't be 
reassigned before committing. It was still possible to produce the bad state, 
though. Here's a modified sequence without the "_Reassign partitions before 
commit happens. Keep partition 0 assigned to this spout._" step:

Say there are partitions 0-10, this spout is assigned partition 0, and it has 
currently committed up to offset 0.

1. Emit offsets 0-100.
2. Ack offsets 0-100, except offset 50.
3. Commit and reassign partitions; partition 0 stays assigned to this spout. 
Offsets 0-49 are committed (i.e. we call commitSync with offset 50, so the 
consumer will restart there).
4. The reassignment logic does not remove the offset manager for partition 0, 
so offsets 51-100 are still acked.
5. The reassignment logic seeks the consumer back to the committed offset for 
all assigned partitions, including partition 0. The consumer position for 
partition 0 is now 50.
6. Ack offset 50.
7. nextTuple is called, and offsets 50-100 are committed.
8. Offsets 50-100 are emitted again, because the consumer position was 50. 
Since 50-100 were committed, they are no longer considered emitted/acked, so 
the spout will emit them. The spout is now in a bad state, and when 50-100 are 
acked, the offset manager will complain in the log.
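
To make steps 3 and 5 concrete, here is a minimal sketch of the 
commit-then-seek interaction using the plain Kafka consumer API (this is an 
illustration, not the actual KafkaSpout code; the broker address and topic 
name are placeholders):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SeekBackIllustration {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "illustration");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        TopicPartition tp = new TopicPartition("some-topic", 0); // placeholder topic

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(tp));

            // Step 3: offsets 0-49 are done, so commit offset 50 (the next offset to read).
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(50)));

            // Step 5: the reassignment logic seeks every assigned partition back to its
            // committed offset, so the consumer position becomes 50 even though offsets
            // 51-100 are still tracked as acked by the spout's offset manager.
            consumer.seek(tp, consumer.committed(tp).offset());

            // The next poll() therefore returns offsets 50-100 again, and the spout
            // re-emits them because committing 50-100 cleared its emitted/acked state.
        }
    }
}
{code}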

> Storm-kafka-client spout can sometimes emit messages that were already 
> committed. 
> --
>
> Key: STORM-2666
> URL: https://issues.apache.org/jira/browse/STORM-2666
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-kafka-client
>Affects Versions: 1.0.0, 2.0.0, 1.1.0, 1.1.1, 1.2.0
>Reporter: Guang Du
>Assignee: Stig Rohde Døssing
>  Labels: pull-request-available
> Fix For: 2.0.0, 1.2.0, 1.1.2
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Under heavy load, the retry service will ack failed/timed-out tuples once they 
> have failed the maximum number of times, and the Kafka client spout will commit 
> once the commit interval is reached. However, it seems some in-flight tuples 
> fail again, the retry service causes the spout to emit them again, and they are 
> eventually acked to the OffsetManager.
> In some cases there are too many such offsets, exceeding max-uncommitted-offsets 
> and leaving org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset 
> unable to find the next commit point, so the spout will not poll that partition 
> any more.
> By the way, I've applied the STORM-2549 PR#2156 from Stig Døssing to fix 
> STORM-2625, and I'm using a Python shell bolt as the processing bolt, in case 
> this information helps.
> The resulting logs are below. I'm not sure if the issue has already been 
> raised/fixed; I'd be glad if anyone could point out an existing JIRA. Thank you.
> 2017-07-27 22:23:48.398 o.a.s.k.s.KafkaSpout Thread-23-spout-executor[248 
> 248] [INFO] Successful ack for tuple message 
> [{topic-partition=kafka_bd_trigger_action-20, offset=18204, numFails=0}].
> 2017-07-27 22:23:49.203 o.a.s.k.s.i.OffsetManager 
> Thread-23-spout-executor[248 248] [WARN] topic-partition 
> [kafka_bd_trigger_action-18] has unexpected offset [16002]. Current committed 
> Offset [16003]
> Edit:
> See 
> https://issues.apache.org/jira/browse/STORM-2666?focusedCommentId=16125893&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16125893
>  for the current best guess at the root cause of this issue.
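
The failure mode in the quoted description hinges on 
OffsetManager#findNextCommitOffset refusing to advance past a gap of unacked 
offsets. A simplified stand-in for that lookup (illustrative only, not Storm's 
actual implementation):

{code:java}
import java.util.NavigableSet;
import java.util.TreeSet;

public class FindNextCommitOffsetSketch {
    // Starting from the committed offset, walk forward while acked offsets are
    // contiguous. A single missing (still in-flight or retrying) offset stops the
    // walk, so nothing past it can be committed, and the backlog of uncommitted
    // offsets can grow until max-uncommitted-offsets halts polling.
    static long findNextCommitOffset(long committed, NavigableSet<Long> acked) {
        long next = committed;
        while (acked.contains(next)) {
            next++;
        }
        return next; // first offset NOT yet acked; everything before it can commit
    }

    public static void main(String[] args) {
        NavigableSet<Long> acked = new TreeSet<>();
        for (long o = 0; o <= 100; o++) {
            if (o != 50) {
                acked.add(o); // offset 50 failed and is still being retried
            }
        }
        // Only offsets 0-49 can be committed; 51-100 stay stuck behind the gap.
        System.out.println(findNextCommitOffset(0, acked)); // prints 50
    }
}
{code}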





[jira] [Resolved] (STORM-2834) getOwnerResourceSummaries not working properly because scheduler is wrapped as BlacklistScheduler

2017-12-04 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-2834.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~ethanli], I merged into master.

> getOwnerResourceSummaries not working properly because scheduler is wrapped 
> as BlacklistScheduler
> -
>
> Key: STORM-2834
> URL: https://issues.apache.org/jira/browse/STORM-2834
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java#L4101
> {code:java}
> if (clusterSchedulerConfig.containsKey(theOwner)) {
>     if (scheduler instanceof ResourceAwareScheduler) {
>         Map schedulerConfig = (Map) clusterSchedulerConfig.get(theOwner);
>         if (schedulerConfig != null) {
>             ownerResourceSummary.set_memory_guarantee((double) schedulerConfig.getOrDefault("memory", 0));
>             ownerResourceSummary.set_cpu_guarantee((double) schedulerConfig.getOrDefault("cpu", 0));
>             ownerResourceSummary.set_memory_guarantee_remaining(
>                 ownerResourceSummary.get_memory_guarantee() - ownerResourceSummary.get_memory_usage());
>             ownerResourceSummary.set_cpu_guarantee_remaining(
>                 ownerResourceSummary.get_cpu_guarantee() - ownerResourceSummary.get_cpu_usage());
>         }
>     } else if (scheduler instanceof MultitenantScheduler) {
>         ownerResourceSummary.set_isolated_node_guarantee(
>             (int) clusterSchedulerConfig.getOrDefault(theOwner, 0));
>     }
> }
> {code}
> Because the scheduler is wrapped as a BlacklistScheduler 
> (https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java#L474),
> these two instanceof checks will never be true.
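
To see why the wrapping defeats the checks, here is a self-contained 
illustration with made-up class names (these are not Storm's actual classes, 
and the unwrap accessor is just one possible shape of a fix):

{code:java}
interface IScheduler {
    void schedule();
}

class ResourceAwareSchedulerExample implements IScheduler {
    public void schedule() { /* resource-aware scheduling */ }
}

// A delegating wrapper, analogous to BlacklistScheduler wrapping the real scheduler.
class BlacklistSchedulerExample implements IScheduler {
    private final IScheduler underlying;

    BlacklistSchedulerExample(IScheduler underlying) {
        this.underlying = underlying;
    }

    public void schedule() {
        underlying.schedule();
    }

    // One possible fix: expose the wrapped scheduler so callers can inspect its type.
    IScheduler getUnderlyingScheduler() {
        return underlying;
    }
}

public class InstanceofWrapperDemo {
    public static void main(String[] args) {
        IScheduler scheduler = new BlacklistSchedulerExample(new ResourceAwareSchedulerExample());

        // This is the shape of the check in getOwnerResourceSummaries: it sees
        // only the wrapper type, so it prints false.
        System.out.println(scheduler instanceof ResourceAwareSchedulerExample);

        // Checking the unwrapped scheduler gives the intended answer: true.
        IScheduler unwrapped = ((BlacklistSchedulerExample) scheduler).getUnderlyingScheduler();
        System.out.println(unwrapped instanceof ResourceAwareSchedulerExample);
    }
}
{code}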





[jira] [Commented] (STORM-2666) Storm-kafka-client spout can sometimes emit messages that were already committed.

2017-12-04 Thread Hugo Louro (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16277904#comment-16277904
 ] 

Hugo Louro commented on STORM-2666:
---

[~Srdo] [~GuangDu] Can you please clarify the comment made at 14/Aug/17 16:10? 
I think I have found another bug related to this.

In the presence of this [piece of 
code|https://github.com/apache/storm/blob/1.x-branch/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L158],
how is it possible for "_Reassign partitions before commit happens. Keep 
partition 0 assigned to this spout._" to happen?
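
For context: the linked line sits in KafkaSpout's consumer rebalance listener. 
The usual shape of such a listener commits ready offsets before partitions are 
revoked, which is why the reassign-before-commit ordering in the original 
example looks impossible. A rough sketch of the general pattern using the plain 
Kafka API (illustrative only, not the actual KafkaSpout listener):

{code:java}
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommitOnRevokeListener implements ConsumerRebalanceListener {
    private final KafkaConsumer<?, ?> consumer;
    // Offsets that are ready to commit, maintained elsewhere by the spout.
    private final Map<TopicPartition, OffsetAndMetadata> readyToCommit = new HashMap<>();

    CommitOnRevokeListener(KafkaConsumer<?, ?> consumer) {
        this.consumer = consumer;
    }

    // Runs before partitions are taken away, so the commit cannot be overtaken
    // by the reassignment itself.
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        consumer.commitSync(readyToCommit);
        readyToCommit.clear();
    }

    // Runs after reassignment; seeking back to the committed offset here is
    // step 5 of the sequence described in the earlier comment.
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        for (TopicPartition tp : partitions) {
            OffsetAndMetadata committed = consumer.committed(tp);
            consumer.seek(tp, committed == null ? 0L : committed.offset());
        }
    }
}
{code}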



> Storm-kafka-client spout can sometimes emit messages that were already 
> committed. 
> --
>
> Key: STORM-2666
> URL: https://issues.apache.org/jira/browse/STORM-2666
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-kafka-client
>Affects Versions: 1.0.0, 2.0.0, 1.1.0, 1.1.1, 1.2.0
>Reporter: Guang Du
>Assignee: Stig Rohde Døssing
>  Labels: pull-request-available
> Fix For: 2.0.0, 1.2.0, 1.1.2
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Under heavy load, the retry service will ack failed/timed-out tuples once they 
> have failed the maximum number of times, and the Kafka client spout will commit 
> once the commit interval is reached. However, it seems some in-flight tuples 
> fail again, the retry service causes the spout to emit them again, and they are 
> eventually acked to the OffsetManager.
> In some cases there are too many such offsets, exceeding max-uncommitted-offsets 
> and leaving org.apache.storm.kafka.spout.internal.OffsetManager#findNextCommitOffset 
> unable to find the next commit point, so the spout will not poll that partition 
> any more.
> By the way, I've applied the STORM-2549 PR#2156 from Stig Døssing to fix 
> STORM-2625, and I'm using a Python shell bolt as the processing bolt, in case 
> this information helps.
> The resulting logs are below. I'm not sure if the issue has already been 
> raised/fixed; I'd be glad if anyone could point out an existing JIRA. Thank you.
> 2017-07-27 22:23:48.398 o.a.s.k.s.KafkaSpout Thread-23-spout-executor[248 
> 248] [INFO] Successful ack for tuple message 
> [{topic-partition=kafka_bd_trigger_action-20, offset=18204, numFails=0}].
> 2017-07-27 22:23:49.203 o.a.s.k.s.i.OffsetManager 
> Thread-23-spout-executor[248 248] [WARN] topic-partition 
> [kafka_bd_trigger_action-18] has unexpected offset [16002]. Current committed 
> Offset [16003]
> Edit:
> See 
> https://issues.apache.org/jira/browse/STORM-2666?focusedCommentId=16125893&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16125893
>  for the current best guess at the root cause of this issue.





[jira] [Assigned] (STORM-2406) [Storm SQL] Change underlying API to Streams API (for 2.0.0)

2017-12-04 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned STORM-2406:
---

Assignee: Jungtaek Lim

> [Storm SQL] Change underlying API to Streams API (for 2.0.0)
> 
>
> Key: STORM-2406
> URL: https://issues.apache.org/jira/browse/STORM-2406
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-sql
>Affects Versions: 2.0.0
>Reporter: Jungtaek Lim
>Assignee: Jungtaek Lim
>
> Since we dropped the features that conform to Trident semantics, Storm SQL no 
> longer needs to rely on Trident, which is micro-batch based.
> Both the core API and the Streams API are candidates, but we would have to 
> implement some bolts if we decided to rely on the core API, whereas we don't 
> need to for the Streams API. (If we do need to, that's a point where the 
> Streams API can be improved.)
> The Streams API also provides a windowing feature with tuple-at-a-time 
> semantics, so it's ready for STORM-2405 too.
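
For reference, a topology on the Streams API looks roughly like the sketch 
below (based on the Streams API in master at the time; class and method names 
are best-effort and may differ):

{code:java}
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.streams.StreamBuilder;
import org.apache.storm.streams.windowing.TumblingWindows;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.base.BaseWindowedBolt.Count;

public class StreamsApiSketch {
    public static void main(String[] args) throws Exception {
        StreamBuilder builder = new StreamBuilder();

        // Tuple-at-a-time processing straight from a spout; no Trident micro-batches.
        builder.newStream(new TestWordSpout())
               .map(t -> t.getString(0))
               .window(TumblingWindows.of(Count.of(100))) // windowing support, cf. STORM-2405
               .print();

        try (LocalCluster cluster = new LocalCluster()) {
            cluster.submitTopology("streams-sketch", new Config(), builder.build());
            Thread.sleep(10_000); // let the local topology run briefly
        }
    }
}
{code}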


