[jira] [Updated] (STORM-2712) accept arbitrary number of rows per tuple in storm-cassandra

2017-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2712:
--
Labels: pull-request-available  (was: )

> accept arbitrary number of rows per tuple in storm-cassandra
> 
>
> Key: STORM-2712
> URL: https://issues.apache.org/jira/browse/STORM-2712
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-cassandra
>Affects Versions: 2.0.0, 1.1.1
>Reporter: Yohei Kishimoto
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current implementation of `TridentResultSetValuesMapper::map` restricts a 
> SELECT query to returning exactly one row. `StateQueryProcessor::finishBatch` 
> checks that the result size of `batchRetrieve` equals the input tuple size, 
> so whenever a query returns zero rows or more than one row the check fails 
> and an exception is thrown.
> We should accept an arbitrary number of rows per tuple by adjusting the List 
> dimensions.
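As a rough illustration of the proposed fix (hypothetical names, not the actual storm-cassandra API): keep one outer-list entry per input tuple, where each entry holds all rows returned for that tuple, so the outer size always matches the input tuple count even when a query returns zero rows or many.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pair each input tuple with a (possibly empty) list of
// result rows instead of exactly one row, so the size check in finishBatch
// (results.size() == tuples.size()) always holds.
public class RowsPerTupleSketch {
    public static List<List<String>> groupRowsPerTuple(List<String> tuples,
                                                       List<List<String>> rowsPerQuery) {
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < tuples.size(); i++) {
            // A query returning 0 or >1 rows still yields exactly one entry.
            result.add(rowsPerQuery.get(i));
        }
        return result;
    }
}
```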



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (STORM-2743) Add logging to monitor how long scheduling is taking

2017-09-19 Thread Ethan Li (JIRA)
Ethan Li created STORM-2743:
---

 Summary: Add logging to monitor how long scheduling is taking
 Key: STORM-2743
 URL: https://issues.apache.org/jira/browse/STORM-2743
 Project: Apache Storm
  Issue Type: Improvement
  Components: storm-server
Reporter: Ethan Li
Assignee: Ethan Li
Priority: Trivial


Add logging to monitor how long scheduling is taking
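A minimal sketch of the kind of timing log this issue asks for (the method name and wiring are illustrative, not the actual Nimbus scheduling code):

```java
// Hypothetical sketch: wrap a scheduling run, measure its duration, and log it.
public class SchedulingTimer {
    public static long timeMillis(Runnable schedule) {
        long start = System.nanoTime();
        schedule.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // In Nimbus this would go through SLF4J, e.g.:
        // LOG.info("Scheduling took {} ms", elapsedMs);
        return elapsedMs;
    }
}
```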





[jira] [Updated] (STORM-2743) Add logging to monitor how long scheduling is taking

2017-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2743:
--
Labels: pull-request-available  (was: )

> Add logging to monitor how long scheduling is taking
> 
>
> Key: STORM-2743
> URL: https://issues.apache.org/jira/browse/STORM-2743
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-server
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Trivial
>  Labels: pull-request-available
>
> Add logging to monitor how long scheduling is taking





[jira] [Created] (STORM-2744) Add in "restart timeout" for backpressure

2017-09-19 Thread Ethan Li (JIRA)
Ethan Li created STORM-2744:
---

 Summary: Add in "restart timeout" for backpressure
 Key: STORM-2744
 URL: https://issues.apache.org/jira/browse/STORM-2744
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Ethan Li
Assignee: Ethan Li
Priority: Minor


Instead of stopping indefinitely, we want to add a timeout value to the 
backpressure mechanism so that spouts won't get stuck if bolts fail to switch 
backpressure back off.
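A sketch of what such a "restart timeout" check could look like (the names and shape are illustrative assumptions, not the real Storm backpressure code): once backpressure has been engaged longer than the timeout, the spout resumes emitting anyway.

```java
// Hypothetical sketch of the proposed restart timeout for backpressure.
public class BackpressureTimeoutSketch {
    public static boolean shouldEmit(boolean backpressureOn,
                                     long engagedSinceMs,
                                     long nowMs,
                                     long restartTimeoutMs) {
        if (!backpressureOn) {
            return true; // not throttled: keep emitting
        }
        // Resume after the timeout so a bolt that never clears its
        // backpressure flag cannot stall the spout indefinitely.
        return nowMs - engagedSinceMs >= restartTimeoutMs;
    }
}
```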





[jira] [Resolved] (STORM-2722) JMSSpout test fails way too often

2017-09-19 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved STORM-2722.

   Resolution: Fixed
Fix Version/s: 2.0.0

Checked this into master

> JMSSpout test fails way too often
> -
>
> Key: STORM-2722
> URL: https://issues.apache.org/jira/browse/STORM-2722
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-jms
>Affects Versions: 2.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:92)
>   at org.junit.Assert.assertTrue(Assert.java:43)
>   at org.junit.Assert.assertTrue(Assert.java:54)
>   at 
> org.apache.storm.jms.spout.JmsSpoutTest.testFailure(JmsSpoutTest.java:62)
> {code}
> Which corresponds to 
> https://github.com/apache/storm/blob/d6e5e6d4e0a20c4c9f0ce0e3000e730dcb4700da/external/storm-jms/src/test/java/org/apache/storm/jms/spout/JmsSpoutTest.java?utf8=%E2%9C%93#L62





[jira] [Created] (STORM-2745) Hdfs Open Files problem

2017-09-19 Thread Shoeb (JIRA)
Shoeb created STORM-2745:


 Summary: Hdfs Open Files problem
 Key: STORM-2745
 URL: https://issues.apache.org/jira/browse/STORM-2745
 Project: Apache Storm
  Issue Type: New Feature
  Components: storm-hdfs
Affects Versions: 2.0.0, 1.x
Reporter: Shoeb
 Fix For: 2.0.0, 1.x


Issue:

The problem occurs when there are multiple HDFS writers in the writersMap. 
Each writer keeps an open HDFS handle to its file. For an inactive writer 
(i.e. one that has not consumed any data for a long period), the file is 
never closed and always remains in the open state.

Ideally, these files should be closed and the HDFS writers removed from the 
writersMap.

Solution:
Implement a ClosingFilesPolicy based on tick tuple intervals: at each tick 
tuple, check all writers and close any that have existed for a long time.
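A minimal sketch of such a tick-driven policy (the `Writer` interface and map layout are hypothetical stand-ins for AbstractHDFSWriter and the bolt's writersMap): on each tick tuple, close and evict writers idle longer than a threshold.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch of a ClosingFilesPolicy driven by tick tuples.
public class ClosingFilesPolicySketch {
    public interface Writer { void close(); }

    // Returns how many idle writers were closed and removed.
    public static int closeIdleWriters(Map<String, Writer> writersMap,
                                       Map<String, Long> lastUsedMs,
                                       long nowMs, long idleTimeoutMs) {
        int closed = 0;
        Iterator<Map.Entry<String, Writer>> it = writersMap.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Writer> e = it.next();
            if (nowMs - lastUsedMs.getOrDefault(e.getKey(), nowMs) >= idleTimeoutMs) {
                e.getValue().close(); // release the open HDFS handle
                it.remove();          // and drop the writer from the map
                closed++;
            }
        }
        return closed;
    }
}
```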





[jira] [Resolved] (STORM-2742) Logviewer leaking file descriptors

2017-09-19 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved STORM-2742.

   Resolution: Fixed
Fix Version/s: 2.0.0

I merged this into master

> Logviewer leaking file descriptors
> --
>
> Key: STORM-2742
> URL: https://issues.apache.org/jira/browse/STORM-2742
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Kyle Nusbaum
>Assignee: Kyle Nusbaum
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The logviewer leaks file descriptors from the search module.





[jira] [Updated] (STORM-2745) Hdfs Open Files problem

2017-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2745:
--
Labels: features pull-request-available starter  (was: features starter)

> Hdfs Open Files problem
> ---
>
> Key: STORM-2745
> URL: https://issues.apache.org/jira/browse/STORM-2745
> Project: Apache Storm
>  Issue Type: New Feature
>  Components: storm-hdfs
>Affects Versions: 2.0.0, 1.x
>Reporter: Shoeb
>  Labels: features, pull-request-available, starter
> Fix For: 2.0.0, 1.x
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Issue:
> The problem occurs when there are multiple HDFS writers in the writersMap. 
> Each writer keeps an open HDFS handle to its file. For an inactive writer 
> (i.e. one that has not consumed any data for a long period), the file is 
> never closed and always remains in the open state.
> Ideally, these files should be closed and the HDFS writers removed from the 
> writersMap.
> Solution:
> Implement a ClosingFilesPolicy based on tick tuple intervals: at each tick 
> tuple, check all writers and close any that have existed for a long time.





[jira] [Updated] (STORM-2713) when the connection to the first zkserver times out, storm-kafka's kafkaspout will throw an exception

2017-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2713:
--
Labels: pull-request-available  (was: )

> when the connection to the first zkserver times out, storm-kafka's kafkaspout 
> will throw an exception
> 
>
> Key: STORM-2713
> URL: https://issues.apache.org/jira/browse/STORM-2713
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: liuzhaokun
>Assignee: liuzhaokun
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When the connection to the first ZooKeeper server times out, storm-kafka's 
> KafkaSpout throws an exception without attempting to connect to the other 
> ZooKeeper servers, even though ZooKeeper can still work with one node down.





[jira] [Updated] (STORM-2731) Simple checks in Storm Windowing

2017-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2731:
--
Labels: pull-request-available  (was: )

> Simple checks in Storm Windowing
> 
>
> Key: STORM-2731
> URL: https://issues.apache.org/jira/browse/STORM-2731
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Boyang Jerry Peng
>Assignee: Boyang Jerry Peng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (STORM-2738) The number of ackers should default to the number of actual running workers on RAS cluster

2017-09-19 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved STORM-2738.

   Resolution: Fixed
Fix Version/s: 2.0.0

Thanks [~ethanli],

I merged this into master

> The number of ackers should default to the number of actual running workers 
> on RAS cluster
> --
>
> Key: STORM-2738
> URL: https://issues.apache.org/jira/browse/STORM-2738
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
> Attachments: Screen Shot 2017-09-13 at 11.13.41 AM.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> *Problem*:
> If topology.acker.executors is not set, the number of ackers will be equal 
> to topology.workers. But on a RAS cluster we don't set topology.workers, 
> because the number of workers is determined by the scheduler. So in this 
> case the number of ackers will always be 1 (see the attached screenshot).
> *Analysis*:
> The number of ackers has to be computed before scheduling happens, so it 
> knows how to schedule the topology. The number of workers is not set until 
> the topology is scheduled, so it is a bit of a chicken and egg problem.
> *Solution*:
> We could probably use the total amount of requested memory when the topology 
> is submitted divided by the memory per worker to get an estimate that is 
> better than 1.
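The proposed estimate can be sketched as follows (a minimal illustration of the arithmetic, not the actual Storm code; names are hypothetical):

```java
// Hypothetical sketch: derive a worker-count estimate (and hence a default
// acker count) from total requested memory divided by per-worker memory.
public class AckerEstimateSketch {
    public static int estimateAckers(double totalRequestedMemMb,
                                     double memPerWorkerMb) {
        if (memPerWorkerMb <= 0) {
            return 1; // fall back to the old default of one acker
        }
        // Round up so any topology requesting memory gets at least one worker.
        return Math.max(1, (int) Math.ceil(totalRequestedMemMb / memPerWorkerMb));
    }
}
```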





[jira] [Created] (STORM-2746) Max Open Files does not close files for the oldest entry

2017-09-19 Thread Shoeb (JIRA)
Shoeb created STORM-2746:


 Summary: Max Open Files does not close files for the oldest entry
 Key: STORM-2746
 URL: https://issues.apache.org/jira/browse/STORM-2746
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 2.0.0, 1.x
Reporter: Shoeb
Priority: Minor


Description:

AbstractHDFSBolt has a WritersMap. It evicts the least recently used 
AbstractHDFSWriter from the writers map; however, it does not close the file 
left open by the evicted (oldest) entry.

Steps to reproduce the error: 

1) Use the new max open files feature and set the value to 1.
2) Write data to two or three different files in HDFS using AvroBolt.
3) Check the output directory using fsck in HDFS.
   
Expected: only one file open in the output directory.
Actual: more than one file in the open state.
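One possible shape of the fix, sketched with an access-ordered `LinkedHashMap` (hypothetical types, not the actual AbstractHdfsBolt code): close the evicted writer's file before dropping it, so eviction never leaves a file open.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU map that closes the evicted writer on eviction.
public class ClosingLruMap<K> extends LinkedHashMap<K, ClosingLruMap.Writer> {
    public interface Writer { void close(); }

    private final int maxOpenFiles;

    public ClosingLruMap(int maxOpenFiles) {
        super(16, 0.75f, true); // access order, i.e. LRU behavior
        this.maxOpenFiles = maxOpenFiles;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, Writer> eldest) {
        if (size() > maxOpenFiles) {
            eldest.getValue().close(); // close the file before eviction
            return true;
        }
        return false;
    }
}
```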





[jira] [Resolved] (STORM-2712) accept arbitrary number of rows per tuple in storm-cassandra

2017-09-19 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-2712.
-
   Resolution: Fixed
 Assignee: Yohei Kishimoto
Fix Version/s: 1.1.2
   1.2.0
   2.0.0

Thanks [~morokosi], I merged this into the master, 1.x, and 1.1.x branches.
Keep up the good work.

> accept arbitrary number of rows per tuple in storm-cassandra
> 
>
> Key: STORM-2712
> URL: https://issues.apache.org/jira/browse/STORM-2712
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-cassandra
>Affects Versions: 2.0.0, 1.1.1
>Reporter: Yohei Kishimoto
>Assignee: Yohei Kishimoto
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0, 1.2.0, 1.1.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current implementation of `TridentResultSetValuesMapper::map` restricts a 
> SELECT query to returning exactly one row. `StateQueryProcessor::finishBatch` 
> checks that the result size of `batchRetrieve` equals the input tuple size, 
> so whenever a query returns zero rows or more than one row the check fails 
> and an exception is thrown.
> We should accept an arbitrary number of rows per tuple by adjusting the List 
> dimensions.





[jira] [Resolved] (STORM-2534) Visualization API missing stats/instances for "system" components

2017-09-19 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved STORM-2534.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

PR for STORM-2533 covered this.

> Visualization API missing stats/instances for "system" components
> -
>
> Key: STORM-2534
> URL: https://issues.apache.org/jira/browse/STORM-2534
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0
>Reporter: Stephen Powis
>Assignee: Jungtaek Lim
> Fix For: 2.0.0
>
>
> The topology visualization api end point ( 
> /api/v1/topology/TOPOLOGY-ID/visualization ) does not return correct "stats" 
> values for "system" components __system and __acker.
> See the following example *correct* response for a spout or bolt within a 
> topology, shorten for brevity.  Under the stats key it lists all of the 
> instances of that component that is deployed.
> {code}
> {
>   "spout": {
>   ...
>   ":stats": [{
>   ":host": 
> "e54bb273-2a8a-4320-b23f-7c7ace52c961-10.153.0.30",
>   ":port": 6700,
>   ":uptime_secs": 0,
>   ":transferred": {
>   ...
>   }
>   }],
>   ...
>   },
> {code}
> See the following response for the __system and __acker components.  They do 
> *not* correctly list any entries under the stats key.
> {code}
> {
>   "__system": {
>   ":type": "spout",
>   ":capacity": 0,
>   ":latency": null,
>   ":transferred": null,
>   ":stats": [],
>   ":link": 
> "\/component.html?id=__system&topology_id=test-1-1495630798",
>   ":inputs": []
>   },
>   "__acker": {
>   ":type": "spout",
>   ":capacity": 0,
>   ":latency": null,
>   ":transferred": null,
>   ":stats": [],
>   ":link": 
> "\/component.html?id=__acker&topology_id=test-1-1495630798",
>   ":inputs": [...]
>   }
> }
> {code}


