[GitHub] storm pull request #2109: STORM-2506: Print mapping between Task ID and Kafk...

2017-05-11 Thread srishtyagrawal
GitHub user srishtyagrawal opened a pull request:

https://github.com/apache/storm/pull/2109

STORM-2506:  Print mapping between Task ID and Kafka Partitions

[Link to the ticket](https://issues.apache.org/jira/browse/STORM-2506)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srishtyagrawal/storm DATA-3766

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2109.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2109


commit 10e0b70d0b54e46af505c83870deae645b10557f
Author: Srishty Agrawal 
Date:   2017-03-29T19:35:57Z

DATA-3766:  Print mapping between Task ID and Kafka Partitions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-12 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw @revans2 @harshach Please review and let me know if any changes are 
required.


---


[GitHub] storm pull request #2109: STORM-2506: Print mapping between Task ID and Kafk...

2017-05-15 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2109#discussion_r116602280
  
--- Diff: 
external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java
 ---
@@ -139,8 +139,8 @@ public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
 
     @Override
     public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
-        LOG.info("Partitions reassignment. [consumer-group={}, consumer={}, topic-partitions={}]",
-                kafkaSpoutConfig.getConsumerGroupId(), kafkaConsumer, partitions);
+        LOG.info("Partitions reassignment. [task-ID={}, consumer-group={}, consumer={}, topic-partitions={}]",
--- End diff --

This is how I derived the various styles:
* `taskId`: [Task.java](https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/Task.java#L127) names the component ID `componentId`, so I used `taskId` as the variable name for the task ID.
* `Task-ID`: This style is only used in [the print statement](https://github.com/apache/storm/blob/master/external/storm-kafka/src/jvm/org/apache/storm/kafka/KafkaUtils.java#L290) and is consistent with the existing style.
* `task-ID`: Only used once, consistent with the other variable names in the [log statement here](https://github.com/apache/storm/blob/master/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L142). This can be renamed to `task-Id`.
* I used `taskID` as the variable name in the rest of the files because `taskId` is the name of a function in the same set of files.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-16 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw @vinodkc @revans2 @srdo 

I am encountering the following checkstyle error while building my code:
```

```

I figured this error is caused by the [AbbreviationAsWordInName setting in 
storm_checkstyle.xml](https://github.com/apache/storm/blob/7043dea8e487d55510c120ada39d18b5bd08451a/storm-buildtools/storm_checkstyle.xml#L197). 
The variable `taskID` has two consecutive capital letters, and this creates a 
warning (which is treated as a violation, so the build does not go through). 
[The default setting according to the checkstyle documentation is 
3](http://checkstyle.sourceforge.net/config_naming.html#AbbreviationAsWordInName). 
Is there a specific reason we are setting it to 1?
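
For reference, the relevant checkstyle module would look roughly like the sketch below. This is an illustrative reconstruction, assuming `allowedAbbreviationLength` is the property storm_checkstyle.xml overrides; with the checkstyle default of 3, a name like `taskID` (two consecutive capitals) would pass, while a value of 1 flags it:

```xml
<!-- Hedged sketch of an AbbreviationAsWordInName configuration.
     With allowedAbbreviationLength="1", taskID (two consecutive
     capital letters) is reported; the checkstyle default of 3
     would allow it. -->
<module name="AbbreviationAsWordInName">
    <property name="allowedAbbreviationLength" value="1"/>
</module>
```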


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-17 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw Thanks for pointing out the defaults in google_checks.xml. I am not 
sure what the reason is behind deviating from the checkstyle defaults.

@revans2 the variable `taskID` is used in the following three files:
1. **KafkaUtils.java:** This file already has a function named `taskId`. I 
can rename `taskID` to `taskid` or `task-id`, although the variables in this 
file otherwise follow [camel case](https://en.wikipedia.org/wiki/Camel_case) 
naming conventions.
2. **StaticCoordinator.java:** `taskID` can be changed to `taskId` in this 
file.
3. **ZkCoordinator.java:** uses a `taskId` function; the variable can be 
changed to `taskid` or `task-id`.

An alternative would be to increase the allowed abbreviation length in the 
checkstyle configuration.



---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-17 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw's suggestion of renaming the function `taskId()` to `taskPrefix()` 
seems reasonable to me. I can then rename the variable from `taskID` to 
`taskId`.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-17 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
Thanks @erikdw for clarifying that. I had assumed it was bad coding practice 
to use the same name for a method and a variable. Even `taskPrefix` is both 
a method name and a variable name in 
[KafkaUtils.java](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/external/storm-kafka/src/jvm/org/apache/storm/kafka/KafkaUtils.java#L281).
 


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-17 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw @srdo fixed the code according to your suggestions.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-18 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@erikdw thanks for pointing that out. Rebasing on top of the recent changes 
in master helped with the integration tests.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-18 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@srdo let me know if further changes are required.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-23 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@revans2 can you please merge this PR?

@srdo do we need task-ID values in the logs when printing the 
partition-to-task mapping for Trident? I can file another STORM ticket for that.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-24 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@srdo I was thinking of it from the perspective of consistency (between the 
Trident KafkaSpout and the normal KafkaSpout); I am not sure how useful it 
will be (I don't know much about Trident).


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-05-24 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
[Filed a new ticket to track this change for 
trident.](https://issues.apache.org/jira/browse/STORM-2530)


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-06-07 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@harshach can you please help with merging this PR? It has been sitting for 
2 weeks now.


---


[GitHub] storm issue #2109: STORM-2506: Print mapping between Task ID and Kafka Parti...

2017-06-20 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2109
  
@HeartSaVioR thanks for the approval. I have rebased the changes on top of 
the latest master branch. Can you please merge this PR?


---


[GitHub] storm pull request #2505: STORM-2877: Add an option to configure pagination ...

2018-01-05 Thread srishtyagrawal
GitHub user srishtyagrawal opened a pull request:

https://github.com/apache/storm/pull/2505

STORM-2877: Add an option to configure pagination in Storm UI

The current pagination default value for Storm UI is hard-coded to 20. 
Pagination was introduced in Storm 1.x. Having only 20 items in the list 
restricts searching through the page. It would be beneficial to have a 
configuration option, `ui.pagination`, which sets the pagination value when 
Storm UI loads. This option can be added to storm.yaml along with other 
configurations.

Changed the value of `ui.pagination` in `storm.yaml` to test the following 
cases:

| Value in `storm.yaml` | Result |
| --- | --- |
| No `ui.pagination` | Defaults to 20 entries |
| `“All”` | Throws an error, expects an integer |
| `-10` | List doesn't have any items in it |
| `30` | Populates list with 30 entries as default |
| `10` | Populates list with 10 entries as default |
| `-1` | Equivalent of "All" from the drop-down options |
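
A configuration like the one described above would sit in storm.yaml next to the other daemon options. The fragment below is an illustrative sketch only; the value shown is an example, not a recommended default:

```yaml
# Illustrative storm.yaml fragment: sets the default number of rows per
# paginated table in the Storm UI. Per the table above, -1 behaves like
# "All" in the drop-down.
ui.pagination: 30
```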

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srishtyagrawal/storm STORM-2877

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2505.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2505


commit 3b1bfdbdbe98a141e20f475c8bc08d562a7bda0a
Author: Srishty Agrawal 
Date:   2018-01-03T22:51:25Z

STORM-2877: Add an option to configure pagination in Storm UI




---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-05 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@revans2 


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-05 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@revans2 if I understand correctly, your concern is that every time a table 
is paginated on a given page, there will be a call to get the cluster 
configuration. I verified that for multiple paginated tables within the same 
page there is only 1 call to fetch the cluster configuration.

![Screenshot showing a single call to the cluster/configuration endpoint](https://user-images.githubusercontent.com/564198/34634933-04284538-f23e-11e7-865b-a473c2af50d7.png)

In the above picture, there are 2 tables which have pagination (because of 
the low minEntriesToShowPagination value) and there is just 1 call to the 
configuration (cluster/configuration) endpoint.

I noticed that `index.html` already has a getJSON call to fetch 
cluster/configuration for `Nimbus Configuration` table so I can make changes to 
not introduce another one for fetching the pagination value.
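
The single-fetch behavior described above can be sketched as a memoized lookup. This is a hypothetical stand-in for the real `$.getJSON("/api/v1/cluster/configuration", ...)` call, not the actual Storm UI code; `fetchConfig` and the response payload are illustrative:

```javascript
// Hypothetical sketch (not the actual Storm UI code) of caching the
// cluster configuration so that several paginated tables on one page
// trigger only a single request.
let fetchCount = 0;
let cachedConfig = null;

function fetchConfig() {
  // Stands in for the REST call; counts how many "requests" were made.
  fetchCount += 1;
  return { "ui.pagination": 20 }; // illustrative response payload
}

function getClusterConfig() {
  if (cachedConfig === null) {
    cachedConfig = fetchConfig(); // first caller pays for the fetch
  }
  return cachedConfig;            // later callers reuse the result
}

// Two tables each ask for the pagination value; only one fetch happens.
const pageLengthA = getClusterConfig()["ui.pagination"];
const pageLengthB = getClusterConfig()["ui.pagination"];
console.log(pageLengthA, pageLengthB, fetchCount);
```

The same idea applies to reusing the existing getJSON call in `index.html` rather than issuing a second request for the pagination value.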


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-08 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@HeartSaVioR, I agree that saving pagination for a user makes more sense. 
Currently, this information is saved per session for a user. [This is being 
done using stateSave 
functionality](https://datatables.net/examples/basic_init/state_save.html). 
Do we still want a cookie to store the pagination value?


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-08 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@HeartSaVioR and @revans2 thanks for your comments. I had not noticed that 
the pagination values are saved per topology. It was good to understand the 
motive behind storing these values temporarily rather than in a cookie. I will 
make the changes in index.html to eliminate the additional call and update the 
PR. 



---


[GitHub] storm pull request #2505: STORM-2877: Add an option to configure pagination ...

2018-01-09 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2505#discussion_r160570956
  
--- Diff: storm-server/src/main/java/org/apache/storm/DaemonConfig.java ---
@@ -307,6 +307,12 @@
 @isString
 public static final String UI_CENTRAL_LOGGING_URL = 
"ui.central.logging.url";
 
+/**
+ * Storm UI drop-down pagination value.
--- End diff --

Done!


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-09 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@revans2 I have made the changes you suggested. I have also built and tested 
the code against the values mentioned in the first comment (the 
functionality is the same as described in the table).


---


[GitHub] storm pull request #2505: STORM-2877: Add an option to configure pagination ...

2018-01-10 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2505#discussion_r160776471
  
--- Diff: storm-core/src/ui/public/index.html ---
@@ -106,92 +106,92 @@ Nimbus Configuration
 $.blockUI({ message: ' Loading 
summary...'});
 });
 $(document).ready(function() {
-$.extend( $.fn.dataTable.defaults, {
-  stateSave: true,
-  lengthMenu: [[20,40,60,100,-1], [20, 40, 60, 100, "All"]],
-  pageLength: 20
-});
-
-$.ajaxSetup({
-"error":function(jqXHR,textStatus,response) {
-var errorJson = jQuery.parseJSON(jqXHR.responseText);
-getStatic("/templates/json-error-template.html", 
function(template) {
-
$("#json-response-error").append(Mustache.render($(template).filter("#json-error-template").html(),errorJson));
-});
-}
-});
-var uiUser = $("#ui-user");
-var clusterSummary = $("#cluster-summary");
-var clusterResources = $("#cluster-resources");
-var nimbusSummary = $("#nimbus-summary");
-var ownerSummary = $("#owner-summary");
-var topologySummary = $("#topology-summary");
-var supervisorSummary = $("#supervisor-summary");
-var config = $("#nimbus-configuration");
-
-getStatic("/templates/index-page-template.html", 
function(indexTemplate) {
-
$.getJSON("/api/v1/cluster/summary",function(response,status,jqXHR) {
-getStatic("/templates/user-template.html", function(template) {
-
uiUser.append(Mustache.render($(template).filter("#user-template").html(),response));
-$('#ui-user [data-toggle="tooltip"]').tooltip()
-});
-
-
clusterSummary.append(Mustache.render($(indexTemplate).filter("#cluster-summary-template").html(),response));
-$('#cluster-summary [data-toggle="tooltip"]').tooltip();
-
-
clusterResources.append(Mustache.render($(indexTemplate).filter("#cluster-resources-template").html(),response));
-$('#cluster-resources [data-toggle="tooltip"]').tooltip();
+
$.getJSON("/api/v1/cluster/configuration",function(response,status,jqXHR) {
--- End diff --

Done!


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-10 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
The nit has been addressed and tested.
@revans2, @HeartSaVioR I have also squashed the commits into 1.


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-12 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
@revans2 @HeartSaVioR can you merge this PR?


---


[GitHub] storm issue #2505: STORM-2877: Add an option to configure pagination in Stor...

2018-01-17 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2505
  
Thanks @revans2, @HeartSaVioR and whoever merged this PR! I will create 
backport PRs soon.


---


[GitHub] storm pull request #2535: STORM-2877: Add an option to configure pagination ...

2018-01-25 Thread srishtyagrawal
GitHub user srishtyagrawal opened a pull request:

https://github.com/apache/storm/pull/2535

STORM-2877: Add an option to configure pagination in Storm UI

Backporting this Storm UI change which has already been merged in master. 

JIRA ticket: https://issues.apache.org/jira/browse/STORM-2877
Merged PR in master branch: https://github.com/apache/storm/pull/2505

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srishtyagrawal/storm STORM-2877-1.0.x

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2535.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2535


commit 86e28ad5c21666a0da3d7ad38c7bda75bc673062
Author: Srishty Agrawal 
Date:   2018-01-25T20:20:17Z

STORM-2877: Add an option to configure pagination in Storm UI




---


[GitHub] storm pull request #2536: STORM-2877: Backport for Storm-1.x-branch

2018-01-25 Thread srishtyagrawal
GitHub user srishtyagrawal opened a pull request:

https://github.com/apache/storm/pull/2536

STORM-2877: Backport for Storm-1.x-branch

Backporting this Storm UI change which has already been merged in master.

JIRA ticket: https://issues.apache.org/jira/browse/STORM-2877
Merged PR in master branch: #2505

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srishtyagrawal/storm STORM-2877-1.x

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2536.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2536


commit 075f9e1666eac72534ba7fdf4fff718fd52b4c3d
Author: Srishty Agrawal 
Date:   2018-01-25T20:45:11Z

STORM-2877: Add an option to configure pagination in Storm UI




---


[GitHub] storm issue #2536: STORM-2877: Backport for 1.x-branch

2018-02-01 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2536
  
Thanks @HeartSaVioR!


---


[GitHub] storm issue #2535: STORM-2877: Backport for 1.0.x-branch

2018-02-01 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2535
  
Definitely @HeartSaVioR! Thanks for merging this one.


---


[GitHub] storm issue #2535: STORM-2877: Backport for 1.0.x-branch

2018-02-01 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2535
  
@HeartSaVioR I see STORM-2877 in the [commit history for 1.1.x 
branch](https://github.com/apache/storm/commits/1.1.x-branch). Did you 
cherry-pick the changes already? I don't need to create a backport PR now, 
right?


---


[GitHub] storm issue #2535: STORM-2877: Backport for 1.0.x-branch

2018-02-01 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2535
  
Thanks for confirming that @HeartSaVioR. We now have the changes for 
`STORM-2877: Introduce an option to configure pagination in Storm UI` in all 
the development branches that are 1.x and above.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-04-18 Thread srishtyagrawal
GitHub user srishtyagrawal opened a pull request:

https://github.com/apache/storm/pull/2637

Map of Spout configurations from `storm-kafka` to `storm-kafka-client`

As per @srdo and @ptgoetz's replies on the Storm Dev mailing list, I am 
adding the spout configuration map to the `storm-kafka-client` document. 
[The gist](https://gist.github.com/srishtyagrawal/850b0c3f661cf3c620c27f314791224b) 
with the initial changes had comments from @srdo and questions from me, which 
I am pasting here for convenience:

Last comment by @srdo:
Thanks, I think this is nearly there. The maxOffsetBehind section says that 
"If a failing tuple's offset is less than maxOffsetBehind, the spout stops 
retrying the tuple.". Shouldn't it be more than? i.e. if the latest offset is 
100, and you set maxOffsetBehind to 50, and then offset 30 fails, 30 is more 
than maxOffsetBehind behind the latest offset, so it is not retried.
Regarding the links, I think we should try to use links that automatically 
point at the right release. There's some documentation about it here 
https://github.com/apache/storm-site#how-release-specific-docs-work, and 
example usage "The allowed values are listed in the FirstPollOffsetStrategy 
javadocs" (from 
https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md). It 
would be great if you fix any broken links you find, or any links that are hard 
coded to point at a specific release.


My reply:
I copied the maxOffsetBehind documentation 
[from here](https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_storm-component-guide/content/storm-kafkaspout-config-core.html). 
It is confusing because, in your earlier example, the value 30 itself is 
less than 100-50, but I like the idea of adding "behind" to make it clearer. 
As there is more than one scenario where maxOffsetBehind is used, I have 
modified the documentation to use the fail scenario as an example.
Thanks for the documentation on links, I will fix all the existing links 
and the ones which are currently broken in the storm-kafka-client documentation.
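
The retry rule being discussed can be sketched as follows. The helper name is illustrative, not part of the storm-kafka API: a failed tuple is retried only while its offset lags the latest offset by no more than maxOffsetBehind, which matches @srdo's example (latest offset 100, maxOffsetBehind 50, failed offset 30 is not retried).

```javascript
// Illustrative sketch of the maxOffsetBehind retry rule discussed above;
// shouldRetry is a hypothetical helper, not storm-kafka code.
function shouldRetry(failedOffset, latestOffset, maxOffsetBehind) {
  // Retry only while the failed offset is no more than maxOffsetBehind
  // behind the latest offset.
  return (latestOffset - failedOffset) <= maxOffsetBehind;
}

console.log(shouldRetry(30, 100, 50)); // false: offset 30 is 70 behind
console.log(shouldRetry(80, 100, 50)); // true: offset 80 is only 20 behind
```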

Question:
Seems like all the release-related links in 
[storm-kafka-client.md](https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md) 
don't work. I looked at other docs as well, for example 
[Hooks.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Hooks.md), 
[Concepts.md](https://github.com/apache/storm/blob/09e01231cc427004bab475c9c70f21fa79cfedef/docs/Concepts.md), 
[Configuration.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Configuration.md), 
and [Common-patterns.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Common-patterns.md) 
(the first 4 documents I checked for relative links), where these links gave 
a 404. I have yet to figure out why these links don't work.


 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srishtyagrawal/storm migrateSpoutConfigs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2637


commit 2ca4fc851c17e1cb8a4208fe5cb0c3916551080b
Author: Srishty Agrawal 
Date:   2018-04-19T00:13:57Z

Map of Spout configurations from storm-kafka to storm-kafka-client




---


[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

2018-04-19 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@srdo thanks for explaining that. I was looking at the GitHub file links, 
which is why they were giving 404s.

I have also modified the map to mark the `socketTimeoutMs` setting as not 
supported in `storm-kafka-client`. The previous translation was from [Kafka's 
ConsumerConfig.scala](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/ConsumerConfig.scala#L30), 
but it looks like the scala file is deprecated; this configuration is also 
only present in the [old Kafka consumer configs 
table](https://kafka.apache.org/documentation/#oldconsumerconfigs) and not in 
the [new Kafka consumer configs 
table](https://kafka.apache.org/documentation/#newconsumerconfigs).

Per your suggestion, the Storm links have been modified to be generic and 
hence not tied to a particular release version, but I also have links to 
Kafka properties which are still release specific (last release: 1.1). This 
is because the kafka-site repo doesn't do release-agnostic links.

Let me know if the table looks fine; I will then publish it to the Storm docs.



---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-04-20 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r183161344
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,25 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
`forceFromStart`  **Default:** `false`  `startOffsetTime` & 
`forceFromStart` together determine where consumer offsets are being read from, 
ZooKeeper, beginning, or end of Kafka stream | **Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to this 
gist](https://gist.github.com/srishtyagrawal/e8f512b2b7f61f239b1bb5dc15d2437b) 
for choosing the right `FirstPollOffsetStrategy` based on your 
`startOffsetTime` & `forceFromStart` settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** 
[`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffse
 
tStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
--- End diff --

@srdo thanks for the comments, I have addressed both of them.


---
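The row quoted above points at a gist for choosing `FirstPollOffsetStrategy` from the old `startOffsetTime`/`forceFromStart` pair. The rule as documented -- `forceFromStart` decides whether the committed (Zookeeper) offset is ignored, and `startOffsetTime` picks the earliest-vs-latest starting point when it applies -- can be sketched as a small decision helper. The class and method names below are made up for illustration; only the enum constants mirror the real `KafkaSpoutConfig.FirstPollOffsetStrategy`.

```java
// Sketch of the startOffsetTime/forceFromStart -> FirstPollOffsetStrategy rule.
// The enum constants mirror org.apache.storm.kafka.spout.KafkaSpoutConfig
// .FirstPollOffsetStrategy; the chooser itself is illustrative, not library code.
public class OffsetStrategyChooser {

    enum FirstPollOffsetStrategy { EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST }

    // forceFromStart == true: the committed (Zookeeper) offset is ignored, so the
    // strategy is unconditional. forceFromStart == false: the committed offset wins,
    // and startOffsetTime only applies when no committed offset exists.
    static FirstPollOffsetStrategy choose(boolean forceFromStart, boolean startFromEarliest) {
        if (forceFromStart) {
            return startFromEarliest ? FirstPollOffsetStrategy.EARLIEST
                                     : FirstPollOffsetStrategy.LATEST;
        }
        return startFromEarliest ? FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST
                                 : FirstPollOffsetStrategy.UNCOMMITTED_LATEST;
    }

    public static void main(String[] args) {
        // storm-kafka defaults (forceFromStart=false, startOffsetTime=EarliestTime)
        // map to the storm-kafka-client default named in the table.
        System.out.println(choose(false, true));
    }
}
```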


[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

2018-04-23 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@srdo squashed the commits.


---


[GitHub] storm issue #2637: Map of Spout configurations from `storm-kafka` to `storm-...

2018-04-30 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@srdo can you merge this if it looks ok?


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185606593
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a 
`ByteBuffer` from a Kafka topic is transformed into Storm tuple 
**Default:** `RawMultiScheme` | 
[`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)|
 **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` 
  **Usage:** 
[`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)
 
[`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number 
of bytes to attempt to fetch in one request to a Kafka server  **Default:** 
`1MB` | **Kafka config:** 
[`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)
 **Default:** `1MB`| **Import package:** `import 
org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `bufferSizeBytes` Buffer size (in bytes) for 
network requests. The buffer size which consumer has for pulling data from 
producer  **Default:** `1MB`| **Kafka config:** 
[`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG)
  The size of the TCP receive buffer (SO_RCVBUF) to use when reading 
data. If the value is -1, the OS default will be used | **Import package:** 
`import org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `socketTimeoutMs` **Default:** `1` | 
Discontinued in `storm-kafka-client` ||
--- End diff --

Addressed.


---
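The table quoted in the diff above pairs each `storm-kafka` `SpoutConfig` field with its `storm-kafka-client`/Kafka `ConsumerConfig` counterpart. As a compact restatement of those rows, here is a lookup sketch; the class is hypothetical, the names are taken from the table, and `socketTimeoutMs` is reported as discontinued per the table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup helper condensing the storm-kafka -> storm-kafka-client
// mapping from the quoted migration table. Keys are Storm 0.9.6 SpoutConfig
// field names; values are the replacement setting named in the table.
public class SpoutConfigTranslation {

    private static final Map<String, String> OLD_TO_NEW = new HashMap<>();
    static {
        // startOffsetTime & forceFromStart are replaced together by one setting.
        OLD_TO_NEW.put("startOffsetTime", "FirstPollOffsetStrategy");
        OLD_TO_NEW.put("forceFromStart", "FirstPollOffsetStrategy");
        // scheme is replaced by Kafka key/value deserializers.
        OLD_TO_NEW.put("scheme", "key.deserializer / value.deserializer");
        OLD_TO_NEW.put("fetchSizeBytes", "max.partition.fetch.bytes");
        OLD_TO_NEW.put("bufferSizeBytes", "receive.buffer.bytes");
        // socketTimeoutMs has no storm-kafka-client equivalent.
    }

    static String translate(String spoutConfigName) {
        return OLD_TO_NEW.getOrDefault(spoutConfigName, "discontinued");
    }

    public static void main(String[] args) {
        System.out.println("fetchSizeBytes -> " + translate("fetchSizeBytes"));
        System.out.println("socketTimeoutMs -> " + translate("socketTimeoutMs"));
    }
}
```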


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185606682
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a 
`ByteBuffer` from a Kafka topic is transformed into Storm tuple 
**Default:** `RawMultiScheme` | 
[`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)|
 **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` 
  **Usage:** 
[`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)
 
[`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number 
of bytes to attempt to fetch in one request to a Kafka server  **Default:** 
`1MB` | **Kafka config:** 
[`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)
 **Default:** `1MB`| **Import package:** `import 
org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `bufferSizeBytes` Buffer size (in bytes) for 
network requests. The buffer size which consumer has for pulling data from 
producer  **Default:** `1MB`| **Kafka config:** 
[`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG)
  The size of the TCP receive buffer (SO_RCVBUF) to use when reading 
data. If the value is -1, the OS default will be used | **Import package:** 
`import org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `socketTimeoutMs` **Default:** `1` | 
Discontinued in `storm-kafka-client` ||
+| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange` **Default:** 
`true` | **Kafka config:** 
[`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/

---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185608174
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
--- End diff --

Addressed. Removed the Storm version from the heading. @hmcl I had initially 
thought of structuring this the way you suggested above, but was unable 
to nest columns in markdown, so I dropped the idea.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185609727
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a 
`ByteBuffer` from a Kafka topic is transformed into Storm tuple 
**Default:** `RawMultiScheme` | 
[`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)|
 **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` 
  **Usage:** 
[`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)
 
[`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number 
of bytes to attempt to fetch in one request to a Kafka server  **Default:** 
`1MB` | **Kafka config:** 
[`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)
 **Default:** `1MB`| **Import package:** `import 
org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
--- End diff --

I have substituted `Kafka Config` with `Setting` because I did not 
succeed in restructuring the table as suggested in the earlier comment. I have also 
changed the heading from `KafkaSpoutConfig Name` to 
`KafkaSpoutConfig/ConsumerConfig Name` because the column is a mix of both kinds 
of configs.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185623513
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a 
`ByteBuffer` from a Kafka topic is transformed into Storm tuple 
**Default:** `RawMultiScheme` | 
[`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)|
 **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` 
  **Usage:** 
[`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)
 
[`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number 
of bytes to attempt to fetch in one request to a Kafka server  **Default:** 
`1MB` | **Kafka config:** 
[`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)
 **Default:** `1MB`| **Import package:** `import 
org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
--- End diff --

In the [FetchRequest Kafka 
code](https://github.com/apache/kafka/blob/0.10.1.0/core/src/main/scala/kafka/api/FetchRequest.scala#L53-L61)
 `fetchSize` and `maxBytes` are different variables. Does `val fetchSize = 
buffer.getInt` mean that `fetchSize` will fetch as many bytes as can fit in the 
buffer, and hence that putting `max` in `fetchSizeBytes`'s description is 
appropriate? Just trying to understand this parameter better before we make 
the change. 


---
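The `fetchSizeBytes` and `bufferSizeBytes` rows under discussion both funnel through `KafkaSpoutConfig.Builder.setProp(key, value)` with a plain Kafka consumer key. A dependency-free sketch of that step, with `java.util.Properties` standing in for the builder so it runs without storm-kafka-client on the classpath; the class and method names are made up, while the config keys and 1MB defaults come from the table.

```java
import java.util.Properties;

// Illustrative stand-in for KafkaSpoutConfig.Builder.setProp(key, value):
// the same Kafka consumer keys named in the migration table, set on a plain
// Properties object so the sketch stays dependency-free.
public class ConsumerPropsSketch {

    static Properties migratedConsumerProps() {
        Properties props = new Properties();
        // storm-kafka fetchSizeBytes (default 1MB) -> max.partition.fetch.bytes
        props.put("max.partition.fetch.bytes", 1048576);
        // storm-kafka bufferSizeBytes (default 1MB) -> receive.buffer.bytes
        props.put("receive.buffer.bytes", 1048576);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(migratedConsumerProps());
    }
}
```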


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185625951
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a 
`ByteBuffer` from a Kafka topic is transformed into Storm tuple 
**Default:** `RawMultiScheme` | 
[`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)|
 **Import package:** `import org.apache.kafka.clients.consumer.ConsumerConfig;` 
  **Usage:** 
[`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)
 
[`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number 
of bytes to attempt to fetch in one request to a Kafka server  **Default:** 
`1MB` | **Kafka config:** 
[`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)
 **Default:** `1MB`| **Import package:** `import 
org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG,
 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `bufferSizeBytes` Buffer size (in bytes) for 
network requests. The buffer size which consumer has for pulling data from 
producer  **Default:** `1MB`| **Kafka config:** 
[`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG)
  The size of the TCP receive buffer (SO_RCVBUF) to use when reading 
data. If the value is -1, the OS default will be used | **Import package:** 
`import org.apache.kafka.clients.consumer.ConsumerConfig;`   **Usage:** 
[`.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 
)`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `socketTimeoutMs` **Default:** `1` | 
Discontinued in `storm-kafka-client` ||
+| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange` **Default:** 
`true` | **Kafka config:** 
[`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/

---

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185631774
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
--- End diff --

Good point @hmcl. Calling the heading "Translation from `storm-kafka` to 
`storm-kafka-client` spout properties".
@srdo I will submit a PR to link this table in the `storm-kafka-migration` 
README.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185643262
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
+
+This may not be an exhaustive list because the `storm-kafka` configs were 
taken from Storm 0.9.6

+[SpoutConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/SpoutConfig.java)
 and

+[KafkaConfig](https://github.com/apache/storm/blob/v0.9.6/external/storm-kafka/src/jvm/storm/kafka/KafkaConfig.java).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6

+[KafkaSpoutConfig](https://github.com/apache/storm/blob/v1.0.6/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java).
+
+| Storm-0.9.6 SpoutConfig   | Storm-1.0.6 KafkaSpoutConfig name | 
KafkaSpoutConfig usage help |
+| - |  | 
--- |
+| **Setting:** `startOffsetTime` **Default:** 
`EarliestTime`  
**Setting:** `forceFromStart`  **Default:** `false`  
`startOffsetTime` & `forceFromStart` together determine the starting offset. 
`forceFromStart` determines whether the Zookeeper offset is ignored. 
`startOffsetTime` sets the timestamp that determines the beginning offset, in 
case there is no offset in Zookeeper, or the Zookeeper offset is ignored | 
**Setting:** 
[`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html)
 **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper 
table](#helper-table-for-setting-firstpolloffsetstrategy) for picking 
`FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` 
settings | **Import package:** 
`org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy.`
  **Usage:** [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
--- End diff --

Addressed.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-02 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r185644653
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,37 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Migrating a `storm-kafka` spout to use `storm-kafka-client`
--- End diff --

@srdo I suggested the other way because the name `storm-kafka-migration` 
suggests that the module will help with migrating storm-kafka, but either way 
is fine. I can add the link to `storm-kafka-migration` here.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-07 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r186570927
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,39 @@ KafkaSpoutConfig kafkaConf = 
KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, 
where tuple tracking is required and therefore always enabled.
+
+# Translation from `storm-kafka` to `storm-kafka-client` spout properties
--- End diff --

Addressed.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-07 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r186570995
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,39 @@ KafkaSpoutConfig kafkaConf = KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
+
+# Translation from `storm-kafka` to `storm-kafka-client` spout properties
+
+This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
+[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
+[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6
+[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html)
+and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
+
+| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
--- End diff --

Addressed.


---


[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-07 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r186572103
  
--- Diff: docs/storm-kafka-client.md ---
@@ -313,4 +313,39 @@ KafkaSpoutConfig kafkaConf = KafkaSpoutConfig
   .setTupleTrackingEnforced(true)
 ```
 
-Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
\ No newline at end of file
+Note: This setting has no effect with AT_LEAST_ONCE processing guarantee, where tuple tracking is required and therefore always enabled.
+
+# Translation from `storm-kafka` to `storm-kafka-client` spout properties
+
+This may not be an exhaustive list because the `storm-kafka` configs were taken from Storm 0.9.6
+[SpoutConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/SpoutConfig.html) and
+[KafkaConfig](https://svn.apache.org/repos/asf/storm/site/releases/0.9.6/javadocs/storm/kafka/KafkaConfig.html).
+`storm-kafka-client` spout configurations were taken from Storm 1.0.6
+[KafkaSpoutConfig](https://storm.apache.org/releases/1.0.6/javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.html)
+and Kafka 0.10.1.0 [ConsumerConfig](https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/ConsumerConfig.html).
+
+| SpoutConfig   | KafkaSpoutConfig/ConsumerConfig Name | KafkaSpoutConfig Usage |
+| - | - | --- |
+| **Setting:** `startOffsetTime` **Default:** `EarliestTime`  **Setting:** `forceFromStart`  **Default:** `false`  `startOffsetTime` & `forceFromStart` together determine the starting offset. `forceFromStart` determines whether the Zookeeper offset is ignored. `startOffsetTime` sets the timestamp that determines the beginning offset, in case there is no offset in Zookeeper, or the Zookeeper offset is ignored | **Setting:** [`FirstPollOffsetStrategy`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.FirstPollOffsetStrategy.html) **Default:** `UNCOMMITTED_EARLIEST`  [Refer to the helper table](#helper-table-for-setting-firstpolloffsetstrategy) for picking `FirstPollOffsetStrategy` based on your `startOffsetTime` & `forceFromStart` settings | [`.setFirstPollOffsetStrategy()`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setFirstPollOffsetStrategy-org.apache.storm.kafka.spout.KafkaSpoutConfig.FirstPollOffsetStrategy-)|
+| **Setting:** `scheme` The interface that specifies how a `ByteBuffer` from a Kafka topic is transformed into Storm tuple **Default:** `RawMultiScheme` | **Setting:** [`Deserializers`](https://kafka.apache.org/11/javadoc/org/apache/kafka/common/serialization/Deserializer.html)| [`.setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, )`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-) [`.setProp(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, )`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `fetchSizeBytes` Message fetch size -- the number of bytes to attempt to fetch in one request to a Kafka server  **Default:** `1MB` | **Setting:** [`max.partition.fetch.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#MAX_PARTITION_FETCH_BYTES_CONFIG)  **Default:** `1MB`| [`.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, )`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `bufferSizeBytes` Buffer size (in bytes) for network requests. The buffer size which consumer has for pulling data from producer  **Default:** `1MB`| **Setting:** [`receive.buffer.bytes`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#RECEIVE_BUFFER_CONFIG)  The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used | [`.setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, )`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.Builder.html#setProp-java.lang.String-java.lang.Object-)|
+| **Setting:** `socketTimeoutMs` **Default:** `10000` | **N/A** ||
+| **Setting:** `useStartOffsetTimeIfOffsetOutOfRange` **Default:** `true` | **Setting:** [`auto.offset.reset`](https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/ConsumerConfig.html#AUTO_OFFSET_RESET_CONFIG)  **Possible values:** `"latest"`, `"earliest"`, `"none"` **Default:** `latest`. Exception: `earliest` if [`ProcessingGuarantee`](javadocs/org/apache/storm/kafka/spout/KafkaSpoutConfig.ProcessingGuarantee.html) i
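As a companion to the quoted table, the following is a minimal sketch of the consumer-level values that would flow through `KafkaSpoutConfig.Builder.setProp(...)` to the underlying consumer. To keep the snippet runnable without the storm-kafka-client or kafka-clients jars on the classpath, it uses the plain `ConsumerConfig` string keys (the constants such as `ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG` resolve to exactly these strings); the deserializer class and byte sizes are illustrative assumptions, not values taken from the PR.

```java
import java.util.Properties;

public class SpoutConfigTranslation {
    // Consumer-level properties that storm-kafka-client's
    // KafkaSpoutConfig.Builder.setProp(...) would pass through to the
    // underlying KafkaConsumer, mirroring the storm-kafka settings in
    // the translation table above.
    static Properties translatedConsumerProps() {
        Properties props = new Properties();
        // storm-kafka `scheme` -> Kafka key/value deserializers (illustrative choice)
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // storm-kafka `fetchSizeBytes` (default 1MB) -> max.partition.fetch.bytes
        props.put("max.partition.fetch.bytes", "1048576");
        // storm-kafka `bufferSizeBytes` -> receive.buffer.bytes (TCP SO_RCVBUF)
        props.put("receive.buffer.bytes", "1048576");
        // storm-kafka `useStartOffsetTimeIfOffsetOutOfRange` -> auto.offset.reset
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties p = translatedConsumerProps();
        System.out.println(p.getProperty("max.partition.fetch.bytes")); // prints 1048576
    }
}
```

With the real dependencies on the classpath, the same values would instead be applied as the table's usage column shows, e.g. `builder.setProp(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1048576)`.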

[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...

2018-05-07 Thread srishtyagrawal
Github user srishtyagrawal commented on a diff in the pull request:

https://github.com/apache/storm/pull/2637#discussion_r186582501
  
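The `startOffsetTime`/`forceFromStart` to `FirstPollOffsetStrategy` translation described in the table quoted earlier in the thread can be sketched as decision logic. This is a hedged reconstruction based on the helper-table mapping the doc links to, not code from the PR; the local enum mirrors storm-kafka-client's `KafkaSpoutConfig.FirstPollOffsetStrategy` names so the snippet runs without the library.

```java
public class OffsetStrategyTranslation {
    // Mirrors storm-kafka-client's KafkaSpoutConfig.FirstPollOffsetStrategy values
    enum FirstPollOffsetStrategy { EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST }

    // In 0.9.x storm-kafka, kafka.api.OffsetRequest.EarliestTime() is -2
    // and LatestTime() is -1; these sentinels are the usual startOffsetTime values.
    static final long EARLIEST_TIME = -2L;
    static final long LATEST_TIME = -1L;

    static FirstPollOffsetStrategy translate(long startOffsetTime, boolean forceFromStart) {
        boolean fromBeginning = (startOffsetTime == EARLIEST_TIME);
        if (forceFromStart) {
            // Zookeeper offset ignored: always start from the configured position
            return fromBeginning ? FirstPollOffsetStrategy.EARLIEST
                                 : FirstPollOffsetStrategy.LATEST;
        }
        // Otherwise the committed offset wins; startOffsetTime is only the fallback
        return fromBeginning ? FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST
                             : FirstPollOffsetStrategy.UNCOMMITTED_LATEST;
    }

    public static void main(String[] args) {
        // The storm-kafka defaults (EarliestTime, forceFromStart=false) pair with
        // the storm-kafka-client default UNCOMMITTED_EARLIEST
        System.out.println(translate(EARLIEST_TIME, false)); // prints UNCOMMITTED_EARLIEST
        System.out.println(translate(LATEST_TIME, true));    // prints LATEST
    }
}
```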

[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

2018-05-07 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@hmcl thanks for the comments. I have edited the Kafka property links to 
point to the [Kafka New Consumer Configs 
table](http://kafka.apache.org/10/documentation.html#newconsumerconfigs), and 
have removed the description and defaults for those configs (included in the 
new link).  

I have created a JIRA ticket, 
[STORM-3060](https://issues.apache.org/jira/browse/STORM-3060), and added it to 
the PR and the commit title. 

The commits have been squashed and the PR is ready to be merged.


---


[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

2018-05-11 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@hmcl @srdo how often are the docs for `2.0.0-SNAPSHOT` published? Is it every time a documentation update is made to this repo? How do I request that this PR be published in the 2.0.0 Storm docs?

Also currently the [2.0.0-SNAPSHOT storm-kafka-client documentation 
link](http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-kafka-client.html) 
is broken.


---


[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

2018-05-14 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@srdo thanks a lot for publishing the docs. I see the changes, but unfortunately the table is badly formatted. I will look into how to make this better when I next have time.


---


[GitHub] storm issue #2637: STORM-3060: Map of Spout configurations from storm-kafka ...

2018-05-14 Thread srishtyagrawal
Github user srishtyagrawal commented on the issue:

https://github.com/apache/storm/pull/2637
  
@srdo thanks for taking care of this!


---