reductionista commented on a change in pull request #467: DL: Improve
performance of mini-batch preprocessor
URL: https://github.com/apache/madlib/pull/467#discussion_r363002661
##########
File path:
src/ports/postgres/modules/deep_learning/input_data_preprocessor.py_in
##########
@@ -59,9 +77,9 @@ class InputDataPreprocessorDL(object):
self.dependent_varname = dependent_varname
self.independent_varname = independent_varname
self.buffer_size = buffer_size
- self.normalizing_const = normalizing_const if normalizing_const is not None else DEFAULT_NORMALIZING_CONST
+ self.normalizing_const = normalizing_const
self.num_classes = num_classes
- self.distribution_rules = distribution_rules if distribution_rules else DEFAULT_GPU_CONFIG
Review comment:
We had a series of if-else blocks that check for different values of
`self.distribution_rules`. It looked something like this:
```
if self.distribution_rules == 'all_segments':
    ...
elif self.distribution_rules == DEFAULT_GPU_CONFIG:
    ...
else:
    ...
```
where `'all_segments'` was hard-coded in a lot of places, but `'gpu_segments'`
was for some reason hidden behind this odd constant. That made it very
difficult to find where the `gpu_segments` block was. Now you can just search
for `all_segments` or `gpu_segments` to see where they're used, instead of
trying to remember the name of the constant that evaluates to `gpu_segments`.
Also, it seems unlikely we would ever change this value, so I don't see any
benefit in defining it as a constant.
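
A minimal sketch of what branching on the literals directly might look like.
The function name and the returned labels are hypothetical, chosen only for
illustration; this is not the actual MADlib code, and the real branches do
more than return a label:

```python
def describe_distribution_rules(distribution_rules):
    """Hypothetical helper: report which branch a given value would take."""
    # Both special values appear as plain string literals, so a text search
    # for 'all_segments' or 'gpu_segments' finds every branch that uses them.
    if distribution_rules == 'all_segments':
        return 'all'
    elif distribution_rules == 'gpu_segments':
        return 'gpu'
    else:
        # Assumption: anything else is treated as a user-supplied value.
        return 'custom'
```

With both values written the same way, neither branch depends on remembering
which constant name stands in for which string.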