mboehm7 commented on pull request #1336:
URL: https://github.com/apache/systemds/pull/1336#issuecomment-886244883


   Unfortunately, there are too many problems with this PR (I spent two hours 
fixing the remaining issues but there is no end in sight). Please split this PR 
into one new PR only for model averaging (and later, once the model averaging 
is really working and cleaned up, one PR for nbatches because otherwise all 
fixes have to be applied in even more locations). In this process, please 
revert all changes of existing tests and formatting changes, only add targeted 
new tests for your new features, and try to limit code changes to what is 
really needed for integrating model averaging as an optional feature.  
   
   The best path forward is to keep this PR open, and cherry pick code changes 
into new PRs and fix the issues. Note that Statement.PS_MODELS is unnecessary 
and should be removed, the gradient accumulation should only be done if no 
model averaging is used, the division of model averaging did not take affect 
(because it was not applied in place and new blocks got discarded in the lambda 
function), computeBatch always pushed the gradients, the federated control 
thread (per worker) always pushed the gradients. Also try to keep the core code 
path as simple as possible: e.g., you can do something like the following:
   
   ```
   if (localUpdate | _modelAvg )
      params = updateModel(params, gradients, i, j, batchIter);
   ...                          
   // Push the gradients or model to ps
   pushGradients(_modelAvg ? params : accGradients.get());
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to