khannaekta opened a new pull request #392: DL: Improve performance for predict
URL: https://github.com/apache/madlib/pull/392

JIRA: MADLIB-1343

Performance improvements
1. Use SD to cache the model and set the weights only once, on the first row for each segment. This also means that we have to clear SD on the last row for each segment.
2. Replace `PythonFunctionBodyOnly` with `PythonFunctionBodyOnlyNoSchema` in the internal keras predict SQL. Using `PythonFunctionBodyOnly` made the query much slower because it added the overhead of executing the schema query for every row in the test table. The internal UDF does not need to know the schema name, so it now uses `PythonFunctionBodyOnlyNoSchema` instead.

Additionally:
1. Replace the use of `predict_classes` and `predict_proba` with `predict`, since non-sequential models do not support `predict_classes`.
2. Modify the internal keras predict query so that it no longer joins the test table and the model table. The join caused inconsistencies with the segment id, due to which SD was not getting set or cleared properly.
3. Add a try/except block in the internal predict UDF so that SD is cleared out in case of an error.
4. Reorder the arguments of the fit and evaluate UDAs.
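The caching pattern from the first performance item, together with the error handling from the third additional item, might look roughly like the sketch below. It is not the code in this PR: `SD` is simulated with a plain dict so the example runs outside the database (in PL/Python the runtime provides it), and `FakeModel`, `build_model`, `predict_row`, the `'segment_model'` key and the row-count arguments are all hypothetical names chosen for illustration.

```python
# Sketch only: SD stands in for the PL/Python per-function session dictionary.
# FakeModel, build_model, predict_row and the row-count arguments are
# hypothetical names, not the identifiers used in the PR.

SD = {}           # in PL/Python this dict is provided by the runtime
BUILD_CALLS = []  # records how often the expensive setup runs


class FakeModel(object):
    """Stand-in for a deserialized model whose weights are already set."""

    def __init__(self, weights):
        self.weights = weights

    def predict(self, features):
        # Toy linear score; a real model would run a forward pass.
        return sum(w * x for w, x in zip(self.weights, features))


def build_model(weights):
    """Expensive step: stands in for model creation plus setting weights."""
    BUILD_CALLS.append(1)
    return FakeModel(weights)


def predict_row(features, weights, current_row, rows_in_segment):
    """Predict one row, reusing the model cached in SD for this segment."""
    try:
        if 'segment_model' not in SD:
            # First row on this segment: build once and cache.
            SD['segment_model'] = build_model(weights)
        result = SD['segment_model'].predict(features)
    except Exception:
        # Do not leave a stale model behind if prediction fails.
        SD.pop('segment_model', None)
        raise
    if current_row == rows_in_segment:
        # Last row on this segment: release the cached model.
        SD.pop('segment_model', None)
    return result
```

Calling `predict_row` for every row of a segment runs `build_model` once, and `SD` is empty again after the last row or after any exception.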
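For the switch from `predict_classes` to `predict`, a class label can be recovered from the probabilities that `predict` returns. The following is a minimal sketch, assuming a softmax-style output with one probability per class in each row; `classes_from_probabilities` and `class_values` are hypothetical names, and the function uses plain Python lists so it runs without Keras or NumPy installed.

```python
def classes_from_probabilities(probabilities, class_values=None):
    """Map each row of class probabilities to its most likely class.

    probabilities: list of rows, one probability per class (assumed shape).
    class_values:  optional list of labels; indices are returned if omitted.
    """
    predicted = []
    for row in probabilities:
        # Index of the largest probability in this row (first one on ties).
        best_index = max(range(len(row)), key=lambda i: row[i])
        if class_values is None:
            predicted.append(best_index)
        else:
            predicted.append(class_values[best_index])
    return predicted


# Example input shaped like the output of predict() on two rows, three classes.
probs = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
print(classes_from_probabilities(probs))
print(classes_from_probabilities(probs, ['cat', 'dog', 'bird']))
```

Because this only depends on the probability array, it works the same way for sequential and non-sequential models.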
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
