dadanielniel opened a new pull request #523: URL: https://github.com/apache/madlib/pull/523
### Module name: Linear-Regression

### JIRA: MADlib-1460

### Description:

Linear regression training produces 2 output tables (**neither is optional**):

- The **primary** output table, which contains the computed coefficients.
- A **summary** output table, which contains a single row.

#### Scenario

Running linear regression training in PostgreSQL on an input table that has **more than 2^31 records** fails with an "**integer out of range**" exception. This happens even if a grouping column is specified.

#### Source

**The summary table** has a column that stores **the total number of records** involved in the computation. The column's data type is a **signed integer**. However, the total number of records is computed as a **BIGINT**. Therefore, when the total number of records in the input table exceeds the maximum value of a signed 32-bit integer (2^31 - 1), an "integer out of range" exception is thrown.

### Solution

A simple solution is to change the data type of the column from a **signed integer** to **BIGINT**.

### Test

We executed the linear regression training function with and without the suggested modification on an input table containing between 2^31 and 2^32 records. Without the modification, an "integer out of range" exception was thrown. With the modification, training completed without the exception.

----------------------------------------------------------------

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
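The overflow described under "Source" can be illustrated with a minimal Python sketch. It assumes PostgreSQL's documented ranges for the 4-byte signed INTEGER and 8-byte signed BIGINT types. The helper name `fits_in` and the sample row counts are hypothetical and are not taken from the MADlib code base.

```python
# Illustrative only: this is not MADlib code.
# Assumed ranges: PostgreSQL INTEGER is 4-byte signed, BIGINT is 8-byte signed.
INTEGER_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1


def fits_in(max_value: int, row_count: int) -> bool:
    """Return True if a non-negative row count fits in a column with this maximum."""
    return 0 <= row_count <= max_value


# Hypothetical row counts around the boundary described in the PR.
for rows in (2**31 - 1, 2**31, 2**32):
    print(rows, fits_in(INTEGER_MAX, rows), fits_in(BIGINT_MAX, rows))
```

Under these assumptions, a count of 2^31 or more no longer fits the INTEGER column but still fits a BIGINT column, which is consistent with the behavior reported in the "Test" section.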
