Github user fmcquillan99 commented on the issue:

    https://github.com/apache/madlib/pull/223

Started testing, some early observations:

(1)
class_size default should be "uniform", it seems to be set to "undersample" currently

(2)
```
SELECT madlib.balance_sample(
    'flags',           -- Source table
    'output_table',    -- Output table
    'mainhue',         -- Class column
    'red=7, blue=7');  -- Want 7 reds and 7 blues
```
results in an error, not sure why it does not like this class size:
```
InternalError: (psycopg2.InternalError) plpy.Error: Sample: Invalid class size (red=7, blue=7)! (plpython.c:4648)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "balance_sample", line 23, in <module>
    return balance_sample.balance_sample(**globals())
  PL/Python function "balance_sample", line 62, in balance_sample
  PL/Python function "balance_sample", line 851, in _validate_strs
  PL/Python function "balance_sample", line 77, in _assert
PL/Python function "balance_sample"
 [SQL: "SELECT madlib.balance_sample(\n 'flags', -- Source table\n 'output_table', -- Output table\n 'mainhue', -- Class column\n 'red=7, blue=7'); -- Want 7 reds and 7 blues"]
```

(3)
```
SELECT madlib.balance_sample(
    'flags',           -- Source table
    'output_table',    -- Output table
    'mainhue',         -- Class column
    'red=.25',         -- Want 25% red flags
    20);               -- Desire output table size
```
results in an error, not sure why it does not like this class size:
```
InternalError: (psycopg2.InternalError) plpy.Error: Sample: Invalid class size (red=.25)! (plpython.c:4648)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "balance_sample", line 23, in <module>
    return balance_sample.balance_sample(**globals())
  PL/Python function "balance_sample", line 62, in balance_sample
  PL/Python function "balance_sample", line 851, in _validate_strs
  PL/Python function "balance_sample", line 77, in _assert
PL/Python function "balance_sample"
 [SQL: "SELECT madlib.balance_sample(\n 'flags', -- Source table\n 'output_table', -- Output table\n 'mainhue', -- Class column\n 'red=.25', -- Want 25%% red flags\n 20); -- Desire output table size"]
```
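
For reference, here is a rough sketch of the parsing behavior I would expect for both class size strings above. This is not the code in the PR: `parse_class_sizes` and its regex are hypothetical, and treating a value with a decimal point as a proportion and a plain integer as a row count is my assumption about the intended semantics.

```python
import re

# Hypothetical helper, NOT MADlib's actual _validate_strs implementation.
# Matches one "class=value" pair, where value is an integer or a decimal.
_PAIR = re.compile(r'^\s*([^=,\s][^=,]*?)\s*=\s*(\d+(?:\.\d*)?|\.\d+)\s*$')


def parse_class_sizes(spec):
    """Parse a string such as 'red=7, blue=7' or 'red=.25' into a dict."""
    sizes = {}
    for item in spec.split(','):
        match = _PAIR.match(item)
        if match is None:
            raise ValueError("Invalid class size ({0})!".format(spec))
        name, raw = match.group(1), match.group(2)
        if '.' in raw:
            # Assumption: a decimal value is a proportion in (0, 1).
            value = float(raw)
            if not 0.0 < value < 1.0:
                raise ValueError("Invalid class size ({0})!".format(spec))
        else:
            # Assumption: an integer value is a positive row count.
            value = int(raw)
            if value < 1:
                raise ValueError("Invalid class size ({0})!".format(spec))
        sizes[name] = value
    return sizes
```

With a parser along these lines, both 'red=7, blue=7' and 'red=.25' would be accepted, while a string with no '=' or a zero count would still be rejected.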
---