[ https://issues.apache.org/jira/browse/MADLIB-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Orhan Kislal closed MADLIB-1392.
--------------------------------
    Resolution: Fixed

> DL: Preprocessor support for asymmetric segment distribution
> ------------------------------------------------------------
>
>                 Key: MADLIB-1392
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1392
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Deep Learning
>            Reporter: Ekta Khanna
>            Priority: Major
>             Fix For: v1.17
>
>
> Add asymmetric segment redistribution support to the deep learning
> preprocessor. Applies to {{training_preprocessor_dl()}} and
> {{validation_preprocessor_dl()}}.
> {code:java}
> training_preprocessor_dl(source_table,
>                          output_table,
>                          dependent_varname,
>                          independent_varname,
>                          buffer_size,
>                          normalizing_const,
>                          num_classes,
>                          distribution_rules -- new optional param
>                          )
> {code}
> The new optional param ({{distribution_rules}}) is of type TEXT, *default*:
> {{all_segments}}. It specifies how to distribute the {{output_table}}, which
> determines how the fit function will use resources on the cluster. The
> possible values are:
> # {{all_segments}} (the default): the {{output_table}} will be distributed
> to all segments in the database cluster.
> # {{gpu_segments}}: the {{output_table}} will be distributed to all segments
> that are on hosts that have GPUs attached. This will make maximum use of GPU
> resources.
> # The name of a resources table containing the segments to use for training.
> This table is typically created and maintained by the database
> administrator. It must contain a column called {{dbid}} that specifies the
> segment id from the {{gp_segment_configuration}} table.
> Sample {{segments_to_use}} table:
> {code:java}
> dbid | notes
> -----|--------------
> 2    | comment here
> 3    | comment here
> 4    | comment here
> 5    | comment here
> {code}
> The same applies to the validation preprocessor.
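>
> A minimal sketch of how the new param might be used with a resources table.
> The {{madlib}} schema prefix, the {{image_data}} source table with its
> {{species}} and {{rgb}} columns, the {{image_data_packed}} output name, and
> the normalizing constant of 255 are all assumptions made for illustration.
> Passing NULL for an optional param is assumed to select its default.
> {code:sql}
> -- Resources table mirroring the sample above; dbid values must exist
> -- in gp_segment_configuration on the target cluster.
> CREATE TABLE segments_to_use (dbid INTEGER, notes TEXT);
> INSERT INTO segments_to_use VALUES
>     (2, 'comment here'),
>     (3, 'comment here'),
>     (4, 'comment here'),
>     (5, 'comment here');
>
> -- Pack the training data and distribute it only to the listed segments.
> SELECT madlib.training_preprocessor_dl(
>     'image_data',         -- source_table (hypothetical)
>     'image_data_packed',  -- output_table (hypothetical)
>     'species',            -- dependent_varname (hypothetical)
>     'rgb',                -- independent_varname (hypothetical)
>     NULL,                 -- buffer_size: assumed NULL means default
>     255,                  -- normalizing_const (illustrative value)
>     NULL,                 -- num_classes: assumed NULL means default
>     'segments_to_use'     -- distribution_rules: name of resources table
> );
> {code}
> Passing {{'gpu_segments'}} or {{'all_segments'}} as the last argument
> instead would select the other two behaviours described above.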
> This change adds a new column, {{gpu_config}}, to the output summary table.
> It contains one of the following values:
> # if {{distribution_rules}} = {{all_segments}}, then {{all_segments}}
> # if {{distribution_rules}} = {{gpu_segments}}, then an array of the segment
> ids of all segments that are on hosts that have GPUs attached
> # if {{distribution_rules}} = {{segments_to_use_table}}, then an array of
> segment ids; for the above sample {{segments_to_use}} table -> [2,3,4,5]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)