[ https://issues.apache.org/jira/browse/HADOOP-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949576#comment-15949576 ]
Andrew Wang commented on HADOOP-13200:
--------------------------------------

Hi Kai,

It would be great to file JIRAs to handle the separate small issues noticed.

I also spent some more time thinking about this.

High-level problem: determine which raw encoder and decoder to use for a coder.

Our current system:

{noformat}
coder --------> rawcoder factory method ---------> factory ---------> raw encoder / decoder
      rawcoder                          reflection          hardcode
      factory class name
{noformat}

If we replace the reflection step with a registry, we can save the per-rawcoder factory classes:

{noformat}
coder ----------> rawcoder factory registry --------> factory ----------> raw encoder + decoder
      rawcoder                               lookup            hardcode
      factory name
{noformat}

* Raw coder factories would be identified by an additional getName() interface.
* The registry is a singleton that maps coders to a map of rawcoder factories, keyed by getName().
* The registry is prepopulated with the built-in factories; these can be private nested classes of the registry, or held in a new class.
* The list of pluggable raw coder factory classes is specified in a config key. We classload these at startup and trigger their static initializers, which register them with the registry. We could enforce namespacing of pluggable raw coder names to future-proof.

Since nothing in the registry is config-dependent, I think it's a safe singleton. Config-specific logic is handled outside the registry or in static methods.

I think this might also help with implementing caching later, since it's centralized and avoids reflection after initialization.

Thoughts?
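To make the registry proposal above concrete, here is a rough sketch of how it might look. All type and method names other than getName() (RawCoderFactoryRegistry, getCodecName(), loadPluggable(), the "rs" / "rs_java" / "example_plugin" names) are hypothetical and chosen only for illustration; they are not the existing Hadoop classes. It also assumes each factory reports which coder it serves, which the proposal does not spell out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for the raw coder types; not Hadoop's real classes.
interface RawEncoder { }
interface RawDecoder { }

interface RawCoderFactory {
    String getName();        // the additional getName() from the proposal
    String getCodecName();   // assumption: the coder this factory belongs to
    RawEncoder createEncoder();
    RawDecoder createDecoder();
}

final class RawCoderFactoryRegistry {
    private static final RawCoderFactoryRegistry INSTANCE = new RawCoderFactoryRegistry();

    // coder name -> (factory name -> factory)
    private final Map<String, Map<String, RawCoderFactory>> factories =
        new ConcurrentHashMap<>();

    private RawCoderFactoryRegistry() {
        // Prepopulate with the built-in factories.
        register(new BuiltInRsFactory());
    }

    static RawCoderFactoryRegistry getInstance() {
        return INSTANCE;
    }

    void register(RawCoderFactory factory) {
        Map<String, RawCoderFactory> byName = factories.computeIfAbsent(
            factory.getCodecName(), k -> new ConcurrentHashMap<>());
        if (byName.putIfAbsent(factory.getName(), factory) != null) {
            throw new IllegalArgumentException(
                "Duplicate raw coder factory name: " + factory.getName());
        }
    }

    // Plain map lookup; returns null if nothing is registered under that name.
    RawCoderFactory lookup(String codecName, String factoryName) {
        Map<String, RawCoderFactory> byName = factories.get(codecName);
        return byName == null ? null : byName.get(factoryName);
    }

    // Config-specific step kept in a static method: the caller reads the
    // comma-separated class list from a config key and passes it in.
    // Class.forName initializes each class, which runs its static initializer.
    static void loadPluggable(String commaSeparatedClassNames) {
        for (String className : commaSeparatedClassNames.split(",")) {
            String trimmed = className.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                Class.forName(trimmed);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(
                    "Cannot load raw coder factory " + trimmed, e);
            }
        }
    }

    // Built-in factory held as a private nested class of the registry.
    private static final class BuiltInRsFactory implements RawCoderFactory {
        public String getName() { return "rs_java"; }
        public String getCodecName() { return "rs"; }
        public RawEncoder createEncoder() { return new RawEncoder() { }; }
        public RawDecoder createDecoder() { return new RawDecoder() { }; }
    }
}

// A pluggable factory registers itself from its static initializer.
final class ExamplePluggableFactory implements RawCoderFactory {
    static {
        RawCoderFactoryRegistry.getInstance().register(new ExamplePluggableFactory());
    }
    public String getName() { return "example_plugin"; }
    public String getCodecName() { return "rs"; }
    public RawEncoder createEncoder() { return new RawEncoder() { }; }
    public RawDecoder createDecoder() { return new RawDecoder() { }; }
}
```

With something like this, a coder would call lookup("rs", name) with the factory name taken from its configuration, and reflection would only happen once, inside loadPluggable() at startup.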
> Seeking a better approach allowing to customize and configure erasure coders
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-13200
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13200
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>            Priority: Blocker
>              Labels: hdfs-ec-3.0-must-do
>
> This is a follow-on task for HADOOP-13010 as discussed over there. There may
> be some better approach allowing to customize and configure erasure coders
> than the current having raw coder factory, as [~cmccabe] suggested. Will copy
> the relevant comments here to continue the discussion.