Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/4460#issuecomment-75170753
  
    Call it `AttributeType` maybe?
    
    So an `AttributeGroup` can contain both plain `Attribute`s and vector-valued 
columns, and the latter sound like `AttributeGroup`s in their own right. That's 
why it seemed like `AttributeGroup` should be an `Attribute`, or at least share 
a common superclass with it. But I didn't know what to call that superclass, 
and it seemed like overkill. That was the logic behind 
`AttributeGroup extends Attribute`. WDYT?
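
    To make the nesting argument concrete, here is a minimal sketch of the 
"group is itself an attribute" idea. It is written in Python purely for 
illustration and is not the code in this PR; the `width` method and the column 
names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    """Hypothetical base type: metadata for one named column."""
    name: str

    def width(self) -> int:
        # A plain attribute describes exactly one scalar value.
        return 1


@dataclass
class AttributeGroup(Attribute):
    """Hypothetical group that is itself an Attribute, so groups can nest."""
    members: List[Attribute] = field(default_factory=list)

    def width(self) -> int:
        # A group is as wide as all of its members combined.
        return sum(m.width() for m in self.members)


# A vector-valued column modelled as a nested group of scalar attributes.
vec = AttributeGroup("vec", [Attribute("v0"), Attribute("v1"), Attribute("v2")])
features = AttributeGroup("features", [Attribute("age"), vec])
```

    Because the group shares the `Attribute` type, a caller can treat `vec` 
like any other member of `features` without special-casing vectors.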
    
    As for the hierarchy, that's all I can think of: ordinal extends discrete, 
which extends continuous; binary extends, well, discrete and categorical, I suppose.
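
    Spelled out as classes, that hierarchy might look like the sketch below. 
The class names are hypothetical, and treating categorical as a separate root 
is my assumption, since only the binary case is pinned down above.

```python
class Continuous:
    """Hypothetical root for numeric-valued attribute types."""


class Discrete(Continuous):
    """Numeric values restricted to a countable set."""


class Ordinal(Discrete):
    """Discrete values whose order is meaningful."""


class Categorical:
    """Assumed separate root: unordered labels."""


class Binary(Discrete, Categorical):
    """Two-valued: both a discrete number and a category."""
```

    Binary is the one case that needs two parents here, which is worth 
keeping in mind if the real types end up as classes rather than traits.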
    
    Hm, I'd imagine most categorical features come in as strings. This feels 
like just the kind of thing a framework can accommodate if it has the type 
information. I don't think it's any more or less complex to say that a string 
column can be categorical. It would take some work to inject a translation to 
integers where that's needed, but it would be great if the framework could do that.
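
    The translation to integers could be as small as the sketch below. The 
function name `index_categories` is hypothetical, and assigning indices in 
order of first appearance is just one possible policy, not a claim about what 
the framework does.

```python
from typing import Dict, List, Tuple


def index_categories(values: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct string to an integer, in order of first appearance."""
    mapping: Dict[str, int] = {}
    indices: List[int] = []
    for v in values:
        # setdefault assigns the next unused integer the first time a label is seen.
        indices.append(mapping.setdefault(v, len(mapping)))
    return indices, mapping


indices, mapping = index_categories(["red", "green", "red", "blue"])
```

    Keeping the returned mapping alongside the column metadata is what would 
let the framework translate the integers back to the original labels later.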

