Shea Parkes created SPARK-13842:
-----------------------------------

             Summary: Consider __iter__ and __getitem__ methods for 
pyspark.sql.types.StructType
                 Key: SPARK-13842
                 URL: https://issues.apache.org/jira/browse/SPARK-13842
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 1.6.1
            Reporter: Shea Parkes
            Priority: Minor


It would be nice to add __iter__ and __getitem__ methods to 
{{pyspark.sql.types.StructType}}, so that a schema can be iterated over and 
its fields looked up by name.  Here are some simple suggestions:

{code}
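def __iter__(self):
    """Iterate over the StructFields in schema order."""
    return iter(self.fields)

def __getitem__(self, key):
    """Return the StructField whose name matches key."""
    # Rebuilt on every call for simplicity; if names repeat, the last
    # field with a given name wins.
    _fields_dict = dict(zip(self.names, self.fields))
    try:
        return _fields_dict[key]
    except KeyError:
        # Explicit index keeps str.format working on Python 2.6.
        raise KeyError('No field named {0}'.format(key))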
{code}

I realize the latter might be a touch more controversial, since a schema can 
contain duplicate field names and a lookup by name is then ambiguous.  Still, 
I doubt there are many such schemas in practice, and name-based access would 
be quite nice to work with.
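
To make the collision concern concrete, here is a minimal, self-contained sketch of how the two methods would behave.  It uses hypothetical stand-in classes ({{Field}} and {{SketchStructType}}, neither of which exists in PySpark) so it runs without a Spark installation; the real {{StructType}} may differ in details.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.types.StructField (name and type only).
Field = namedtuple("Field", ["name", "data_type"])


class SketchStructType(object):
    """Hypothetical stand-in for StructType carrying the two proposed methods."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.names = [f.name for f in self.fields]

    def __iter__(self):
        # Iterate over the fields in schema order.
        return iter(self.fields)

    def __getitem__(self, key):
        # dict() keeps the last value seen for a repeated key, so a
        # duplicated name resolves to the later field.
        fields_by_name = dict(zip(self.names, self.fields))
        try:
            return fields_by_name[key]
        except KeyError:
            raise KeyError('No field named {0}'.format(key))


schema = SketchStructType([
    Field("id", "long"),
    Field("label", "string"),
    Field("id", "string"),  # deliberate duplicate name
])

names_in_order = [f.name for f in schema]  # iteration keeps both "id" fields
label_field = schema["label"]              # unambiguous lookup
duplicate_lookup = schema["id"]            # silently returns the later "id"
```

With this dict-based lookup, a duplicated name does not raise an error; the earlier field is simply unreachable by name, though it is still visible when iterating.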

Privately, I have more extensive metadata extraction methods overlaid on this 
class, but I imagine the rest of what I have done might go too far for the 
common user.  If this request gains traction though, I'll share those other 
layers.


