I am writing a unit test to check that a Pandas DataFrame produced by Arrow
is equal to one constructed directly from the data.  The timestamp values
are Python datetime objects with a tzinfo attached.  When I compare the
results, the values are equal but the schema is not: with Arrow the column
type is "datetime64[ns]", and without it the type is "object".  Without a
tzinfo the types match, but I need it there for the conversion with Arrow
data.  I could just replace the tzinfo for the Pandas DataFrame, since it is
a naive timezone with utcoffset=None.  Does anyone know another way to
produce compatible types?  The data also needs to be compatible with Spark.
Hopefully this makes sense.  I can attach more code if that would help,
thanks!  Here is a sample of the data:

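from datetime import datetime, tzinfo

import pandas as pd


class NaiveTZ(tzinfo):
    # A tzinfo that reports no offset, so the datetime behaves as naive.
    def utcoffset(self, date_time):
        return None

    def dst(self, date_time):
        return None

data = {"timestamp_t": [datetime(2011, 1, 1, 1, 1, 1, tzinfo=NaiveTZ())]}

# The "timestamp_t" column comes out as "object" rather than "datetime64[ns]".
pd.DataFrame(data)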
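
For reference, the workaround mentioned above could look roughly like the
sketch below: drop the tzinfo from each value before building the expected
DataFrame, so Pandas infers a datetime64 column instead of "object".  The
helper name strip_tzinfo is hypothetical (it is not part of Pandas or Arrow),
and the sketch only covers the directly constructed frame, not the Arrow or
Spark side.

```python
from datetime import datetime, tzinfo

import pandas as pd


class NaiveTZ(tzinfo):
    # Same tzinfo as in the sample: it reports no UTC offset.
    def utcoffset(self, date_time):
        return None

    def dst(self, date_time):
        return None


def strip_tzinfo(values):
    # Hypothetical helper: return copies of the datetimes with tzinfo removed.
    return [value.replace(tzinfo=None) for value in values]


data = {"timestamp_t": [datetime(2011, 1, 1, 1, 1, 1, tzinfo=NaiveTZ())]}

# Build the expected frame from tz-stripped copies of each column.
expected = pd.DataFrame({name: strip_tzinfo(col) for name, col in data.items()})

# The column should now be a datetime64 dtype rather than "object".
print(expected.dtypes)
```

The original data dict is left untouched, so the tzinfo-carrying values are
still available for the Arrow conversion.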
