Re: [sqlalchemy] Generic File Loader issue with Booleans

Mike Bayer Wed, 28 Oct 2020 05:46:24 -0700


On Tue, Oct 27, 2020, at 11:56 PM, Andrew Martin wrote:
> This is probably a weirdly specific question, but I have a workflow that 
> involves loading lots of CSVs into Postgres. Some of them are very large, so 
> I want to cut overhead and not use CSV Dictreader. Wanted to use named tuples 
> instead.
> 
> Here's a header and a line from a file:
> 
> date_dim_id,date_actual,epoch,day_suffix,day_name,day_of_week,day_of_month,day_of_quarter,day_of_year,week_of_month,week_of_year,week_of_year_iso,month_actual,month_name,month_name_abbreviated,quarter_actual,quarter_name,year_actual,first_day_of_week,last_day_of_week,first_day_of_month,last_day_of_month,first_day_of_quarter,last_day_of_quarter,first_day_of_year,last_day_of_year,mmyyyy,mmddyyyy,weekend_indr
> 
> 
> Here's a line:
> 
> 20150101,1/1/15,1420070400,1st,Thursday ,4,1,1,1,1,1,2015-W01-4,1,January  
> ,Jan,1,First,2015,12/29/14,1/4/15,1/1/15,1/31/15,1/1/15,3/31/15,1/1/15,12/31/15,12015,1012015,0
> 
> 
> The idea was a generic file loader that takes a filename and a class and then 
> loads it. Here's the class:
> 
> 
> class DimDate(Base):
>     __tablename__ = "dim_date"
> 
>     date_dim_id = Column(Integer, primary_key=True)
>     date_actual = Column(Date, nullable=False, index=True)
>     epoch = Column(BigInteger, nullable=False)
>     day_suffix = Column(String(4), nullable=False)
>     day_name = Column(String(9), nullable=False)
>     day_of_week = Column(Integer, nullable=False)
>     day_of_month = Column(Integer, nullable=False)
>     day_of_quarter = Column(Integer, nullable=False)
>     day_of_year = Column(Integer, nullable=False)
>     week_of_month = Column(Integer, nullable=False)
>     week_of_year = Column(Integer, nullable=False)
>     week_of_year_iso = Column(CHAR(10), nullable=False)
>     month_actual = Column(Integer, nullable=False)
>     month_name = Column(String(9), nullable=False)
>     month_name_abbreviated = Column(CHAR(3), nullable=False)
>     quarter_actual = Column(Integer, nullable=False)
>     quarter_name = Column(String(9), nullable=False)
>     year_actual = Column(Integer, nullable=False)
>     first_day_of_week = Column(Date, nullable=False)
>     last_day_of_week = Column(Date, nullable=False)
>     first_day_of_month = Column(Date, nullable=False)
>     last_day_of_month = Column(Date, nullable=False)
>     first_day_of_quarter = Column(Date, nullable=False)
>     last_day_of_quarter = Column(Date, nullable=False)
>     first_day_of_year = Column(Date, nullable=False)
>     last_day_of_year = Column(Date, nullable=False)
>     mmyyyy = Column(CHAR(6), nullable=False)
>     mmddyyyy = Column(CHAR(10), nullable=False)
>     weekend_indr = Column(Boolean, nullable=False)
> 
> 
> What I ran into is a couple of things. 1. Python True/False throws errors 
> when writing to a Boolean. 2. You seem to have to int() a 1 or 0 to avoid a 
> dbapi error. 
> 
> Is this expected?


no, it's not expected, boolean True/False works on all backends without issue, 
assuming the target column is using the Boolean datatype at the SQLAlchemy 
level as well as that the actual column in the database is of the appropriate 
type, for PostgreSQL this would be BOOLEAN.   when you say "throws errors" 
please share complete stack traces and complete error messages, thanks.

can't do much more here without more specifics.  


> 
> 
> 
> 
> And does it have more to do with Postgres than it does SQLA?
> 
> Here's the code I came up with for generic file loader that actually works. 
> I'm sure it's garbage, and I'd like to refactor it, but it works.
> 
> def load_file(file_name, cls):
>     """assumes that csv file has headers and that headers match the
>     names of the columns for a given sqla class. Also assumes that the
>     sqla class is inheriting from meta.Base"""
>     inst = inspect(cls)
>     # attr_names = sorted([c_attr.key for c_attr in inst.mapper.column_attrs])
>     attr_vals = sorted([(c.name, c.type) for c in inst.c])
> 
> 
>     s = get_sqla_session()
> 
> 
>     with open(file_name, newline="") as infile:
>         reader = csv.reader(infile)
>         Data = namedtuple("Data", next(reader))
>         tuple_fields = Data._fields
>         for data in map(Data._make, [(c.strip() for c in row) for row in 
> reader]):
>             new_obj = cls()
>             for attr in attr_vals:
>                 for field in tuple_fields:
>                     if attr[0] == field:
>                         if str(attr[1]) == "BOOLEAN":
>                             setattr(new_obj, field, int(getattr(data, field)))
>                         else:
>                             # print(data, attr[1])
>                             setattr(new_obj, field, getattr(data, field))
> 
> 
>             new_obj.id = IDService.create_snowflake_id()
>             new_obj.is_deleted = 0
> 
> 
>             s.add(new_obj)
>     s.commit()
> 
> 
> This isn't really a big deal, and maybe not even appropriate, but I had an 
> absolute shit 12 hours trying to figure out why SQLA booleans are such a 
> hassle.
> 
> If anyone has thoughts, I'm happy to hear them.
> 
> Thank you!
> 

> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/43b9c281-978a-4def-a279-f518201bd884o%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/43b9c281-978a-4def-a279-f518201bd884o%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/7d8e7e38-9adb-4892-9f0c-65e598ce3d4d%40www.fastmail.com.

Re: [sqlalchemy] Generic File Loader issue with Booleans

Reply via email to