Re: [sqlalchemy] Generic File Loader issue with Booleans

Andrew Martin Wed, 28 Oct 2020 07:34:24 -0700

I will reproduce this when I get done with work this evening and give the 
specifics.


On Wednesday, October 28, 2020 at 7:46:26 AM UTC-5, Mike Bayer wrote:
>
>
>
> On Tue, Oct 27, 2020, at 11:56 PM, Andrew Martin wrote:
>
> This is probably a weirdly specific question, but I have a workflow that 
> involves loading lots of CSVs into Postgres. Some of them are very large, 
> so I want to cut overhead and not use CSV Dictreader. Wanted to use named 
> tuples instead.
>
> Here's a header and a line from a file:
>
> date_dim_id,date_actual,epoch,day_suffix,day_name,day_of_week,day_of_month
> ,day_of_quarter,day_of_year,week_of_month,week_of_year,week_of_year_iso,
> month_actual,month_name,month_name_abbreviated,quarter_actual,quarter_name
> ,year_actual,first_day_of_week,last_day_of_week,first_day_of_month,
> last_day_of_month,first_day_of_quarter,last_day_of_quarter,
> first_day_of_year,last_day_of_year,mmyyyy,mmddyyyy,weekend_indr
>
>
> Here's a line:
>
> 20150101,1/1/15,1420070400,1st,Thursday ,4,1,1,1,1,1,2015-W01-4,1,January 
>  
> ,Jan,1,First,2015,12/29/14,1/4/15,1/1/15,1/31/15,1/1/15,3/31/15,1/1/15,12/31/15,12015,1012015,0
>
>
> The idea was a generic file loader that takes a filename and a class and 
> then loads it. Here's the class:
>
>
> class DimDate(Base):
>     __tablename__ = "dim_date"
>
>     date_dim_id = Column(Integer, primary_key=True)
>     date_actual = Column(Date, nullable=False, index=True)
>     epoch = Column(BigInteger, nullable=False)
>     day_suffix = Column(String(4), nullable=False)
>     day_name = Column(String(9), nullable=False)
>     day_of_week = Column(Integer, nullable=False)
>     day_of_month = Column(Integer, nullable=False)
>     day_of_quarter = Column(Integer, nullable=False)
>     day_of_year = Column(Integer, nullable=False)
>     week_of_month = Column(Integer, nullable=False)
>     week_of_year = Column(Integer, nullable=False)
>     week_of_year_iso = Column(CHAR(10), nullable=False)
>     month_actual = Column(Integer, nullable=False)
>     month_name = Column(String(9), nullable=False)
>     month_name_abbreviated = Column(CHAR(3), nullable=False)
>     quarter_actual = Column(Integer, nullable=False)
>     quarter_name = Column(String(9), nullable=False)
>     year_actual = Column(Integer, nullable=False)
>     first_day_of_week = Column(Date, nullable=False)
>     last_day_of_week = Column(Date, nullable=False)
>     first_day_of_month = Column(Date, nullable=False)
>     last_day_of_month = Column(Date, nullable=False)
>     first_day_of_quarter = Column(Date, nullable=False)
>     last_day_of_quarter = Column(Date, nullable=False)
>     first_day_of_year = Column(Date, nullable=False)
>     last_day_of_year = Column(Date, nullable=False)
>     mmyyyy = Column(CHAR(6), nullable=False)
>     mmddyyyy = Column(CHAR(10), nullable=False)
>     weekend_indr = Column(Boolean, nullable=False)
>
>
> What I ran into is a couple of things. 1. Python True/False throws errors 
> when writing to a Boolean. 2. You seem to have to int() a 1 or 0 to avoid a 
> dbapi error. 
>
> Is this expected?
>
>
> no, it's not expected, boolean True/False works on all backends without 
> issue, assuming the target column is using the Boolean datatype at the 
> SQLAlchemy level as well as that the actual column in the database is of 
> the appropriate type, for PostgreSQL this would be BOOLEAN.   when you say 
> "throws errors" please share complete stack traces and complete error 
> messages, thanks.
>
> can't do much more here without more specifics.  
>
>
>
>
>
>
> And does it have more to do with Postgres than it does SQLA?
>
> Here's the code I came up with for generic file loader that actually 
> works. I'm sure it's garbage, and I'd like to refactor it, but it works.
>
> def load_file(file_name, cls):
>     """assumes that csv file has headers and that headers match the
>     names of the columns for a given sqla class. Also assumes that the
>     sqla class is inheriting from meta.Base"""
>     inst = inspect(cls)
>     # attr_names = sorted([c_attr.key for c_attr in 
> inst.mapper.column_attrs])
>     attr_vals = sorted([(c.name, c.type) for c in inst.c])
>
>
>     s = get_sqla_session()
>
>
>     with open(file_name, newline="") as infile:
>         reader = csv.reader(infile)
>         Data = namedtuple("Data", next(reader))
>         tuple_fields = Data._fields
>         for data in map(Data._make, [(c.strip() for c in row) for row in 
> reader]):
>             new_obj = cls()
>             for attr in attr_vals:
>                 for field in tuple_fields:
>                     if attr[0] == field:
>                         if str(attr[1]) == "BOOLEAN":
>                             setattr(new_obj, field, int(getattr(data, 
> field)))
>                         else:
>                             # print(data, attr[1])
>                             setattr(new_obj, field, getattr(data, field))
>
>
>             new_obj.id = IDService.create_snowflake_id()
>             new_obj.is_deleted = 0
>
>
>             s.add(new_obj)
>     s.commit()
>
>
> This isn't really a big deal, and maybe not even appropriate, but I had an 
> absolute shit 12 hours trying to figure out why SQLA booleans are such a 
> hassle.
>
> If anyone has thoughts, I'm happy to hear them.
>
> Thank you!
>
>
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlal...@googlegroups.com <javascript:>.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/43b9c281-978a-4def-a279-f518201bd884o%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/43b9c281-978a-4def-a279-f518201bd884o%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/bae8f31f-8447-49e0-827c-30221c6b6af1o%40googlegroups.com.

Re: [sqlalchemy] Generic File Loader issue with Booleans

Reply via email to