Re: [sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread Richard Damon
On 4/19/20 4:44 PM, Benjamin Taub wrote:
> Thanks for taking the time to respond, Richard. I may be thinking
> about the command line option that you mentioned. However, I do see
> that MySQL has a LOAD DATA statement
> (https://dev.mysql.com/doc/refman/8.0/en/load-data.html) that, I
> think, does what I'm thinking about. Similarly, Postgres has COPY
> (https://www.postgresql.org/docs/9.2/sql-copy.html) and SQL Server has
> BULK INSERT
> (https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver15).
>
> These seem to be embedded in the related SQL implementations but are
> clearly not ANSI standard. I'm not sure if that makes it disqualifying
> for a SQLAlchemy feature request, or if anyone else could even use it,
> but functionality like this is something that, at least for me, would
> make my implementation more DB independent.
>
> Anyhow, thanks again for your note and your work on SQLAlchemy. I
> appreciate it.
>
> Ben

I'll admit those weren't commands I was familiar with, but since they are
DB-specific, they're the sort of thing I try to minimize the use of.

-- 
Richard Damon

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/18784b2d-85c7-80ab-c268-a9fdd0b21d4b%40Damon-Family.org.


Re: [sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread Benjamin Taub
Thanks for taking the time to respond, Richard. I may be thinking about the
command line option that you mentioned. However, I do see that MySQL has a
LOAD DATA statement (https://dev.mysql.com/doc/refman/8.0/en/load-data.html)
that, I think, does what I'm thinking about. Similarly, Postgres has COPY (
https://www.postgresql.org/docs/9.2/sql-copy.html) and SQL Server has BULK
INSERT (
https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver15
).

These statements seem to be embedded in the respective SQL implementations
but are clearly not ANSI standard. I'm not sure if that disqualifies this as
a SQLAlchemy feature request, or if anyone else could even use it, but
functionality like this is something that, at least for me, would make my
implementation more DB-independent.
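[The three vendor statements above could, in principle, be dispatched from a single helper keyed on the dialect name. A minimal sketch; the helper name and the specific options shown are hypothetical, and a real deployment would need to vet delimiters, header handling, LOCAL vs. server-side paths, and quoting of untrusted input:]

```python
def bulk_load_sql(dialect: str, table: str, path: str) -> str:
    """Return a vendor-specific bulk-load statement for a CSV file.

    Hypothetical helper: statement shapes follow each vendor's docs,
    but real option sets (delimiters, headers, file location) vary
    by deployment.
    """
    if dialect == "mysql":
        return (f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE {table} "
                "FIELDS TERMINATED BY ','")
    if dialect == "postgresql":
        return f"COPY {table} FROM '{path}' WITH (FORMAT csv)"
    if dialect == "mssql":
        return f"BULK INSERT {table} FROM '{path}' WITH (FORMAT = 'CSV')"
    raise ValueError(f"no native bulk loader known for dialect {dialect!r}")

print(bulk_load_sql("postgresql", "users", "/tmp/users.csv"))
```

[The caller would hand the returned string to the driver for execution; the DB-specific part stays in one place.]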

Anyhow, thanks again for your note and your work on SQLAlchemy. I
appreciate it.

Ben
__
Benjamin Taub, CEO
Direct: (734) 585-3503
Check out Talentspace to see a list of prescreened analytics and data
engineering candidates
Blog: http://www.dataspace.com/insights/blog/
Twitter: twitter.com/dataspaceinc

*NOTE: I usually check email only 2 - 3 times / day.  For issues needing
immediate attention, please call.*


On Sun, Apr 19, 2020 at 3:59 PM Richard Damon wrote:

>
> I can't think of any SQL engines where the engine itself can read a CSV
> file to load a database (in many cases, the actual SQL engine is off on
> another machine with the database and has no direct link to the local
> file system). Like SQLite, many have a command-line interface that can
> read the file and insert it into the database.
>
> Also, some higher-end wrappers might provide such a feature, but that is
> NOT part of the base SQL language.
>
> --
> Richard Damon
>



Re: [sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread Richard Damon
On 4/19/20 3:01 PM, Ben wrote:
> Hi, James, thank you for the info. I'll give this a try. Still, it
> does seem to require that the data be read into Python and then
> written out to the DB, albeit quickly. What I'd prefer is a way to
> issue a call to the DB to have it 'suck in' some csv file directly,
> rather than having to read it into Python first. Most databases have a
> bulk loader to handle such jobs so I would think it would be possible
> to provide a vendor-neutral way to call this functionality but I could
> be missing something. Perhaps I should look at adding this to the
> SQLAlchemy github project but I'm not sure that the world would want
> to rely on my code :) 
>
> In any case, thank you so much for taking the time to reply.
>
> Ben

I can't think of any SQL engines where the engine itself can read a CSV
file to load a database (in many cases, the actual SQL engine is off on
another machine with the database and has no direct link to the local
file system). Like SQLite, many have a command-line interface that can
read the file and insert it into the database.

Also, some higher-end wrappers might provide such a feature, but that is
NOT part of the base SQL language.

-- 
Richard Damon



Re: [sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread Ben
Hi, James, thank you for the info. I'll give this a try. Still, it does
seem to require that the data be read into Python and then written out to
the DB, albeit quickly. What I'd prefer is a way to issue a call to the DB
to have it 'suck in' some CSV file directly, rather than having to read it
into Python first. Most databases have a bulk loader to handle such jobs, so
I would think it would be possible to provide a vendor-neutral way to call
this functionality, but I could be missing something. Perhaps I should look
at adding this to the SQLAlchemy GitHub project, but I'm not sure that the
world would want to rely on my code :)

In any case, thank you so much for taking the time to reply.

Ben

On Sunday, April 19, 2020 at 1:23:06 PM UTC-4, James Fennell wrote:
>
> Hi Ben,
>
> Have you checked out bulk operations? 
> https://docs.sqlalchemy.org/en/13/orm/persistence_techniques.html#bulk-operations
>
> Bulk operations provide a DB-agnostic API for doing large numbers of 
> inserts and/or updates. The speed up you see, compared to session.add, 
> depends on your database and the database connection arguments. With 
> Postgres, I've needed to enable batch mode to see the fullest benefit: 
> https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#psycopg2-batch-mode
>
> Personally I've seen huge performance improvements in my applications after 
> migrating big operations from session.add over to the bulk API.
>
> Of course, as you say, you can do more low level SQL calls to get it even 
> faster, but then you run into a bunch of other issues.
>
> James
>
>
> On Sun, Apr 19, 2020, 12:46 PM Ben wrote:
>
>> I hope this is the right place for this... I need to load large files 
>> into my database. As I understand it, I can do this in one of two ways with 
>> SQLAlchemy Core: 1) Bring the data into Python and then write it out with 
>> the *add* method or, alternatively, 2) Use SQLAlchemy to issue a command 
>> to the DB to use its native bulk loader to read data from my file. I would 
>> expect this second approach to be faster, to require less code, and to 
>> avoid issues such as trying to put too much in memory at one time. However, 
>> it is DB-vendor-specific (i.e. I believe the command I send to a MySQL DB 
>> will differ from that I send to a Postgres DB). 
>>
>> So, 
>>
>>- Do I properly understand SQLAlchemy's capabilities here or am I 
>>missing something?
>>- If I do have this right, is generic access to bulk loaders 
>>something that is on the upgrade / new development list?
>>
>> Sorry if this isn't the right place for this.
>> Thanks!
>> Ben
>>



Re: [sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread James Fennell
Hi Ben,

Have you checked out bulk operations?
https://docs.sqlalchemy.org/en/13/orm/persistence_techniques.html#bulk-operations

Bulk operations provide a DB-agnostic API for doing large numbers of
inserts and/or updates. The speed up you see, compared to session.add,
depends on your database and the database connection arguments. With
Postgres, I've needed to enable batch mode to see the fullest benefit:
https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#psycopg2-batch-mode

Personally I've seen huge performance improvements in my applications after
migrating big operations from session.add over to the bulk API.
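[The bulk methods ultimately funnel many rows into the DBAPI's executemany in few round trips, rather than one flush per ORM object. A rough stdlib-only sketch of that underlying pattern, using sqlite3 in place of a real server (table and column names are invented for illustration):]

```python
import sqlite3

# In-memory SQLite stands in for a real server; the point is the
# many-rows-per-statement shape that the bulk APIs reduce to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, f"user{i}") for i in range(10_000)]

# One prepared statement reused for every row, instead of a separate
# INSERT flushed per object as with repeated session.add().
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])  # 10000
```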

Of course, as you say, you can drop down to lower-level SQL calls to get it
even faster, but then you run into a bunch of other issues.

James


On Sun, Apr 19, 2020, 12:46 PM Ben wrote:

> I hope this is the right place for this... I need to load large files into
> my database. As I understand it, I can do this in one of two ways with
> SQLAlchemy Core: 1) Bring the data into Python and then write it out with
> the *add* method or, alternatively, 2) Use SQLAlchemy to issue a command
> to the DB to use its native bulk loader to read data from my file. I would
> expect this second approach to be faster, to require less code, and to
> avoid issues such as trying to put too much in memory at one time. However,
> it is DB-vendor-specific (i.e. I believe the command I send to a MySQL DB
> will differ from that I send to a Postgres DB).
>
> So,
>
>- Do I properly understand SQLAlchemy's capabilities here or am I
>missing something?
>- If I do have this right, is generic access to bulk loaders something
>that is on the upgrade / new development list?
>
> Sorry if this isn't the right place for this.
> Thanks!
> Ben
>



[sqlalchemy] SQLAlchemy Access to Native Bulk Loaders

2020-04-19 Thread Ben
I hope this is the right place for this... I need to load large files into
my database. As I understand it, I can do this in one of two ways with
SQLAlchemy Core: 1) bring the data into Python and then write it out with
the *add* method or, alternatively, 2) use SQLAlchemy to issue a command to
the DB to use its native bulk loader to read data from my file. I would
expect this second approach to be faster, to require less code, and to
avoid issues such as trying to put too much in memory at one time. However,
it is DB-vendor-specific (i.e., I believe the command I send to a MySQL DB
will differ from the one I send to a Postgres DB).

So, 

   - Do I properly understand SQLAlchemy's capabilities here or am I 
   missing something?
   - If I do have this right, is generic access to bulk loaders something 
   that is on the upgrade / new development list?

Sorry if this isn't the right place for this.
Thanks!
Ben
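[One note on approach 1: it need not hold the whole file in memory. A sketch of chunked CSV loading against any DBAPI connection; sqlite3 stands in for a real server, and the helper name is invented:]

```python
import csv
import os
import sqlite3
import tempfile
from itertools import islice

def load_csv_in_chunks(conn, table, path, chunk_size=1000):
    """Stream a CSV into a table without reading the whole file into memory."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(header)
        marks = ", ".join("?" for _ in header)
        sql = f"INSERT INTO {table} ({cols}) VALUES ({marks})"
        while True:
            # Pull at most chunk_size rows from the file at a time.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            conn.executemany(sql, chunk)
    conn.commit()

# Demo against an in-memory database and a small generated file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["a", "b"])
    w.writerows([[str(i), str(i * 2)] for i in range(2500)])
load_csv_in_chunks(conn, "t", path)
os.remove(path)
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2500
```

[It's still slower than a native server-side bulk loader, but it caps memory use at one chunk regardless of file size.]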
