subject:"\[sqlalchemy\] Difficulties with parallel data insertion into the same table"

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Jonathan Vanasco


On Friday, January 23, 2015 at 1:10:37 PM UTC-5, SF Markus Elfring wrote:
>
> How often do you need to fill these data structures in 
> a concurrent way? 


> Does parallel table creation become more interesting then? 
>
>
Often and Not at All.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

> Should the Python class be sufficient for the definition
> of a table structure?
> 
> If you're using the declarative syntax, yes.

Thanks for your acknowledgement.


> It's common to have a `models.py` file that simply defines
> the classes in one place; then that is imported and metadata
> associated to the engine.

How often do you need to fill these data structures in
a concurrent way?

Does parallel table creation become more interesting then?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Jonathan Vanasco

On Friday, January 23, 2015 at 12:48:46 PM UTC-5, SF Markus Elfring wrote:
>
> Should the Python class be sufficient for the definition 
> of a table structure? 
>

If you're using the declarative syntax, yes.   It's common to have a 
`models.py` file that simply defines the classes in one place; then that is 
imported and metadata associated to the engine.  

> Will the mapping interface work also without tables 
> that were created by other SQL scripts before? 

SqlAlchemy is one of the very few ORMs that does not impose any design 
requirements.  Classes/Tables can be easily mapped to any legacy scenario. 
 The only issue that arises is if there are complex relationships/joins -- 
then special mappers may be needed.  There is an extended section in the 
docs for that, but it is a rare and advanced topic.

There are also 3rd party packages like sqlacodegen 
(https://pypi.python.org/pypi/sqlacodegen) that can query a legacy database 
for you and generate the corresponding SqlAlchemy python classes.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

>  How are you currently specifying the meta-data?

Should the Python class be sufficient for the definition
of a table structure?

Will the mapping interface work also without tables
that were created by other SQL scripts before?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Jonathan Vanasco



On Friday, January 23, 2015 at 12:39:02 PM UTC-5, SF Markus Elfring wrote:
>
> Is the reason good enough to avoid the repeated specification 
> of corresponding meta-data?  

Is it safer to maintain and manage column attributes for some 
> tables only at a single place? 
>

 How are you currently specifying the meta-data?  Most SqlAlchemy 
implementations will only specify this once in the application.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

> If the table structure/name is known and expected to be used
> -- there's not really a good reason to defer creating it .  

Is the reason good enough to avoid the repeated specification
of corresponding meta-data?
Is it safer to maintain and manage column attributes for some
tables only at a single place?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

> I wonder why you’re asking of the SQLAlchemy list about
> a specific developmental goal of the Postgresql project?

I hoped that some more corresponding experiences could
already be shared here.


> Wouldn’t you ask them about this?

That might follow ...


How should I add the parameter "IF NOT EXISTS" to Python
classes in the meantime eventually?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Jonathan Vanasco

On Friday, January 23, 2015 at 11:30:33 AM UTC-5, SF Markus Elfring wrote:
>
> Are there any chances that this database software 
> implementation will become robust and safe against 
> the discussed race condition? 

I would not count on this happening in the near future as it doesn't seem 
to be slated as a bug to be worked on, but you can ask on one of the 
postgres lists (http://www.postgresql.org/list/). 

As others noted, dynamic table creation within an application's runtime 
isn't really a standard approach -- it usually only happens in specific 
situations (multi-tenancy platforms, admin interfaces, automated plugin 
installations [ie, wordpress, drupal, etc]).   Having multiple processes 
attempt to dynamically create tables in a high-concurrency situation is 
even less standard.  The status-quo is to have some sort of setup script, 
or a setup routine that happens before forking threads (or whatever).  If 
the table structure/name is known and expected to be used -- there's not 
really a good reason to defer creating it .  

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Michael Bayer



SF Markus Elfring  wrote:

> 
> 
>> There would still be a race condition within the
>> Postgres internal functions.
> 
> Are there any chances that this database software
> implementation will become robust and safe against
> the discussed race condition?


I wonder why you’re asking of the SQLAlchemy list about a specific
developmental goal of the Postgresql project? Wouldn’t you ask them about
this?

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

> Using "IF NOT EXISTS" would not solve this problem
> in a high concurrency scenario.

Thanks for your feedback.


> There would still be a race condition within the
> Postgres internal functions.

Are there any chances that this database software
implementation will become robust and safe against
the discussed race condition?


> Have you tried using savepoints?

I am going to use a serial database preparation step
instead for my application so that the corresponding
tables will be explicitly deleted and created by
a small SQL script.

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Jonathan Vanasco

Using "IF NOT EXISTS" would not solve this problem in a high concurrency 
scenario.  

There would still be a race condition within the Postgres internal 
functions.  This is because of how Postgres checks for existing tables and 
creates new ones with its internal bookkeeping.  It's explained in the link 
that Mike shared above.

Have you tried using savepoints?  SqlAlchemy handles them within the 
"nested transaction" concept.  I'm not sure how Postgres would handle race 
conditions of table creation within a savepoint, but it might get you past 
this issue.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Simon King

On Fri, Jan 23, 2015 at 12:17 PM, SF Markus Elfring
 wrote:
>>> The parameter "IF NOT EXISTS" could be passed to the SQL statement "CREATE 
>>> TABLE".
>>>
>>> * Is this setting already used by the class library "SQLAlchemy 0.9.8-78.1"?
>>
>> From a quick scan of the docs it appears not.
>
> How can parameter additions be achieved for this software?
>

The general mechanism for generating SQL that isn't directly supported
by SQLAlchemy is to use the "compiler" module:

http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html

I guess it's the CreateTable object you'd need to adapt:

http://docs.sqlalchemy.org/en/rel_0_9/core/ddl.html

Although this sounds to me like a reasonable flag to add to core (it's
supported by multiple database implementations), so perhaps if you
added support for it directly into SQLAlchemy, Mike might accept that
patch.

>
>> Are you asking if "IF NOT EXISTS" will work in a high-concurrency context?
>
> Yes.
>
> How many database implementations support the simultaneous table creation
> according to the rules from transaction management?
>

No idea. You'd need to ask the makers of all the databases you are
interested in.

Simon

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

>> The parameter "IF NOT EXISTS" could be passed to the SQL statement "CREATE 
>> TABLE".
>>
>> * Is this setting already used by the class library "SQLAlchemy 0.9.8-78.1"?
> 
> From a quick scan of the docs it appears not.

How can parameter additions be achieved for this software?


> Are you asking if "IF NOT EXISTS" will work in a high-concurrency context?

Yes.

How many database implementations support the simultaneous table creation
according to the rules from transaction management?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Simon King

On Fri, Jan 23, 2015 at 11:45 AM, SF Markus Elfring
 wrote:
>>> I would expect that database implementations will provide functionality
>>> for parallel updates including concurrent creation of each table.
>>
>> What would you expect a database to do if it receives 2 "CREATE TABLE
>> my_table(...)" instructions simultaneously?
>
> This depends on the passed parameters.
>

That makes the rules complicated. At the moment, I imagine the rules
are very simple - a "CREATE TABLE" statement will fail if the table
already exists, unless "IF NOT EXISTS" is passed. Changing the rules
to allow an identical CREATE TABLE statement to silently do nothing
sounds like a lot of work for very little benefit to me.

>
>> What if the table definitions are different?
>
> I am going to pass the same settings for the application I am developing
> at the moment.

(I was talking about the general case, not your specific case)

>
>
>> One of the instructions would *have* to fail.
>
> Not in every case.
>
> The parameter "IF NOT EXISTS" could be passed to the SQL statement "CREATE 
> TABLE".
>
> * Is this setting already used by the class library "SQLAlchemy 0.9.8-78.1"?

>From a quick scan of the docs it appears not.

>
> * Does it really work in the current praxis?
>

Are you asking if "IF NOT EXISTS" will work in a high-concurrency
context? I've no idea, but this is a very PostgreSQL-specific
question, so you'd probably get a better answer from them.

Simon

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread SF Markus Elfring

>> I would expect that database implementations will provide functionality
>> for parallel updates including concurrent creation of each table.
> 
> What would you expect a database to do if it receives 2 "CREATE TABLE
> my_table(...)" instructions simultaneously?

This depends on the passed parameters.


> What if the table definitions are different?

I am going to pass the same settings for the application I am developing
at the moment.


> One of the instructions would *have* to fail.

Not in every case.

The parameter "IF NOT EXISTS" could be passed to the SQL statement "CREATE 
TABLE".

* Is this setting already used by the class library "SQLAlchemy 0.9.8-78.1"?

* Does it really work in the current praxis?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-23 Thread Simon King

On Fri, Jan 23, 2015 at 7:20 AM, SF Markus Elfring
 wrote:
>>> I would appreciate if I can fill these data structures in parallel
>>> without a serial database preparation step (table creation with
>>> repeated meta-data specification).
>>
>> You’d need to implement checks for this concurrency.
>
> I would expect that database implementations will provide functionality
> for parallel updates including concurrent creation of each table.
>

What would you expect a database to do if it receives 2 "CREATE TABLE
my_table(...)" instructions simultaneously? What if the table
definitions are different? One of the instructions would *have* to
fail.

>
>> A create table would need to be preceded by a check to see that it exists,
>> for example, and that would likely need to be mutexed on that name so that
>> no race condition occurs in between the time that the name is checked
>> vs. the create table is emitted.
>
> I am curious to clarify more software development challenges.
> How many open issues are already known around such implementation details?
>
> Regards,
> Markus
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread SF Markus Elfring

>> I would appreciate if I can fill these data structures in parallel
>> without a serial database preparation step (table creation with
>> repeated meta-data specification).
> 
> You’d need to implement checks for this concurrency.

I would expect that database implementations will provide functionality
for parallel updates including concurrent creation of each table.


> A create table would need to be preceded by a check to see that it exists,
> for example, and that would likely need to be mutexed on that name so that
> no race condition occurs in between the time that the name is checked
> vs. the create table is emitted.

I am curious to clarify more software development challenges.
How many open issues are already known around such implementation details?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread Michael Bayer

SF Markus Elfring  wrote:

> 
>> So there is no valid use case for parallel generation of tables
>> except in particular kinds of multi-tenancy situations.
> 
> I find my use case simple enough (and therefore very valid).
> I am going to manage two database tables by corresponding
> Python classes with SQLAlchemy services.
> 
> I would appreciate if I can fill these data structures in parallel
> without a serial database preparation step (table creation with
> repeated meta-data specification).

You’d need to implement checks for this concurrency.   A create table would 
need to be preceded by a check to see that it exists, for example, and that 
would likely need to be mutexed on that name so that no race condition occurs 
in between the time that the name is checked vs. the create table is emitted.   
if the mutex is on the database end, then you’d need to use Postgresql’s 
facilities; pg_advisory_lock() appears to be session level 
(http://www.postgresql.org/docs/9.1/static/functions-admin.html) , which is 
probably not sufficient if you are using multiple processes.  

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread SF Markus Elfring


> So there is no valid use case for parallel generation of tables
> except in particular kinds of multi-tenancy situations.

I find my use case simple enough (and therefore very valid).
I am going to manage two database tables by corresponding
Python classes with SQLAlchemy services.

I would appreciate if I can fill these data structures in parallel
without a serial database preparation step (table creation with
repeated meta-data specification).

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread Jonathan Vanasco

>
> On Thursday, January 22, 2015 at 2:40:19 PM UTC-5, SF Markus Elfring wrote:
> I am surprised that this database software show such (unexpected) 
> behaviour.
>

Aside from what Michael wrote...

The reason why you're seeing an IntegrityError like that, is because 
Postgres is raising an integrity error on it's internal tables.  The `pg_` 
prefix is well known to relate to internal postgres system data. 
 PostgreSQL explicitly reserves the `pg_` prefix for it's own schemas and 
advises users against creating tables that begin with `pg_` as well 
(http://www.postgresql.org/docs/9.2/static/ddl-schemas.html).  So to those 
with experience in Postgres, this wouldn't be unexpected error and 
accurately explains what the underlying problem is.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread Michael Bayer

SF Markus Elfring  wrote:

> 
>> The concurrent creation of tables is an extremely unusual use case
>> that I would not recommend, but especially when two different
>> threads/processes are contending to create the same table,
> 
> Should this use case become more known after the number of processor
> cores grew in various computers through the years?

not at all.

The schema of a relational database is fixed in a similar way as to the 
structure of the software itself.  Just like it’s not a great idea to 
dynamically produce and compile new code within an application in order to 
solve an issue, it’s not a great idea to produce new schema structures on the 
fly. Decent relational database design will typically have as many as a few 
dozen tables in order to handle a particular component of a reasonably-designed 
application; databases that have hundreds of tables are typically dealing with 
multiple sub-components of an application.  A schema that has “N” tables 
with no upper bound doesn’t actually have any schema design, the app is just 
generating tables in response to data input.   So there is no valid use case 
for parallel generation of tables except in particular kinds of multi-tenancy 
situations.

> 
> 
>> that suggests the system attempts to encode data within the database
>> schema itself; that is, the existence of a table “XYZ” in fact
>> represents the data “XYZ” being present.
> 
> I do not try to achieve such a data encoding for my software application
> at the moment.
> 
> 
>> Otherwise, the PG developers in that thread suggest this use case
>> can be facilitated using postgres advisory locks.
> 
> I am going to try another approach out.
> 
> * Will it be more useful here to extend a serial database preparation step
>  before records will be stored by background processes simultaneously?
> 
> * How often will corresponding implementation details need to be redesigned?
> 
> Regards,
> Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread SF Markus Elfring

> However in this case, this appears to be a side effect of attempting
> to issue CREATE TABLE statements in parallel, and in particular attempting
> to create a table of the same name twice in two different processes.

It seems that I dared this approach.


> PG developers describe this expected behavior here:
> http://www.postgresql.org/message-id/ca+tgmozadyvtwbfp1fl2smzbihcwt4uprzrlnnx1nb30ku3...@mail.gmail.com

Does the dialogue on a topic like "Errors on CREATE TABLE IF NOT EXISTS"
show any remaining open issues?


> I’m not surprised that PG does not prioritize attempting to make
> this error more user friendly as this is a very odd and probably
> unnecessary use case.

I am surprised that this database software show such (unexpected) behaviour.


> this has nothing to do with the ORM nor does it really have much
> to do with SQLAlchemy as a whole.

It makes it more convenient to stress other software components,
doesn't it?


> You would have this issue in exactly the same way if you were emitting
> conflicting CREATE TABLE statements on the DBAPI cursor directly.

Thanks for your acknowledgement.


> The concurrent creation of tables is an extremely unusual use case
> that I would not recommend, but especially when two different
> threads/processes are contending to create the same table,

Should this use case become more known after the number of processor
cores grew in various computers through the years?


> that suggests the system attempts to encode data within the database
> schema itself; that is, the existence of a table “XYZ” in fact
> represents the data “XYZ” being present.

I do not try to achieve such a data encoding for my software application
at the moment.


> Otherwise, the PG developers in that thread suggest this use case
> can be facilitated using postgres advisory locks.

I am going to try another approach out.

* Will it be more useful here to extend a serial database preparation step
  before records will be stored by background processes simultaneously?

* How often will corresponding implementation details need to be redesigned?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread Michael Bayer

SF Markus Elfring  wrote:

> Hello,
> 
> I try to write some data from a source code analysis
> which is performed by a few processor cores simultaneously
> into a single table.
> Now I stumble on a message like the following again.
> 
> "…
> sqlalchemy.exc.IntegrityError: (IntegrityError) duplicate key value violates 
> unique constraint "pg_type_typname_nsp_index"
> DETAIL:  Key (typname, typnamespace)=(positions_parallel1_line_seq, 2200) 
> already exists.
> '\nCREATE TABLE positions_parallel1 (\n\tfunction […], pattern)\n)\n\n' {}
> …"
> 
> 
> The following software components were involved for this
> application test on my openSUSE Tumbleweed system.
> 
> 1. SQLAlchemy 0.9.8-78.1
> 2. Psycopg 2.5.2-3.1
> 3. PostgreSQL 9.3.5-3.1
> 4. Python 2.7.9-2.1
> 5. Coccinelle spatch 1.0.0-rc23
> 6. make 4.1-2.2
> 
> 
> I have searched a bit on the internet.

Normally, this is an integrity constraint error which is raised by your 
database, which occurs when a process attempts to INSERT or UPDATE a row to 
include a key that already exists.   When performing an INSERT or UPDATE you 
need to ensure that the row will not duplicate a uniquely-constraint value that 
is already present in the table.

However in this case, this appears to be a side effect of attempting to issue 
CREATE TABLE statements in parallel, and in particular attempting to create a 
table of the same name twice in two different processes.   PG developers 
describe this expected behavior here: 
http://www.postgresql.org/message-id/ca+tgmozadyvtwbfp1fl2smzbihcwt4uprzrlnnx1nb30ku3...@mail.gmail.com.
  I’m not surprised that PG does not prioritize attempting to make this error 
more user friendly as this is a very odd and probably unnecessary use case.

> Are there any further
> software development challenges to consider for the parallel
> creation of database tables
> with the object-relational interface?

this has nothing to do with the ORM nor does it really have much to do with 
SQLAlchemy as a whole.   SQLAlchemy can be seen as a comprehensive system to 
automate the sending and receiving of messages along a database connection.   
You would have this issue in exactly the same way if you were emitting 
conflicting CREATE TABLE statements on the DBAPI cursor directly.

The concurrent creation of tables is an extremely unusual use case that I would 
not recommend, but especially when two different threads/processes are 
contending to create the same table, that suggests the system attempts to 
encode data within the database schema itself; that is, the existence of a 
table “XYZ” in fact represents the data “XYZ” being present.   There are some 
high volume websites that do things like this, like Reddit, but I think they’re 
nuts.   But certainly, for an application that isn’t handling millions of 
simultaneous users or hundreds of billions of rows, such an architecture is 
absolutely not called for.  Data should be stored in rows, not in the names of 
tables.

Otherwise, the PG developers in that thread suggest this use case can be 
facilitated using postgres advisory locks.  I have no experience with those, 
but see that thread for more details.

> 
> Regards,
> Markus
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

[sqlalchemy] Difficulties with parallel data insertion into the same table

2015-01-22 Thread SF Markus Elfring

Hello,

I try to write some data from a source code analysis
which is performed by a few processor cores simultaneously
into a single table.
Now I stumble on a message like the following again.

"…
sqlalchemy.exc.IntegrityError: (IntegrityError) duplicate key value violates 
unique constraint "pg_type_typname_nsp_index"
DETAIL:  Key (typname, typnamespace)=(positions_parallel1_line_seq, 2200) 
already exists.
 '\nCREATE TABLE positions_parallel1 (\n\tfunction […], pattern)\n)\n\n' {}
…"


The following software components were involved for this
application test on my openSUSE Tumbleweed system.

1. SQLAlchemy 0.9.8-78.1
2. Psycopg 2.5.2-3.1
3. PostgreSQL 9.3.5-3.1
4. Python 2.7.9-2.1
5. Coccinelle spatch 1.0.0-rc23
6. make 4.1-2.2


I have searched a bit on the internet. Are there any further
software development challenges to consider for the parallel
creation of database tables with the object-relational interface?

Regards,
Markus

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

Re: [sqlalchemy] Difficulties with parallel data insertion into the same table

[sqlalchemy] Difficulties with parallel data insertion into the same table

24 matches

Site Navigation

Mail list logo

Footer information