Re: Spark SQL Transaction

2016-04-23 Thread Andrés Ivaldi
Thanks, I'll take a look at JdbcUtils. Regards. On Sat, Apr 23, 2016 at 2:57 PM, Todd Nist wrote: > I believe the class you are looking for is > org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala. > > By default in savePartition(...), it will do the following:

Re: Spark SQL Transaction

2016-04-23 Thread Todd Nist
I believe the class you are looking for is org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala. By default in savePartition(...), it will do the following: if (supportsTransactions) { conn.setAutoCommit(false) // Everything in the same db transaction. } Then at line 224, it will
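The per-partition pattern described above (disable autocommit, batch the inserts, commit once) can be sketched outside of Spark. This is not the Scala source itself, just an illustration of the same transaction semantics using Python's sqlite3 module as a stand-in for the MSSQL JDBC connection; the table and flag names are made up for the example.

```python
import sqlite3

# Mimic the savePartition(...) flow: one transaction per partition.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (col1 INTEGER)")

supports_transactions = True  # mirrors the supportsTransactions flag
if supports_transactions:
    conn.isolation_level = None  # take manual control of transactions
    conn.execute("BEGIN")        # everything in the same db transaction

rows = [(1,), (2,), (3,)]  # stand-in for the partition's rows
conn.executemany("INSERT INTO target (col1) VALUES (?)", rows)

if supports_transactions:
    conn.execute("COMMIT")       # one commit for the whole partition

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 3
```

With JDBC the equivalent calls are `conn.setAutoCommit(false)` up front and `conn.commit()` at the end, which is exactly what the thread is discussing.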

Re: Spark SQL Transaction

2016-04-23 Thread Mich Talebzadeh
In your JDBC connection you can do conn.commit() or conn.rollback(). Why don't you insert your data into a #table in MSSQL and from there do one insert/select into the main table? That is standard ETL. In that case your main table will be protected: either it will have full data or no data. Also have
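The staging-table approach suggested above can be sketched as follows. This is an illustration only, using sqlite3 in place of MSSQL (where the scratch table would be a #temp table); the table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_table (col1 INTEGER)")
conn.execute("CREATE TABLE staging (col1 INTEGER)")

# The bulk load lands in staging; a failure here leaves main_table untouched.
conn.executemany("INSERT INTO staging (col1) VALUES (?)",
                 [(i,) for i in range(5)])

# One atomic statement: main_table ends up with all rows or none.
conn.execute("INSERT INTO main_table SELECT col1 FROM staging")
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM main_table").fetchone()[0])  # 5
```

The point of the design is that the expensive, failure-prone load never touches the main table directly; the final insert/select is a single statement and therefore atomic on its own.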

Re: Spark SQL Transaction

2016-04-23 Thread Andrés Ivaldi
Hello, so I ran Profiler and found that implicit isolation was turned on by the JDBC driver; this is the default behavior of the MSSQL JDBC driver, but it's possible to change it with the setAutoCommit method. There is no property for that, so I have to do it in code. Do you know where I can access the
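What enabling autocommit buys here can be shown in miniature. The sketch below uses sqlite3 rather than the MSSQL JDBC driver (JDBC's `conn.setAutoCommit(true)` is approximated by sqlite3's `isolation_level = None` with no explicit BEGIN): each statement commits on its own, so a later rollback has nothing to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # autocommit: each statement is its own transaction
conn.execute("CREATE TABLE t (col1 INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")  # committed immediately

conn.rollback()  # no-op: the insert was already committed

print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 1
```

This is the flip side of the rollback behavior observed earlier in the thread: with autocommit on, a crash mid-job loses at most the statement in flight, not everything written so far.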

Re: Spark SQL Transaction

2016-04-21 Thread Mich Talebzadeh
This statement: "...each database statement is atomic and is itself a transaction... your statements should be atomic and there will be no ‘redo’ or ‘commit’ or ‘rollback’." MSSQL complies with ACID, which requires that each transaction be "all or nothing": if one part of the transaction fails,

Re: Spark SQL Transaction

2016-04-21 Thread Michael Segel
Hi, Sometimes terms get muddled over time. If you’re not using transactions, then each database statement is atomic and is itself a transaction. So unless you have some explicit ‘Begin Work’ at the start…. your statements should be atomic and there will be no ‘redo’ or ‘commit’ or

Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
*To:* Andrés Ivaldi *Cc:* user @spark *Subject:* Re: Spark SQL Transaction > Well Oracle will allow that if the underlying table is in NOLOGGING mode :) > mtale...@mydb12.mich.LOCAL> create table testme(col1 int); > Table created. > mt

RE: Spark SQL Transaction

2016-04-20 Thread Strange, Nick
@spark Subject: Re: Spark SQL Transaction Well Oracle will allow that if the underlying table is in NOLOGGING mode :) mtale...@mydb12.mich.LOCAL> create table testme(col1 int); Table created. mtale...@mydb12.mich.L

Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Well Oracle will allow that if the underlying table is in NOLOGGING mode :) mtale...@mydb12.mich.LOCAL> create table testme(col1 int); Table created. mtale...@mydb12.mich.LOCAL> alter table testme NOLOGGING; Table altered. mtale...@mydb12.mich.LOCAL> insert into testme values(1); 1 row created.

Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
I think the same, and I don't think reducing batch size improves speed, but it will avoid losing all the data on rollback. Thanks for the help. On Wed, Apr 20, 2016 at 4:03 PM, Mich Talebzadeh wrote: > yep. I think it is not possible to make SQL Server do a non

Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
yep. I think it is not possible to make SQL Server do a non-logged transaction. The other alternative is doing the inserts in small batches, if possible. Or write to a CSV-type file and use bulk copy to load the file into MSSQL with frequent commits, say every 50K rows. Dr Mich Talebzadeh LinkedIn *
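The "small batches with frequent commits" idea can be sketched as follows, again with sqlite3 standing in for MSSQL. The batch size and row count are illustrative (the thread suggests ~50K rows per commit; 1000 keeps the example small): a failure would only roll back the batch in flight, not the whole load.

```python
import sqlite3

BATCH_SIZE = 1000
rows = [(i,) for i in range(3500)]  # stand-in for the data being loaded

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (col1 INTEGER)")

for start in range(0, len(rows), BATCH_SIZE):
    batch = rows[start:start + BATCH_SIZE]
    conn.executemany("INSERT INTO target (col1) VALUES (?)", batch)
    conn.commit()  # at most BATCH_SIZE rows are ever uncommitted

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 3500
```

This trades all-or-nothing semantics for bounded loss: after a crash, every fully committed batch survives and only the partial batch is rolled back.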

Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
Yes, I know that behavior, but there is no explicit BEGIN TRANSACTION in my code, so maybe Spark or the driver itself is adding the begin transaction, or an implicit transaction is configured. If Spark isn't adding a BEGIN TRANSACTION on each insertion, then probably it is the database or the driver

Fwd: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
You will see what is happening in SQL Server. First create a test table called testme: 1> use tempdb 2> go 1> create table testme(col1 int) 2> go -- Now explicitly begin a transaction, insert 1 row and select from the table: 1> begin tran 2> insert into testme values(1) 3> select * from testme 4>

Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
Sorry I couldn't answer before. I want to know whether Spark is responsible for adding the BEGIN TRAN. The point is to prioritize insertion speed over possible data loss: disabling transactions will speed up the insertion, and we don't care about consistency. I'll disable the implicit_transaction and see what happens.

Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Assuming that you are using JDBC to put data into any ACID-compliant database (MSSQL, Sybase, Oracle, etc.), you are implicitly or explicitly adding BEGIN TRAN to the INSERT statement in a distributed transaction. MSSQL does not know or care where the data is coming from. If your connection completes

Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Are you using JDBC to push data to MSSQL? On 19 April

Re: Spark SQL Transaction

2016-04-19 Thread Andrés Ivaldi
I mean a local transaction. We ran a job that writes into SQLServer, then we killed the Spark JVM just for testing purposes, and we realized that SQLServer did a rollback. Regards. On Tue, Apr 19, 2016 at 5:27 PM, Mich Talebzadeh wrote: > Hi, > > What do you mean by
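The behavior observed here (kill the writer mid-transaction, the database rolls back) can be reproduced in miniature. This sketch uses sqlite3 on a temporary file in place of SQL Server; closing the connection without committing plays the role of killing the Spark JVM mid-write.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (col1 INTEGER)")
conn.commit()                              # the table definition survives

conn.execute("INSERT INTO t VALUES (1)")   # pending, inside an open transaction
conn.close()                               # "killed" before COMMIT -> rollback

conn = sqlite3.connect(path)
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 0
```

The uncommitted insert disappears, which is the same local-transaction rollback the thread describes for SQL Server when the Spark JVM dies before the partition's commit.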

Re: Spark SQL Transaction

2016-04-19 Thread Mich Talebzadeh
Hi, What do you mean by *without transaction*? Do you mean forcing SQL Server to accept a non-logged operation?

Spark SQL Transaction

2016-04-19 Thread Andrés Ivaldi
Hello, is it possible to execute a SQL write without a transaction? We don't need transactions to save our data, and they add overhead to the SQLServer. Regards. -- Ing. Ivaldi Andres