Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Mich Talebzadeh
Hi, Just to clarify: I use Hive with the Spark engine (as default), so Hive on Spark, as we discussed and observed. Now, with regard to Spark (as an application, NOT as the execution engine) doing the create table in Hive and populating it, I don't think Spark itself does any transactional enforcement. This means

Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Mich Talebzadeh
OK, this seems to work: 1. Create the target table first 2. Populate it afterwards. I first created the target table with hive> create table test.dummy as select * from oraclehadoop.dummy where 1 = 2; Then I did the INSERT/SELECT and tried to drop the target table while the DML (INSERT/SELECT) was
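
A minimal sketch of the sequence described above, assuming the same test.dummy and oraclehadoop.dummy tables from the thread; the drop is issued from a second Hive session while the INSERT/SELECT is still running:

-- session 1: create the empty target, then populate it
hive> create table test.dummy as select * from oraclehadoop.dummy where 1 = 2;
hive> insert into table test.dummy select * from oraclehadoop.dummy;

-- session 2, while the insert above is still running
hive> drop table test.dummy;   -- expected to wait on the lock held by the insert
hive> show locks;              -- should list the locks held on test.dummy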

Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Michael Segel
> On Jun 8, 2016, at 3:35 PM, Eugene Koifman wrote: > > if you split “create table test.dummy as select * from oraclehadoop.dummy;” > into create table statement, followed by insert into test.dummy as select… > you should see the behavior you expect with Hive. > Drop
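
The suggestion above, sketched out in HiveQL; the CREATE TABLE ... LIKE form is just one way to create the empty target (the thread elsewhere uses a CTAS with where 1 = 2):

-- single-statement form
hive> create table test.dummy as select * from oraclehadoop.dummy;

-- split form: the DDL first, then the DML
hive> create table test.dummy like oraclehadoop.dummy;
hive> insert into table test.dummy select * from oraclehadoop.dummy;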

Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Mich Talebzadeh
Hive version is 2. We can discuss all sorts of scenarios; however, Hive is pretty good at applying locks at both the table and the partition level. The idea of having the metadata is to enforce these rules. [image: Inline images 1] For example, above, inserting from the source to the target table
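
A few ways to inspect the table- and partition-level locks mentioned above (the partitioned table name and partition spec below are hypothetical, only to illustrate the partition-level case):

hive> show locks;
hive> show locks test.dummy extended;
hive> show locks some_partitioned_table partition (ingest_date='2016-06-08');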

Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Michael Segel
Doh! It would help if I used the email address to send to the list… Hi, Let's take a step back… Which version of Hive? Hive recently added transaction support, so you have to know your isolation level. Also, are you running Spark as your execution engine, or are you talking about a Spark
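
For reference, a sketch of the settings that enable Hive's transactional lock manager (normally configured in hive-site.xml rather than per session; the values are the commonly documented ones, not details taken from this thread):

hive> set hive.support.concurrency=true;
hive> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;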

Re: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Mich Talebzadeh
Hi, The idea of accessing the Hive metadata is to be aware of concurrency. In general, if I do the following in Hive hive> create table test.dummy as select * from oraclehadoop.dummy; we can see that Hive applies the locks [image: Inline images 2] However, there seems to be an
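
A sketch of how the observation above can be reproduced, using a second Hive session to look at the locks while the CTAS is still running:

-- session 1
hive> create table test.dummy as select * from oraclehadoop.dummy;

-- session 2, while the statement above is running
hive> show locks;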

RE: Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread David Newberger
Could you be looking at 2 jobs trying to use the same file and one getting to it before the other and finally removing it? David Newberger From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Wednesday, June 8, 2016 1:33 PM To: user; user @spark Subject: Creating a Hive table through

Creating a Hive table through Spark and potential locking issue (a bug)

2016-06-08 Thread Mich Talebzadeh
Hi, I noticed an issue with Spark creating and populating a Hive table. The process as I see it is as follows: 1. Spark creates the Hive table, in this case an ORC table in a Hive database 2. Spark uses a JDBC connection to get data out of Oracle 3. I create a temp table in Spark
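
A rough sketch of the three steps above in Spark 1.6-era Scala; the JDBC URL, credentials, column list and table names below are placeholders, not details taken from the thread:

import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)  // sc is the spark-shell SparkContext

// 1. create the ORC target table in the Hive database
hiveContext.sql(
  "CREATE TABLE IF NOT EXISTS test.dummy (id INT, clustered INT, scattered INT) STORED AS ORC")

// 2. pull the source rows out of Oracle over JDBC
//    (assumes the Oracle JDBC driver is on the classpath)
val oracleDF = hiveContext.read.format("jdbc")
  .option("url", "jdbc:oracle:thin:@//oracle-host:1521/mydb")
  .option("dbtable", "scratchpad.dummy")
  .option("user", "scratchpad")
  .option("password", "xxxx")
  .load()

// 3. register a temp table in Spark and populate the Hive table from it
oracleDF.registerTempTable("tmp")
hiveContext.sql("INSERT INTO TABLE test.dummy SELECT * FROM tmp")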