Hi,
Just to clarify: I use Hive with the Spark engine (the default here), i.e. Hive on Spark as the execution engine, as we discussed and observed.
Now with regard to Spark (as an app, NOT as the execution engine) doing the
create table in Hive and populating it, I don't think Spark itself does any
transactional enforcement.
That said, the following seems to work:
1. Create the target table first
2. Populate afterwards
I first created the target table with
hive> create table test.dummy as select * from oraclehadoop.dummy where 1 = 2;
Then I ran the INSERT/SELECT and tried to drop the target table while the DML
(INSERT/SELECT) was
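A minimal HiveQL sketch of that split (table and database names are the ones from this thread; running the statements in separate sessions to observe the locks is my assumption about the test setup):

```sql
-- Step 1: create an empty target table with the source's schema.
-- The WHERE 1 = 2 predicate returns no rows, so only the DDL takes effect.
create table test.dummy as select * from oraclehadoop.dummy where 1 = 2;

-- Step 2: populate it with a separate DML statement.
insert into test.dummy select * from oraclehadoop.dummy;

-- While step 2 is running, a concurrent session can inspect the locks:
show locks test.dummy;
```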
> On Jun 8, 2016, at 3:35 PM, Eugene Koifman wrote:
>
> if you split “create table test.dummy as select * from oraclehadoop.dummy;”
> into create table statement, followed by insert into test.dummy as select…
> you should see the behavior you expect with Hive.
> Drop
Hive version is 2
We can discuss all sorts of scenarios. However, Hive is pretty good at
applying the locks at both the table and the partition level. The idea of
having the metadata is to enforce these rules.
[image: Inline images 1]
The example above shows inserting from the source table into the target table.
Doh! It would help if I used the email address to send to the list…
Hi,
Let's take a step back…
Which version of Hive?
Hive recently added transaction support, so you have to know your isolation
level.
Also, are you running Spark as your execution engine, or are you talking about a
Spark
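For reference, these are the settings Hive ACID transactions depend on. This is a configuration fragment only, my assumption of a typical Hive 2.x setup; exact values depend on your deployment:

```sql
-- Concurrency/locking must be on for transactional behaviour
SET hive.support.concurrency=true;
-- The DbTxnManager (rather than the legacy lock manager) drives ACID semantics
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- Note: fully transactional tables must also be stored as ORC and created
-- with TBLPROPERTIES ('transactional'='true')
```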
Hi,
The idea of accessing the Hive metadata is to be aware of concurrency.
In general, if I do the following in Hive:
hive> create table test.dummy as select * from oraclehadoop.dummy;
we can see that Hive applies the locks:
[image: Inline images 2]
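The locks captured in the screenshot can also be queried directly from another session. A sketch, using the table names from this thread; the exact lock types reported depend on the lock manager in use:

```sql
-- Run from a second session while the CTAS above is still executing
show locks oraclehadoop.dummy;  -- the source table is typically read-locked
show locks test.dummy;          -- the target table is typically write-locked
```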
However, there seems to be an
Could you be looking at 2 jobs trying to use the same file and one getting to
it before the other and finally removing it?
David Newberger
From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Wednesday, June 8, 2016 1:33 PM
To: user; user @spark
Subject: Creating a Hive table through
Hi,
I noticed an issue with Spark creating and populating a Hive table.
The process as I see it is as follows:
1. Spark creates the Hive table. In this case an ORC table in a Hive
Database
2. Spark uses a JDBC connection to get data out from an Oracle
3. I create a temp table in Spark