hive on spark spark.executor.instances has no effect

2015-06-12 Thread ????
Following
http://blog.cloudera.com/blog/2015/02/download-the-hive-on-spark-beta/
I set spark.executor.instances = 12 (I have four nodes), but when I
execute Hive SQL there are always only 3 Spark processes: 1 driver and
2 executors.



Is this a bug?


jack

Re: nested join issue

2015-06-12 Thread Gopal Vijayaraghavan
Hi

 Thanks for investigating..  Trying to locate the patch that fixes this
between 1.1 and 2.0.0-SNAPSHOT. Any leads on what Jira this fix was part
of? Or what part of the code the patch is likely to be on?

git bisect is the only way usually to identify these things.

But before you hunt into the patches I suggest trying combinations of
constant propagation, null-scan and identity projection remover
optimizations to see if there's a workaround in there.
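
One way to work through those combinations systematically is to generate every on/off setting of the three optimizations and run the failing query under each. The sketch below only prints the `set` statements; the three property names are written as I recall them from HiveConf and should be treated as assumptions to verify against your Hive version.

```python
from itertools import product

# Assumed HiveConf property names for the three optimizations mentioned above;
# check them against your Hive version before relying on them.
OPTIMIZER_FLAGS = [
    "hive.optimize.constant.propagation",
    "hive.optimize.null.scan",
    "hive.optimize.remove.identity.project",
]


def flag_combinations(flags=OPTIMIZER_FLAGS):
    """Return every on/off combination as a list of `set` statements."""
    combos = []
    for values in product(("true", "false"), repeat=len(flags)):
        combos.append(
            [f"set {name}={value};" for name, value in zip(flags, values)]
        )
    return combos


if __name__ == "__main__":
    for i, statements in enumerate(flag_combinations(), start=1):
        print(f"-- combination {i}")
        print("\n".join(statements))
```

Pasting each printed group into a session ahead of the query shows which setting, if any, changes the result.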

An explain of the query added to a new JIRA would be good, to continue the
analysis.

Cheers,
Gopal




Re: delta file compact take no effect

2015-06-12 Thread Eugene Koifman
Delta files that are no longer needed are deleted asynchronously.
For example, you may have a query that is using delta_002_002. A minor
compaction can run concurrently and create delta_001_003, but it will
leave delta_001_001, delta_002_002 and delta_003_003 in place to be
cleaned up later. A query that starts after this will use delta_001_003
and ignore delta_001_001, delta_002_002 and delta_003_003, so it has
fewer files to read and merge. Those three directories will be deleted
once the system determines that no query can be using them.
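
To make the selection rule concrete, here is a simplified model of it, not Hive's actual implementation. It assumes directory names of the form delta_<min>_<max> as in the listing, and the helper names are made up for illustration. A delta is skipped when a wider delta covers its whole transaction range.

```python
import re

DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def parse_delta(name):
    """Split 'delta_<min>_<max>' into (min_txn, max_txn) integers."""
    match = DELTA_RE.match(name)
    if match is None:
        raise ValueError(f"not a delta directory: {name}")
    return int(match.group(1)), int(match.group(2))


def select_deltas(names):
    """Keep only deltas whose range is not covered by a wider delta."""
    ranges = {name: parse_delta(name) for name in names}
    kept = []
    for name, (lo, hi) in ranges.items():
        covered = any(
            (o_lo, o_hi) != (lo, hi) and o_lo <= lo and hi <= o_hi
            for other, (o_lo, o_hi) in ranges.items()
            if other != name
        )
        if not covered:
            kept.append(name)
    return sorted(kept, key=parse_delta)


if __name__ == "__main__":
    listing = [
        "delta_001_001",
        "delta_002_002",
        "delta_003_003",
        "delta_001_003",  # produced by the minor compaction
        "delta_004_004",
    ]
    print(select_deltas(listing))
```

The directories that are not selected correspond to the ones waiting to be cleaned up.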

Judging by the directory listing you sent, no major or minor compactions
have run.


From: r7raul1...@163.com
Reply-To: user@hive.apache.org
Date: Thursday, June 11, 2015 at 12:53 AM
To: user@hive.apache.org
Subject: Re: Re: delta file compact take no effect

When I run SHOW COMPACTIONS; I see:

Database  Table       Partition  Type   State      Worker  Start Time
default   u_data_txn  NULL       MAJOR  initiated  NULL    0
Time taken: 0.024 seconds, Fetched: 2 row(s)

But after that I still see many delta files.


r7raul1...@163.com

From: Elliot West tea...@gmail.com
Date: 2015-06-11 15:25
To: user@hive.apache.org
Subject: Re: delta file compact take no effect
What do you see if you issue:

SHOW COMPACTIONS;

On Thursday, 11 June 2015, r7raul1...@163.com wrote:

I use Hive 1.1.0 on Hadoop 2.5.0.
After I ran some update operations on table u_data_txn,
the table has many delta files, like:
drwxr-xr-x - hdfs hive 0 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_001_001
-rw-r--r-- 3 hdfs supergroup 346453 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_001_001/bucket_0
-rw-r--r-- 3 hdfs supergroup 415924 2015-02-06 22:52 /user/hive/warehouse/u_data_txn/delta_001_001/bucket_1
drwxr-xr-x - hdfs hive 0 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_002_002
-rw-r--r-- 3 hdfs supergroup 807 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_002_002/bucket_0
-rw-r--r-- 3 hdfs supergroup 779 2015-02-06 22:58 /user/hive/warehouse/u_data_txn/delta_002_002/bucket_1
drwxr-xr-x - hdfs hive 0 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_003_003
-rw-r--r-- 3 hdfs supergroup 817 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_003_003/bucket_0
-rw-r--r-- 3 hdfs supergroup 767 2015-02-06 22:59 /user/hive/warehouse/u_data_txn/delta_003_003/bucket_1
drwxr-xr-x - hdfs hive 0 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_004_004
-rw-r--r-- 3 hdfs supergroup 817 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_004_004/bucket_0
-rw-r--r-- 3 hdfs supergroup 779 2015-02-06 23:01 /user/hive/warehouse/u_data_txn/delta_004_004/bucket_1
drwxr-xr-x - hdfs hive 0 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_005_005
-rw-r--r-- 3 hdfs supergroup 817 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_005_005/bucket_0
-rw-r--r-- 3 hdfs supergroup 779 2015-02-06 23:03 /user/hive/warehouse/u_data_txn/delta_005_005/bucket_1
drwxr-xr-x - hdfs hive 0 2015-02-10 21:34 /user/hive/warehouse/u_data_txn/delta_006_006
-rw-r--r-- 3 hdfs supergroup 821 2015-02-10 21:34 /user/hive/warehouse/u_data_txn/delta_006_006/bucket_0
drwxr-xr-x - hdfs hive 0 2015-02-10 21:35 /user/hive/warehouse/u_data_txn/delta_007_007
-rw-r--r-- 3 hdfs supergroup 821 2015-02-10 21:35 /user/hive/warehouse/u_data_txn/delta_007_007/bucket_0
drwxr-xr-x - hdfs hive 0 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_008_008
-rw-r--r-- 3 hdfs supergroup 1670 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_008_008/bucket_0
-rw-r--r-- 3 hdfs supergroup 1767 2015-03-24 01:16 /user/hive/warehouse/u_data_txn/delta_008_008/bucket_1

I tried ALTER TABLE u_data_txn COMPACT 'MAJOR';
The delta files still exist.
Then I tried ALTER TABLE u_data_txn COMPACT 'MINOR';
The delta files still exist.
How do I merge the delta files?

My config is:
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exe.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>4</value>
</property>
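
A configuration like this can be sanity-checked with a small script. The following is a minimal sketch, assuming the properties sit inside a <configuration> root element as in hive-site.xml; the expected values are simply the ones from the listing, and the function names are hypothetical. Note that the listing spells one key hive.exe.dynamic.partition.mode, while the documented name is hive.exec.dynamic.partition.mode, so the check reports it as missing. Also keep in mind that the two compactor settings need to be in effect on the metastore service that runs the compactor.

```python
import xml.etree.ElementTree as ET

# Expected values copied from the configuration in this thread; this is an
# illustrative checklist, not an authoritative one.
EXPECTED = {
    "hive.support.concurrency": "true",
    "hive.enforce.bucketing": "true",
    "hive.exec.dynamic.partition.mode": "nonstrict",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.compactor.initiator.on": "true",
}

# The configuration as posted, wrapped in an assumed <configuration> root.
SAMPLE = """<configuration>
  <property><name>hive.support.concurrency</name><value>true</value></property>
  <property><name>hive.enforce.bucketing</name><value>true</value></property>
  <property><name>hive.exe.dynamic.partition.mode</name><value>nonstrict</value></property>
  <property><name>hive.txn.manager</name><value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value></property>
  <property><name>hive.compactor.initiator.on</name><value>true</value></property>
  <property><name>hive.compactor.worker.threads</name><value>4</value></property>
</configuration>"""


def read_properties(xml_text):
    """Parse <property><name>/<value> pairs into a dict."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_txn_config(xml_text):
    """Return a list of human-readable problems; empty means all checks pass."""
    props = read_properties(xml_text)
    problems = []
    for name, expected in EXPECTED.items():
        if name not in props:
            problems.append(f"missing: {name}")
        elif props[name] != expected:
            problems.append(f"{name} = {props[name]!r}, expected {expected!r}")
    try:
        workers = int(props.get("hive.compactor.worker.threads", "0"))
    except ValueError:
        workers = 0
    if workers < 1:
        problems.append("hive.compactor.worker.threads must be at least 1")
    return problems


if __name__ == "__main__":
    for problem in check_txn_config(SAMPLE):
        print(problem)
```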

Hive transaction feature in Hive 1.0

2015-06-12 Thread Jim Green
Hi Team,

Sharing an article that explains the Hive transaction features in Hive
1.0:
Hive transaction feature in Hive 1.0
http://www.openkb.info/2015/06/hive-transaction-feature-in-hive-10.html


-- 
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)


Equal predicate on timestamp column

2015-06-12 Thread Jie Zhang
Hi,

I have a table partitioned by hour; the partitioning column ds is of
timestamp type. However, I cannot select a single partition with an
equality predicate on ds; only range predicates work. Here are the DDL and
queries:

create table test (c1 int, c2 string) partitioned by (ds timestamp) stored as orc;

// this query with range predicates returns records from the 15:00:00 hour partition

select ds from test where ds between '2015-06-11 15:00:00' and '2015-06-11 15:00:01';

+------------------------+
|           ds           |
+------------------------+
| 2015-06-11 15:00:00.0  |
| 2015-06-11 15:00:00.0  |
+------------------------+

// none of the following queries with an equality predicate returns any records

select ds from test where ds = '2015-06-11 15:00:00';

select ds from test where ds = '2015-06-11 15:00:00.0';

select ds from test where ds = '2015-06-11 15:00:00.0';

+-----+
| ds  |
+-----+

Does anyone know why an equality predicate on the timestamp column cannot
find the match as expected, while a range predicate works fine? Thanks very
much for the help!

Jessica


Re: nested join issue

2015-06-12 Thread Gautam
Done. https://issues.apache.org/jira/browse/HIVE-10996

On Fri, Jun 12, 2015 at 1:47 PM, Gopal Vijayaraghavan gop...@apache.org
wrote:

 Hi

  Thanks for investigating..  Trying to locate the patch that fixes this
 between 1.1 and 2.0.0-SNAPSHOT. Any leads on what Jira this fix was part
 of? Or what part of the code the patch is likely to be on?

 git bisect is the only way usually to identify these things.

 But before you hunt into the patches I suggest trying combinations of
 constant propagation, null-scan and identity projection remover
 optimizations to see if there's a workaround in there.

 An explain of the query added to a new JIRA would be good, to continue the
 analysis.

 Cheers,
 Gopal





-- 
If you really want something in this life, you have to work for it. Now,
quiet! They're about to announce the lottery numbers...


Re: HBase and Hive integration

2015-06-12 Thread Buntu Dev
Thanks Nick for the write up. It was quite helpful for a newbie like me.

Is there a Hive config setting for the ZooKeeper quorum of the HBase
cluster? I have Hive and HBase on separate clusters.

Thanks!

On Tue, Jun 9, 2015 at 12:03 AM, Nick Dimiduk ndimi...@gmail.com wrote:

 Hi there.

 I go through a complete example in this pair of blog posts [0], [1].
 Basically, create the table with the storage handler, without EXTERNAL, and
 its lifecycle will be managed by Hive.

 [0]: http://www.n10k.com/blog/hbase-via-hive-pt1/
 [1]: http://www.n10k.com/blog/hbase-via-hive-pt2/

 On Fri, Jun 5, 2015 at 10:56 AM, Sean Busbey bus...@cloudera.com wrote:

 +user@hive
 -user@hbase to bcc

 Hi!

 This question is better handled by the hive user list, so I've copied
 them in and moved the hbase user list to bcc.

 On Fri, Jun 5, 2015 at 12:54 PM, Buntu Dev buntu...@gmail.com wrote:

 Hi -

 Newbie question: I have Hive and HBase on different clusters. Assuming all
 the appropriate ports are open to connect Hive to HBase, how do I create a
 Hive-managed HBase table?

 Thanks!




 --
 Sean