Ning is currently out on vacation; I think he'll be back to working on this
when he returns.
JVS
From: Viraj Bhat [vi...@yahoo-inc.com]
Sent: Thursday, July 01, 2010 11:40 PM
To: hive-user@hadoop.apache.org
Subject: RE: merging the size of the reduce output
Okay, I read that this is a work in progress
(https://issues.apache.org/jira/browse/HIVE-1307) to deal with small files
when doing dynamic partitioning.
There was a suggestion to set:
hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
for Hadoop 0.20 when running queries on this p
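If it helps, the suggestion above can be applied per session. A minimal sketch; the merge settings and their values below are illustrative additions, not from the thread:

```sql
-- Pack many small input files into fewer splits
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- Optionally also merge the small output files of map-only jobs (illustrative values)
SET hive.merge.mapfiles=true;
SET hive.merge.size.per.task=256000000;
```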
Hi Yongqiang,
I am facing a similar situation. I am using the latest trunk of Hive, with
dynamic partitioning in a map-only job that converts files from compressed
TXT (gz) to RC format.
The DDL of the task looks similar to:
FROM gztable
INSERT OVERWRITE TABLE rctab
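The original DDL is cut off in the archive. For context, a hedged sketch of what such a dynamic-partition insert typically looks like; the column and partition names here are hypothetical, not from the thread:

```sql
-- Hypothetical schema: gztable is the gzipped text source, rctab is STORED AS RCFILE
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

FROM gztable
INSERT OVERWRITE TABLE rctab PARTITION (dt)  -- dt is a hypothetical partition column
SELECT col1, col2, dt;
```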
Paul:
thanks.
Currently I do not need this feature from Hive QL; I just need it in the
metastore. You said "There exists structures for supporting this in the
metastore"; could you please give more details? I suppose the
interface to the metastore is basically classes like Table and Partition, but
in the Par
There exist structures for supporting this in the metastore, but that feature
isn't in Hive yet. For example, although the metadata for a partition includes
its own set of columns, parts of the code in the query processor still read
from table-level metadata.
Some evolution can occur in the form
I read in the VLDB Hive paper "Hive - A Warehousing Solution Over a
Map-Reduce Framework"
that partitions could have different schemas (section 3.1, MetaStore): "
Partition - Each partition can have its own columns
and SerDe and storage information. This can be used
in the future to support schema evolution"
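As a small illustration of that partition-level storage information, per-partition file formats can already be expressed in HiveQL. A sketch, assuming ALTER TABLE ... SET FILEFORMAT is available in your Hive build; all names are hypothetical:

```sql
-- Table default is TEXTFILE
CREATE TABLE events (c1 STRING) PARTITIONED BY (dt STRING) STORED AS TEXTFILE;
ALTER TABLE events ADD PARTITION (dt='2010-07-01');
-- A single partition's storage descriptor can differ from the table default:
ALTER TABLE events PARTITION (dt='2010-07-01') SET FILEFORMAT RCFILE;
```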
Thanks John,
Can you provide me with some pointers? My team can try to work on it.
Our workaround right now is to call the Thrift API from within Hive using a
UDF.
Thanks,
-ray
On Thu, Jul 1, 2010 at 1:19 PM, John Sichi wrote:
> On Jul 1, 2010, at 10:36 AM, Ray Duong wrote:
>
> > Is there
Take a look at [Combine]HiveInputFormat; they are what we wrap around your
input formats in order to allow Hive to access data from multiple input formats
in the same job.
JVS
On Jul 1, 2010, at 10:16 AM, yan qi wrote:
Hi, Namit,
Thanks a lot for your reply!
I checked the source code. G
On Jul 1, 2010, at 10:36 AM, Ray Duong wrote:
> Is there a way to do an HBase key lookup using the Hive-HBase integration
> without doing a full scan?
>
> Since I'm specifying the key='foo' in the where condition, shouldn't it be a
> fast lookup?
Hi Ray,
Pushing down filters to HBase is one of
Is there a way to do an HBase key lookup using the Hive-HBase integration
without doing a full scan?
Since I'm specifying the key='foo' in the where condition, shouldn't it be a
fast lookup?
thanks,
-ray
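For context, a minimal sketch of the kind of HBase-backed table and lookup being discussed; the table and column names are hypothetical, since the actual DDL is not in the thread:

```sql
-- Hypothetical HBase-backed Hive table; ':key' maps the HBase row key
CREATE EXTERNAL TABLE hbase_t (key STRING, val STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val")
TBLPROPERTIES ("hbase.table.name" = "t");

-- Without key-predicate pushdown, this runs as a full table scan
SELECT * FROM hbase_t WHERE key = 'foo';
```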
Hi, Namit,
Thanks a lot for your reply!
I checked the source code. Given a query (select tmp7.* from tmp7 join
tmp2 on (tmp7.c2 = tmp2.c1)), only one MapReduce job is generated. As
far as I know, the function setInputFormat would be used to set the job's
InputFormat class, in the ExecDriver
That's fine.
The two tables can have different input formats.
Sent from my iPhone
On Jul 1, 2010, at 9:51 AM, "yan qi" wrote:
> Hi,
>
> I have a question about the JOIN operation in Hive.
>
> For example, I have a query, like
>
>select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
>
>
Hi,
I have a question about the JOIN operation in Hive.
For example, I have a query, like
select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
Clearly, there is a JOIN involved in the statement.
1. tmp2 and tmp7 are two tables.
2. c2 and c1 are columns belonging to tmp7 and
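The reply earlier in the thread notes that the two sides of a join may use different input formats. A hedged sketch of what that can look like; the storage clauses and column types are illustrative assumptions, since the thread does not show the tables' DDL:

```sql
-- Two join inputs with different storage (and hence input) formats
CREATE TABLE tmp7 (c1 STRING, c2 STRING) STORED AS TEXTFILE;
CREATE TABLE tmp2 (c1 STRING) STORED AS RCFILE;

-- Hive wraps both in [Combine]HiveInputFormat, so one job can read both
SELECT tmp7.* FROM tmp7 JOIN tmp2 ON (tmp7.c2 = tmp2.c1);
```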