Hi,
If you have a lot of small files, by default Hive will group several of them
into a single mapper.
Check this property:
hive.input.format: the default (used when the property is empty) is
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat. If you set it to
org.apache.hadoop.hive.ql.io.HiveInputFormat, you'll get at least one mapper
per file.
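For example, to get one mapper per small file you could try this in your Hive
session (just a sketch; the split-size value below is only an illustration):

    SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

Or keep the combining behaviour but bound how much data each combined mapper
receives:

    SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
    SET mapred.max.split.size=256000000;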
Hi,
I have a Hadoop & HBase cluster that runs Hadoop 1.1.2 and HBase 0.94.7.
I've noticed an issue that stops the cluster from running normally.
My use case: I have several MR jobs that read data from one HBase table in
the map phase and write data to 3 different tables during the reduce phase. I
create table handle
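In case it helps others reading the thread, a rough sketch of that kind of
driver against the 0.94-era API (MyMapper, MyReducer, and all table names here
are placeholders, not the poster's actual code):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "read-one-table-write-three");
    // Map phase: scan the single source table.
    TableMapReduceUtil.initTableMapperJob(
        "source_table", new Scan(),
        MyMapper.class, ImmutableBytesWritable.class, Put.class, job);
    // Reduce phase: MultiTableOutputFormat routes each Put to the table
    // named by the output key.
    job.setReducerClass(MyReducer.class);
    job.setOutputFormatClass(MultiTableOutputFormat.class);
    job.waitForCompletion(true);

    // Inside MyReducer, the key selects the destination table:
    //   context.write(new ImmutableBytesWritable(Bytes.toBytes("table_a")), put);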
You need to use globs when passing your input path, like below perhaps:
data/shard*/d1*
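The same glob works from a Java driver, since FileInputFormat expands globs
against the file system ('job' here is a hypothetical, already-configured Job):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    // Every d1* file under every shard*/ directory becomes part of the input.
    FileInputFormat.setInputPaths(job, new Path("data/shard*/d1*"));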
On Thu, Oct 3, 2013 at 1:28 AM, jamal sasha wrote:
> Hi,
> I have data in this one folder like following:
>
> data---shard1---d1_1
>     |       |_d2_1
>     |_shard2---d1_1
>
On Fri, Sep 27, 2013 at 2:42 AM, Jitendra Yadav wrote:
> Hi All,
>
> For a few years now I've been working as a Hadoop admin on the Linux
> platform, though the majority of our servers run Solaris (Sun SPARC
> hardware). Many times I have seen that Hadoop is compatible with Linux. Is
> that right? If yes, then what
Thanks Chris.
Hope someone answers or gives a pointer to get a clear idea about question 4.
Regards,
Krishna
On Wed, Oct 2, 2013 at 1:41 PM, Chris Mawata wrote:
> Don't know about question 4 but for the first three -- the metadata is
> in the memory of the namenode at runtime but is also persisted to
Hi, I have a question about how mappers are generated for input files from
HDFS. I understand the split and block concepts in HDFS, but my original
understanding is that one mapper will only process data from one file in
HDFS, no matter how small that file is. Is that correct?
T
Hi,
I have data in this one folder like following:
data---shard1---d1_1
    |       |_d2_1
    |_shard2---d1_1
    |       |_d2_2
    |_shard3---d1_1
    |       |_d2_3
    |_shard4---d1_1
            |_d2_4
Now, I want to
Karim!
Hadoop 3.0 corresponds to trunk currently. I would recommend you use
branch-2; it's fairly stable. hadoop-1.x is rather old and is in maintenance
mode now. You can get all the branches from
https://wiki.apache.org/hadoop/GitAndHadoop
git clone git://git.apache.org/hadoop-common.git
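and then check out the branch you want, e.g.:

    cd hadoop-common
    git checkout branch-2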
Pl
One more thing, Krishna: when using JournalNodes, as opposed to the
native file system, for the metadata storage, you do get replication.
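For reference, the setting that points the namenodes at the JournalNodes
looks roughly like this in hdfs-site.xml (hostnames and the nameservice name
are made up):

    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
    </property>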
Chris
On 10/2/2013 12:52 AM, Krishna Kumaar Natarajan wrote:
Hi All,
While trying to understand federated HDFS in detail I had a few doubts
and am listing them d
Don't know about question 4, but for the first three -- the metadata is
in the memory of the namenode at runtime but is also persisted to disk
(otherwise it would be lost if you shut down and restart the namenode).
The copy persisted to disk is on the native file system (not HDFS) and
no, it is not replicated the way HDFS blocks are.
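Concretely, the on-disk copy (the fsimage and edits files) lives wherever
dfs.namenode.name.dir points in hdfs-site.xml; the paths below are made up.
Listing more than one local directory makes the namenode write to each of
them, which is the usual way to get some redundancy without HDFS-style
replication:

    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/data/1/dfs/nn,/data/2/dfs/nn</value>
    </property>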
Since hadoop 3.0 is 2 major versions higher, it will be significantly
different from working with hadoop 1.1.2. The hadoop-1.1 branch is
available on SVN at
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1/
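e.g.:

    svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1/ hadoop-1.1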
On Tue, Oct 1, 2013 at 11:30 PM, Karim Awara wrote:
> Hi all,
>
> My p