Sir,
I want to retrieve the names of the fields (columns) that contain a particular value.
How can this be done on a single table, and across multiple tables?
G Sudha
--- On Mon, 10/1/12, Harsh J ha...@cloudera.com wrote:
From: Harsh J ha...@cloudera.com
Subject: Re: doubts reg Hive
To: common-user@hadoop.apache.org
Date:
I am out of the office until 10/08/2012. I will reply after the holiday.
Note: This is an automated response to your message "java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING"
sent on 01/10/2012 21:32:09.
This is the only notification you will receive while this person is away.
Hi all
How do you add a small file to the distributed cache in an MR program?
Regards
Abhi
Sent from my iPhone
Hi Abhishek
You can find a simple example of using the DistributedCache here:
http://kickstarthadoop.blogspot.co.uk/2011/05/hadoop-for-dependent-data-splits-using.html
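For reference, a minimal sketch of the classic DistributedCache usage that the post above walks through (Hadoop 1.x-era API; the HDFS path /user/abhi/lookup.txt and the file name are just placeholders):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheExample {

  public static class CacheMapper
      extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context) throws IOException {
      // By setup() time the framework has copied the cached file to the
      // task's local disk; look it up and read it like any local file.
      Path[] cached =
          DistributedCache.getLocalCacheFiles(context.getConfiguration());
      BufferedReader reader =
          new BufferedReader(new FileReader(cached[0].toString()));
      // ... load your lookup data here ...
      reader.close();
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Register the small HDFS file with the cache before job submission.
    DistributedCache.addCacheFile(new URI("/user/abhi/lookup.txt"), conf);
    Job job = new Job(conf, "distributed-cache-example");
    job.setJarByClass(CacheExample.class);
    job.setMapperClass(CacheMapper.class);
    // ... input/output paths and formats elided ...
    job.waitForCompletion(true);
  }
}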
--Original Message--
From: Abhishek
To: common-user@hadoop.apache.org
ReplyTo: common-user@hadoop.apache.org
Subject: Add file
Hi Oleg,
Cloudera and Dell set up the following cluster for my company.
The company receives 1.5 TB of raw data per day.
38 data nodes + 2 Name Nodes
Data Node:
Dell PowerEdge C2100 series
2 x XEON x5670
48 GB RAM ECC (12x4GB 1333MHz)
12 x 2 TB 7200 RPM SATA HDD (with hot swap) JBOD
Intel Gigabit
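A rough capacity check from those figures: 38 data nodes × 12 drives × 2 TB = 912 TB raw; assuming HDFS's default 3x replication, that is roughly 304 TB usable, or about 200 days of headroom at the stated 1.5 TB/day intake, before compression.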
Hi Colin, thanks for your reply.
What does it mean that the patch will work on files that are in the process of
being written?
Thanks,
LiuLei
2012/10/1 Colin McCabe cmcc...@alumni.cmu.edu
I'm going to post a patch to HDFS-347 shortly. From the user's point
of view, the important thing about the
Hi all,
I have understood Hadoop and the Hadoop ecosystem (Pig as ETL, Hive as a data
warehouse, Sqoop as an import tool). I have worked and learned on a single-node
cluster with demo data.
As Hadoop suits a Unix platform best, please help me understand the
requirements from start to finish to use
Nothing personal, but I might be the only one to answer, and I will provide
only one link:
http://www.catb.org/~esr/faqs/smart-questions.html
Regards
Bertrand
On Mon, Oct 1, 2012 at 4:19 PM, yogesh dhari yogeshdh...@live.com wrote:
Hi all,
I have understood the Hadoop and Hadoop
Try using Cloudera CDH4.
http://www.cloudera.com/products-services/enterprise/
It's an easy way in: a web-fronted manager for the Hadoop ecosystem.
Regards,
Rafael Pecin
From: yogesh dhari [mailto:yogeshdh...@live.com]
Sent: Monday, October 1, 2012 11:19
To: hadoop helpforoum
Subject:
Hi,
I am not sure where to post this problem, but I think it is more
related to Hadoop than to Pig.
By successfully executing a Pig script I created a new file in my HDFS.
Sadly though, I cannot use it for further processing except for
dumping and viewing the data: every
Vinod's right, but it's waitForCompletion(true|false) that you ought
to use if you want isSuccessful() checks afterwards, because with submit(),
which is a non-blocking way of submitting, you'll end up checking it
immediately and always getting false, since the state will still be
mostly RUNNING at that
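A minimal sketch of both styles (job configuration elided; the 5-second polling interval is an arbitrary choice):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitStyles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Blocking style: waitForCompletion() returns only after the job
    // finishes, so its result (and isSuccessful()) is meaningful.
    Job blocking = new Job(conf, "blocking-example");
    // ... mapper/reducer and input/output setup elided ...
    boolean ok = blocking.waitForCompletion(true);

    // Non-blocking style: submit() returns immediately while the job is
    // still RUNNING, so poll isComplete() before asking isSuccessful().
    Job async = new Job(conf, "non-blocking-example");
    // ... job setup elided ...
    async.submit();
    while (!async.isComplete()) {
      Thread.sleep(5000); // arbitrary polling interval
    }
    boolean ok2 = async.isSuccessful();
    System.out.println("blocking=" + ok + " async=" + ok2);
  }
}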
Hi Robert,
The exception I see in the output of the Grunt shell and in the Pig log,
respectively, is:
Backend error message
-
java.util.EmptyStackException
at java.util.Stack.peek(Stack.java:102)
at
What speed do people typically see for the copy during a reduce?
From the TaskTracker, here is a typical report:
reduce copy (500 of 504 at 1.52 MB/s)
We have seen it range from 0.5 to 4 MB/s.
That seems a bit slow.
Does anyone else have other benchmark numbers to share?
Hi Brandon,
On Mon, Oct 1, 2012 at 11:23 PM, Brandon bma...@upstreamsoftware.com wrote:
What speed do people typically see for the copy during a reduce?
It varies due to a few factors. But there are highly improved
Netty-based transfers in Hadoop 2.x that you can use for even faster,
and more
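For anyone tuning the copy phase, a small sketch of two commonly adjusted knobs (Hadoop 2.x property names; the values shown are only illustrative, and the 1.x equivalents are noted in comments):

import org.apache.hadoop.conf.Configuration;

public class ShuffleTuning {
  public static Configuration tuned() {
    Configuration conf = new Configuration();
    // Parallel fetch threads per reducer (default 5; the Hadoop 1.x
    // name is mapred.reduce.parallel.copies).
    conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10);
    // Fraction of the reducer heap used to buffer fetched map output
    // during the copy phase (0.70 is the stock default).
    conf.setFloat("mapreduce.reduce.shuffle.input.buffer.percent", 0.70f);
    return conf;
  }
}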
Got it to work by emptying the HADOOP_CLASSPATH variable.
Andy Kartashov
MPAC
Architecture RD, Co-op
1340 Pickering Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787
andy.kartas...@mpac.ca
From: Alexander Pivovarov
I would like to be able to resize a set of inputs, already in SequenceFile
format, to be larger.
I have tried 'hadoop distcp -Ddfs.block.size=$[64*1024*1024]' and did not
get what I expected. The outputs were exactly the same as the inputs.
I also tried running a job with an IdentityMapper and
The script I now want to execute looks like this:
x = load 'tag_count_ts_pro_userpair' as
    (group:tuple(), cnt:int, times:bag{t:tuple(c:chararray)});
y = foreach x generate *, moins.daysFromStart('2011-06-01 00:00:00', times);
store y into 'test_daysFromStart';
The problem is that I do not
Hi Bejoy,
Thank you for the answer; I will try it. But I still have a doubt:
how should I manage connections to HBase inside the job?
Should I open a new connection in each job? How can I set up a
connection pool inside a job?
Thank you,
Pablo
From: Bejoy Ks [mailto:bejoy.had...@gmail.com]
Sent:
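For what it's worth, one common pattern with the HBase client API of this era is to open the HTable once per task attempt in setup() and close it in cleanup(), rather than per record. A minimal sketch (the table name, column family, and key/value types below are placeholders):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class HBaseWriteMapper
    extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    // One connection per task attempt, not per record.
    Configuration conf = HBaseConfiguration.create(context.getConfiguration());
    table = new HTable(conf, "my_table"); // placeholder table name
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException {
    Put put = new Put(Bytes.toBytes(key.get()));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
            Bytes.toBytes(value.toString()));
    table.put(put);
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.close(); // flushes buffered puts and releases the connection
  }
}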
Hello Anna,
If I understand correctly, you have a set of multiple sequence files, each
much smaller than the desired block size, and you want to concatenate them
into a set of fewer files, each one more closely aligned to your desired
block size. Presumably, the goal is to improve throughput of
Hi Anna
If you want to increase the block size of existing files, you can use an
identity mapper with no reducer. Set the min and max split sizes to your
requirement (512 MB). Use SequenceFileInputFormat and SequenceFileOutputFormat
for your job (see the sketch below).
Your job should be done.
Regards
Bejoy KS
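For reference, a minimal sketch of such a job using the new MapReduce API (the Text key/value classes and the 512 MB figure are assumptions; match them to your actual SequenceFiles):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ResizeSeqFiles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Write output files with a 512 MB HDFS block size (the deprecated
    // 1.x property name; 2.x calls it dfs.blocksize).
    conf.setLong("dfs.block.size", 512L * 1024 * 1024);

    Job job = new Job(conf, "resize-seqfiles");
    job.setJarByClass(ResizeSeqFiles.class);
    job.setMapperClass(Mapper.class); // the base Mapper is identity
    job.setNumReduceTasks(0);         // map-only, as described above
    job.setInputFormatClass(SequenceFileInputFormat.class);
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    // Placeholder key/value classes: use your files' actual types.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    // Pin splits to 512 MB so each mapper writes ~one block's worth.
    // Note: splits never span input files, so many tiny inputs still
    // yield one output file per input.
    FileInputFormat.setMinInputSplitSize(job, 512L * 1024 * 1024);
    FileInputFormat.setMaxInputSplitSize(job, 512L * 1024 * 1024);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}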