Yes Harsh, I haven't set dfs.namenode.name.dir anywhere in the config files. My
NameNode went into safe mode again today while it was idle. I shall
try setting this value to something other than /tmp.
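For reference, a minimal hdfs-site.xml entry pointing the NameNode metadata at a
persistent location might look like this (the path is an assumption; anything
outside /tmp that survives reboots will do):

  <property>
    <name>dfs.namenode.name.dir</name>
    <!-- assumed path: a disk that is not cleaned on reboot -->
    <value>/data/hadoop/name</value>
  </property>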
On Tue, Jul 16, 2013 at 6:39 AM, Harsh J ha...@cloudera.com wrote:
2013-07-12 11:04:26,002
Hi,
It doesn't consider where the maps ran when scheduling the reducers, because
the reducers need to contact all the mappers to fetch the map outputs. It
schedules the reducers wherever slots are available.
Thanks
Devaraj k
From: Felix.徐 [mailto:ygnhz...@gmail.com]
Sent: 16 July 2013 09:25
To:
Hello,
I am trying to filter out some records in a table in Hive.
The table has more than 4 billion rows,
and I do a left semi join between the above table and a small table with 1k rows.
However, after the job has run for 3 hours, it ends in a failed status.
My questions are as follows:
Can you try a map-only join?
One of your tables is just 1k records, so a map join will help you run it faster,
and hopefully you will not hit the memory condition.
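For example, with the classic hint (table and column names below are
placeholders; newer Hive versions can also convert the join automatically via
hive.auto.convert.join):

  SELECT /*+ MAPJOIN(small_t) */ big_t.*
  FROM big_t
  LEFT SEMI JOIN small_t ON (big_t.key = small_t.key);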
On Tue, Jul 16, 2013 at 12:56 PM, kira.w...@xiaoi.com wrote:
Hello,
I am trying to filter out some records in a table in Hive.
Thanks for your positive answer.
From your answer I picked up the keyword "map join". Do you
mean that I can do it as this blog says:
http://blog.csdn.net/xqy1522/article/details/6699740
If you don't mind, please take a look at the page.
From: Nitin Pawar
Hi,
In the given image, I see there are some failed/killed map-reduce task
attempts. Could you check why these are failing? You can investigate further based on
the fail/kill reason.
Thanks
Devaraj k
From: kira.w...@xiaoi.com [mailto:kira.w...@xiaoi.com]
Sent: 16 July 2013 12:57
To:
I have checked it. The TaskTracker log shows:
2013-07-16 00:05:31,294 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_201307041810_0138_m_000259_0,53) failed :
org.mortbay.jetty.EofException: timeout
This may be caused by a so-called “data skew” problem.
Thanks,
Dev,
from what I learned in my past experience with running huge single-table queries,
one hits reduce-side memory limits or timeout limits. I will wait for Kira
to give more details on the same.
Sorry, I forgot to ask for the logs and suggested a different approach :(
Kira,
the page is in Chinese, so I can't
Hi all,
I am trying to understand the process of Collect, Spill and Merge in the map phase.
I've referred to a few pieces of documentation but still have a few questions.
Here is my understanding of the spill phase in the map task:
1. The collect function adds a record into the buffer.
2. If the buffer exceeds a threshold
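As an aside on step 2: in Hadoop 1.x the buffer size and the spill threshold
are governed by two properties (the values shown are the stock defaults):

  <property>
    <name>io.sort.mb</name>
    <value>100</value>            <!-- in-memory sort buffer size, in MB -->
  </property>
  <property>
    <name>io.sort.spill.percent</name>
    <value>0.80</value>           <!-- buffer fill fraction that triggers a spill -->
  </property>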
Dear All,
Has anyone faced this issue:
while loading a huge dataset into a Hive table, Hive restricts me from
querying the same table.
I have set hive.support.concurrency=true, but it still shows
"conflicting lock present for TABLENAME mode SHARED"
<property>
  <name>hive.support.concurrency</name>
Hi everyone,
I am trying to import data from PostgreSQL to HDFS, but I am having some
problems. Here are the problem details:
Sqoop Version: 1.4.3
Hadoop Version: 1.0.4
1) When I use this command:

./sqoop import-all-tables --connect jdbc:postgresql://192.168.194.158:5432/IMS --username
Hi Bertrand,
I guess you configured two racks in total: one IDC is one rack, and the other IDC is
another rack.
So if you want to avoid replica re-population while one IDC is down, you have to
change the replica placement policy:
if the minimum number of block replicas is present on one rack, then don't do anything. (here
The error is:
Please set $HBASE_HOME to the root of your HBase installation.
Have you checked whether it is set or not? Have you verified your HBase or
Hadoop installation?
Similarly, the following:
Cannot run program "psql": java.io.IOException: error=2, No such file or
directory
Also
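A quick shell-level sanity check for both errors might be (the HBase path is
an assumption, adjust to your install):

  # Point $HBASE_HOME at the HBase install root
  export HBASE_HOME=/usr/lib/hbase
  # "error=2, No such file or directory" for psql means the binary is not
  # on the PATH of the user running the job
  which psql || echo "psql not found; install the PostgreSQL client tools"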
Thanks Shahab, I solved my problem in another way.
Great. Can you please share, if possible, what was the problem and how you
solved it? Thanks.
Regards,
Shahab
On Tue, Jul 16, 2013 at 9:58 AM, Fatih Haltas fatih.hal...@nyu.edu wrote:
Thanks Shahab, I solved my problem in another way.
Hi,
Please replace 0.0.0.0 with your FTP host's IP address and try it.
Hi,
From,
Ramesh.
On Mon, Jul 15, 2013 at 3:22 PM, Hao Ren h@claravista.fr wrote:
Thank you, Ram
I have configured core-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl"
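For what it's worth, an FTP endpoint in core-site.xml typically looks roughly
like this (user, password, and host below are placeholders; this assumes the
stock Hadoop FTPFileSystem):

  <property>
    <name>fs.default.name</name>
    <!-- placeholder credentials and host; 21 is the default FTP port -->
    <value>ftp://user:password@192.168.1.10:21/</value>
  </property>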
Hi,
Actually, I tested with my own FTP host at first, but it didn't work.
Then I changed it to 0.0.0.0.
Either way, I always get the "cannot access ftp" message.
Thank you.
Hao.
On 16/07/2013 17:03, Ram wrote:
Hi,
Please replace 0.0.0.0 with your FTP host's IP address and try it.
Hi,
Great questions; I am also looking forward to answers from the experts here.
2013/7/16 Felix.徐 ygnhz...@gmail.com
Hi all,
I am trying to understand the process of Collect, Spill and Merge in the map phase.
I've referred to a few pieces of documentation but still have a few questions.
Here is my understanding
Hi
I'm trying to figure out how to incrementally add to an existing output
directory using MapReduce.
I cannot specify the exact output path, as data in the input is sorted into
categories and then written to different directories based on the contents.
(in the examples below, token= or
Hi,
I am trying to query a data set on HDFS using Pig.

Data = LOAD '/user/xx/20130523/*';
x = FOREACH Data GENERATE cookie_id;

I get the error below:
line 2, column 26 Invalid field projection. Projected field [cookie_id]
does not exist
How do I find the column names in the bag Data? The developer
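One way around this (a sketch; the delimiter and the field list are assumptions
about your data) is to declare a schema at load time and then let DESCRIBE
confirm it:

  -- field names and types are assumptions about the file layout
  Data = LOAD '/user/xx/20130523/*' USING PigStorage('\t')
         AS (cookie_id:chararray, url:chararray, ts:long);
  DESCRIBE Data;                       -- prints the declared schema
  x = FOREACH Data GENERATE cookie_id; -- the projection now resolves

Without an AS clause the fields have no names and can only be referenced
positionally, e.g. GENERATE $0.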
Hi,
I am trying to generate random data using Hadoop Streaming with Python. It's a
map-only job and I need to run a number of maps. There is no input to the
map, as it's just going to generate random data.
How do I specify the number of maps to run? (I am confused here because,
if I am not wrong,
This question should be sent to u...@hive.apache.org.
Alan.
On Jul 16, 2013, at 3:23 AM, samir das mohapatra wrote:
Dear All,
Has anyone faced this issue:
while loading a huge dataset into a Hive table, Hive restricts me from querying
the same table.
I have set
Samir, try running the UNLOCK TABLE command and see if it works.
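For example (the table name is a placeholder):

  SHOW LOCKS your_table;   -- inspect the lock that is being held
  UNLOCK TABLE your_table; -- explicitly release it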
On Tue, Jul 16, 2013 at 8:42 PM, Alan Gates ga...@hortonworks.com wrote:
This question should be sent to u...@hive.apache.org.
Alan.
On Jul 16, 2013, at 3:23 AM, samir das mohapatra wrote:
Dear All,
Has anyone faced the
Hi Max,
It can be done by customizing the output format class for your job according
to your expectations. You could refer to the
OutputFormat.checkOutputSpecs(JobContext context) method, which checks the output
specification; you can override this in your custom OutputFormat. You can also
see
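A minimal sketch of that override (the class name and behavior are illustrative,
not a drop-in):

  import java.io.IOException;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapreduce.JobContext;
  import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

  // Tolerates a pre-existing output directory so successive jobs can
  // add files into the same tree.
  public class ExistingDirTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
      // FileOutputFormat.checkOutputSpecs() would throw
      // FileAlreadyExistsException here; we only verify that an
      // output path was configured at all.
      Path outDir = getOutputPath(job);
      if (outDir == null) {
        throw new IOException("Output directory not set.");
      }
    }
  }

Note that skipping the existing-directory check makes you responsible for
avoiding file-name collisions with the output of earlier runs.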
Hi Austin,
Here the number of maps for a job depends on the splits returned by the
InputFormat.getSplits() API. We can have an input format that decides the
number of maps (by returning that many splits) for a job according to the need.
If we use FileInputFormat, the number of splits
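For the streaming case, one common trick (a sketch, not from this thread;
gen_random.py is a hypothetical mapper script) is to feed the job a dummy
input with one line per desired map task and use NLineInputFormat:

  # 100 lines -> 100 map tasks (one line per split by default)
  seq 1 100 > lines.txt
  hadoop fs -put lines.txt /tmp/lines.txt
  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
    -D mapred.reduce.tasks=0 \
    -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat \
    -input /tmp/lines.txt \
    -output /tmp/random-out \
    -mapper gen_random.py \
    -file gen_random.py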