HDFS Proxy in contrib provides an HTTP interface over HDFS. It's not very
RESTful, but we are working on a new version which will have a REST
API.
AFAIK, Oozie will provide REST API for launching MR jobs.
Venkatesh
On Wed, Jul 28, 2010 at 7:31 PM, eluharani zineellabidine
eluhar...@gmail.com wrote:
Hi,
Is there a list of configuration parameters that can be set per job.
Specifically, can one set:
- mapred.tasktracker.map.tasks.maximum
- mapred.tasktracker.reduce.tasks.maximum
- mapred.map.multithreadedrunner.threads
- mapred.child.java.opts
- mapred.task.timeout
Also, I am trying to
In Oozie, we are working on MR/Pig job submission over HTTP.
On Thu, Jul 29, 2010 at 5:09 PM, Steve Loughran ste...@apache.org wrote:
S. Venkatesh wrote:
HDFS Proxy in contrib provides an HTTP interface over HDFS. It's not very
RESTful, but we are working on a new version which will have a REST
API.
Which configuration key controls the number of maximum tasks per node ?
On 28 July 2010 20:40, Joe Stein charmal...@allthingshadoop.com wrote:
mapred.tasktracker.reduce.tasks.maximum is how many you want as a ceiling
per node
you need to configure *mapred.reduce.tasks* to be more than one
mapred.tasktracker.reduce.tasks.maximum
PS: I found this document of default values very useful:
http://hadoop.apache.org/common/docs/r0.18.3/hadoop-default.html
however I failed to find its newer version for 0.20.2.
Regards,
Vitaliy S
On Thu, Jul 29, 2010 at 2:31 PM, Abhinay Mehta
There is no single setting, but the maximum tasks per node is the sum of
the map and reduce maximums you set (so if you set 7 for map and 6 for
reduce, then you will not have more than 13 tasks running on the node as
a result of the 2 settings).
http://hadoop.apache.org/common/docs/current/cluster_setup.html
I think my question was ignored, so I am posting it again:
I am a bit confused about how this attribute is used.
My understanding is that it's related to file reads/writes. And I can see, in
LineReader.java, it's used as the default buffer size for each line; in
BlockReader.newBlockReader(), it's used as
Hi all,
if I use distributed cache to send some files to all the nodes in one MR job,
can I reuse these cached files locally in my next job, or will hadoop re-send
these files again?
Thanks,
-Gang
Hello, everybody!
I have a bunch of records. Each record has key, and two fields A,B - R(k,
A,B)
I want to build two inverted indexes, one per each field. As output I expect
two files
IndexA = (A1 -> [k1,k2,k3,...]), (A2 -> [k1,k2,k4,...]) ...
IndexB = (B1 -> [k1,k2,k3,...]), (B2 -> [k1,k2,k4,...]) ...
Hadoop
Yes.
On Thu, Jul 29, 2010 at 7:57 AM, Alex Luya alexander.l...@gmail.com wrote:
Hi,
Run: ps -aef | grep -i tasktracker
I got this:
-
alex 2425 1 0 22:34 ?00:00:05
Vitaliy,
Here are the default values and parameters for 0.20.2:
http://hadoop.apache.org/common/docs/r0.20.2/core-default.html
http://hadoop.apache.org/common/docs/r0.20.2/mapred-default.html
http://hadoop.apache.org/common/docs/r0.20.2/hdfs-default.html
The default values in the XML
I have changed mapred-site.xml on my namenode and the datanodes to include

<property>
  <name>mapred.userlog.retain.hours</name>
  <value>2</value>
</property>

And yet my job XML still shows 24.
Am I doing anything wrong?
--
View this message in context:
On Thu, Jul 29, 2010 at 2:25 PM, Devajyoti Sarkar dsar...@q-kk.com wrote:
Hi,
Is there a list of configuration parameters that can be set per job.
Specifically, can one set:
- mapred.tasktracker.map.tasks.maximum
- mapred.tasktracker.reduce.tasks.maximum
-
Hi All,
We are planning to hold the next Hadoop India User Group meet up on 31st July
2010 in Noida, India.
The registration and event details are available at -
http://hugindia-absolutezeroforum.eventbrite.com/
We currently have the following talks lined up-
-
Hadoop does not prevent you from writing a key/value pair multiple times in
the same map iteration, if that is your roadblock.
You can call collector.collect() multiple times with the same or distinct
key/value pairs within a single map iteration.
-Rahul
On Thu, Jul 29, 2010 at 8:10 AM,
Have you restarted your cluster?
You can actually specify this parameter in JobConf.
See the usage:
TaskLog.cleanup(job.getInt("mapred.userlog.retain.hours", 24));
./src/mapred/org/apache/hadoop/mapred/Child.java
On Thu, Jul 29, 2010 at 10:30 AM, vishalsant vishal.santo...@gmail.comwrote:
On Thu, Jul 29, 2010 at 1:30 PM, vishalsant vishal.santo...@gmail.com wrote:
I have changed mapred-site.xml on my namenode and the datanodes to include

<property>
  <name>mapred.userlog.retain.hours</name>
  <value>2</value>
</property>

And yet my job XML still shows 24.
Am I doing anything
Hi,
if I use distributed cache to send some files to all the nodes in one MR job,
can I reuse these cached files locally in my next job, or will hadoop re-send
these files again?
Cache files are reused across Jobs. From trunk onwards, they will be
restricted to be reused across jobs of the
Hi,
Is there a list of configuration parameters that can be set per job.
I'm almost certain there's no list that documents per-job settable
parameters that well. From 0.21 onwards, I think the convention adopted
is to name all job-related or task-related parameters to include 'job'
or 'map' or