ctions.
The project team will be announcing a release vote shortly for Apache Hadoop
2.0.1-alpha, which will comprise the contents of Apache Hadoop
2.0.0-alpha, this security patch, and a few patches for YARN.
Best,
Aaron T. Myers
Software Engineer, Cloudera
CVE-2012-3376: Apache Hado
ael Segel wrote:
If you are going this route, why not net boot the nodes in the cluster?
Sent from my iPhone
On Jan 30, 2012, at 8:17 PM, "Patrick Angeles" wrote:
Hey Aaron,
I'm still skeptical when it comes to flash drives, especially as it pertains
to Hadoop. The write cycl
you don't want a large
Hive query to knock out your RegionServers, thereby causing cascading
failures.
We were thinking about another cluster that would just run Hive jobs.
We do not have that flexibility at the moment.
On 01/30/2012 09:17 PM, Patrick Angeles wrote:
Hey Aaron,
I'm sti
ed Java VM overhead
across services, which comes to a maximum of around 16-20 GB used.
This gives us around 4-8GB for tasks that would work with HBase. We may
also use Hive on the same cluster for queries.
On 01/30/2012 05:40 PM, Aaron Tokhy wrote:
Hi,
Our group is trying to set up a pro
Hi,
Our group is trying to set up a prototype for what will eventually
become a cluster of ~50 nodes.
Anyone have experiences with a stateless Hadoop cluster setup using this
method on CentOS? Are there any caveats with a read-only root file
system approach? This would save us from having
Using PigStorage(), my Pig script output gets put into partial files on the
Hadoop file system.
When I use the copyToLocal function from Hadoop, it creates a local directory
with all the partial files.
Is there a way to copy the partial files from Hadoop into a single local file?
Thanks
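One way to do this, assuming all the part files sit in a single output directory,
is the built-in getmerge command, which concatenates every file in an HDFS
directory into one local file (the paths below are only placeholders):

hadoop fs -getmerge /user/me/pig-output /tmp/pig-output.txt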
schedule:
- 6pm - Welcome
- 6:30pm - Introductions; start creating agenda
- Breakout sessions begin as soon as we're ready
- 8pm - Conclusion
Food and refreshments will be provided, courtesy of Splunk.
Please RSVP at http://www.meetup.com/hadoopsf/events/41427512/
Regards,
- Aaron Kimball
So, I'm trying to write out an Hbase Result object (same one I get from my
TableMapper) to a SequenceFileOutputFormat from my Reducer as the value, but
I'm getting an error when it's trying to get a serializer. It looks like the
SerializationFactory can't find a Serialization (only one listed in
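For background, the SerializationFactory only looks at the classes listed in the
io.serializations configuration property, so one direction worth checking (a sketch,
not necessarily the fix for this particular error) is whether a Serialization for the
value type is registered there before the job is submitted:

Configuration conf = job.getConfiguration();
// WritableSerialization is the default entry; the HBase ResultSerialization class
// named here is an assumption and only exists in some HBase versions.
conf.setStrings("io.serializations",
    "org.apache.hadoop.io.serializer.WritableSerialization",
    "org.apache.hadoop.hbase.mapreduce.ResultSerialization");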
up.com/hadoopsf/events/35650052/
Regards,
- Aaron Kimball
Yea, we don't want it to sit there waiting for the Job to complete, even if
it's just a few minutes.
--Aaron
-Original Message-
From: turboc...@gmail.com [mailto:turboc...@gmail.com] On Behalf Of John Conwell
Sent: Thursday, September 29, 2011 10:50 AM
To: common-user@hadoop.
obTracker, which runs them all in order without the
client application needing to do anything further.
Sounds like that doesn't really exist as part of Hadoop framework, and needs
something like Oozie (or a home-built system) to do this.
--Aaron
-Original Message-
From: Harsh J [
or is a fire &
forget, and occasionally check back to see if it's done. So client-side doesn't
need to really know anything or keep track of anything. Does something like
that exist within the Hadoop framework?
--Aaron
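For the single-job case, the old mapred API does support this pattern:
JobClient.submitJob() returns immediately with a RunningJob handle that can be
polled later. A rough sketch (the JobConf setup is omitted):

JobClient client = new JobClient(jobConf);
RunningJob job = client.submitJob(jobConf);  // returns as soon as the job is submitted
String jobId = job.getID().toString();       // keep this around to look the job up later
// ... sometime later ...
if (job.isComplete()) {
    System.out.println("finished, successful = " + job.isSuccessful());
}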
Are you sure you have the right port number? As you say, if it's been
reconfigured, could they have changed the port the NN runs on? Also, could they
have changed the hostname of the NN? Could it be that instead of connecting to
the NN you are actually trying to connect to one of the datanodes?
--
o disregard all permissions on HDFS, you can just set
the config value "dfs.permissions" to "false" and restart your NN. This is
still overkill, but at least you could roll back if you change your mind
later. :)
--
Aaron T. Myers
Software Engineer, Cloudera
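For reference, a sketch of that setting as it would appear in hdfs-site.xml (this is
the 0.20-era property name; later releases rename it dfs.permissions.enabled):

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>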
don't really have a clue why we're seeing this behavior.
We're running on FreeBSD with the Diablo-JVM (Java 1.6), which a guy on their
list feels is a pretty unusual configuration that people aren't really running.
--Aaron
-Original Message-
From: john smith [mailt
n you attach a KVM
to a machine when it becomes unreachable and take a look? Or add some
monitoring to keep an eye on the network mbufs? Don't know if this is your
problem as well or not.
--Aaron
-Original Message-
From: john smith [mailto:js1987.sm...@gmail.com]
Sent: Thursday, Se
s its check, it resumes
talking to the NN and the NN adds it back in.
--Aaron
-Original Message-
From: john smith [mailto:js1987.sm...@gmail.com]
Sent: Thursday, September 15, 2011 3:07 PM
To: common-user@hadoop.apache.org
Subject: Datanodes going down frequently
Hi all,
I am running a 10
n the Cassandra config those appenders are not re-activated.
I'll put up a patch to make the TaskLogAppender a little safer by checking
if it's closed before flushing. But we need to keep the different configs away
from each other.
Cheers
-----
Aaron Morton
Freelance Cass
that would keep
my code's static blocks from reconfiguring log4j.
-Original Message-----
From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Thursday, August 18, 2011 9:30 AM
To: common-user@hadoop.apache.org
Subject: Re: NPE in TaskLogAppender
An update in case anyone else has this p
this problem?
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 15/08/2011, at 2:04 PM, aaron morton wrote:
> I'm running the Cassandra Brisk server with Hadoop core 20.203 on OSX,
> everything is local.
>
> I
* setting mapred.acls.enabled to true
* setting mapred.queue.default.acl-submit-job and
mapred.queue.default.acl-administer-jobs to *
There was no discernible increase in joy though.
Any thoughts ?
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronm
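For anyone comparing notes, those settings rendered as configuration entries would
look roughly like the following (whether they belong in mapred-site.xml or
mapred-queue-acls.xml depends on the Hadoop version, so treat this as a sketch):

<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>
<property>
  <name>mapred.queue.default.acl-submit-job</name>
  <value>*</value>
</property>
<property>
  <name>mapred.queue.default.acl-administer-jobs</name>
  <value>*</value>
</property>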
I'm curious, what error could be thrown that can't be handled via try/catch by
catching Exception or Throwable?
--Aaron
-Original Message-
From: Maheshwaran Janarthanan [mailto:ashwinwa...@hotmail.com]
Sent: Tuesday, August 09, 2011 10:41 AM
To: HADOOP USERGROUP
Subject: RE: Sk
If the 3rd party library is used as part of your Map() function, you could just
catch the appropriate Exceptions, and simply not emit that record and return
from the Map() normally.
--Aaron
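A minimal sketch of that pattern (ThirdPartyParser and the key/value types are
placeholders, not taken from this thread):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SkipBadRecordsMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        try {
            // Placeholder for the third-party call that may blow up on bad input.
            String parsed = ThirdPartyParser.parse(value.toString());
            context.write(new Text(key.toString()), new Text(parsed));
        } catch (RuntimeException e) {
            // Bad record: count it and return without emitting anything.
            context.getCounter("app", "skipped_records").increment(1);
        }
    }
}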
-Original Message-
From: Maheshwaran Janarthanan [mailto:ashwinwa...@hotmail.com]
Sent: Tuesday
It was my understanding (could easily be wrong) that 0.21.0 was never going to
be considered a stable, production version and 0.22.0 was going to be the next
big stable revision.
--Aaron
-Original Message-
From: Roger Chen [mailto:rogc...@ucdavis.edu]
Sent: Friday, July 29, 2011 10:20
Does this mean 0.22.0 has reached stable and will be released as the stable
version soon?
--Aaron
-Original Message-
From: Robert Evans [mailto:ev...@yahoo-inc.com]
Sent: Thursday, July 28, 2011 6:39 AM
To: common-user@hadoop.apache.org
Subject: Re: next gen map reduce
It has not been
when
submitting from a Windows machine.
--Aaron
-Original Message-
From: Paolo Castagna [mailto:castagna.li...@googlemail.com]
Sent: Wednesday, June 29, 2011 11:54 PM
To: common-user@hadoop.apache.org
Subject: NullPointerException when running multiple reducers with Hadoop
0.22.0-SNAPSHOT
Breakout sessions begin as soon as we're ready
* 8pm - Conclusion
Food and refreshments will be provided, courtesy of CBSi.
I hope to see you there! Please RSVP at http://bit.ly/kLpLQR so we can get
an accurate count for food and beverages.
Cheers,
- Aaron Kimball
the output, and as soon as that
is exceeded, simply return at the top of the reduce() function.
Is there any way to optimize it even more to tell the Reduce task, "stop
reading data, I don't need any more data"?
--Aaron
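A sketch of that counter approach (the cap of 100 is an arbitrary example value):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CappedReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    private static final int MAX_OUTPUT = 100; // example cap
    private int emitted = 0;                   // persists across reduce() calls in this task

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        if (emitted >= MAX_OUTPUT) {
            return; // skip the work, though the task still has to read its input
        }
        long sum = 0;
        for (LongWritable v : values) {
            sum += v.get();
        }
        context.write(key, new LongWritable(sum));
        emitted++;
    }
}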
ons begin as soon as we're ready
- 8pm - Conclusion
Food and refreshments will be provided, courtesy of RichRelevance.
If you're going to attend, please RSVP at http://bit.ly/kxaJqa.
Hope to see you all there!
- Aaron Kimball
the NPE from the getCounters(). Stack trace is below. Anyone
have any ideas what's happening? Is the JobClient not meant to be persistent,
and should I create a new one every single time?
--Aaron
java.lang.NullPointerException
at org.apache.hadoop.mapred.Counters.downgrade
Conclusion
Food and refreshments will be provided, courtesy of Cloudera. Please RSVP
at http://bit.ly/hwMCI2
Looking forward to seeing you there!
Regards,
- Aaron Kimball
I'll volunteer to proof read & test it ;)
I've been meaning to get around to using the new API, just haven't had the time
to learn it and convert all the existing MR jobs to it.
--Aaron
-Original Message-
From: Mark Kerzner [mailto:markkerz...@gmail.com]
Sent: Wednes
ntext class states that only the default filesystem and umask are pulled
from the Configuration object.
Any documentation that I'm missing? Do I need to go look through the source
code?
--Aaron
Ok, thanks. Guess I'm just having no luck getting my posts replied to.
Aaron Baff | Developer | Telescope, Inc.
email: aaron.b...@telescope.tv | office: 424 270 2913 | www.telescope.tv
Does anyone see this? Can someone at least respond to this to indicate that
it's getting to the mailing list fine? I've just gotten 0 replies to a few
previous emails, so I'm wondering if nobody is seeing these, or if people
just don't have any idea.
--Aaron
he sysadmin to restart the DFS. This will be early
tomorrow at the earliest, but I can try just about any other suggestions. Help!
--Aaron
03-21-11 15:58:17 [INFO ] Exception in createBlockOutputStream
java.io.EOFException
03-21-11 15:58:17 [WARN ] Error Recovery for block
blk_8212105008236569520_
ise unavailable, and only a specific
subset of the nodes are capable of actually receiving the writes. But in
this regard, Sqoop is no different than any other custom MapReduce program
you might write; it's not particularly more or less resilient to any
pathological conditions of the underlying
files (5, 10,
whichever, just >1), then it runs fine, no issues. Very strange, anyone have
any ideas?
--Aaron
stderr logs
log4j:ERROR Failed to flush writer,
java.io.InterruptedIOException
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOu
s begin as soon as we're ready
* 8pm - Conclusion
Regards,
- Aaron Kimball
ate memory" errors are coming from. If
they're from the OS, could it be because it needs to fork() and momentarily
exceed the ulimit before loading the native libs?
- Aaron
On Fri, Mar 4, 2011 at 1:26 PM, Aaron Kimball wrote:
> I don't know if putting native-code .so files in
;t know if it is true for
native libs as well.)
- Aaron
On Fri, Mar 4, 2011 at 12:53 PM, Ratner, Alan S (IS) wrote:
> We are having difficulties running a Hadoop program making calls to
> external libraries - but this occurs only when we run the program on our
> cluster and not from
d volunteer to
facilitate a discussion. All members of the Hadoop community are welcome to
attend. While all Hadoop-related subjects are on topic, this month's
discussion theme is "integration."
Regards,
- Aaron Kimball
s its TTL, and then that slot becomes
available for another Job. Is there a way to adjust this TTL? Or be able to
re-use the JVM for a different Job? This is all with 0.21.0.
--Aaron
r.freeHost(ShuffleScheduler.java:345)
at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152)
Aaron Baff | Developer | Telescope, Inc.
email: aaron.b...@telescope.tv | office: 424 270 2913 | www.telescope.tv
the theme of "integration."
Yelp has asked that all attendees RSVP in advance, to comply with their
security policy. Please join the meetup group and RSVP at
http://www.meetup.com/hadoopsf/events/16678757/
Refreshments will be provided.
Regards,
- Aaron Kimball
> On Thu, Feb 17, 2011 at 12:09 AM, Aaron Baff wrote:
>> I'm submitting jobs via JobClient.submitJob(JobConf), and then waiting until
>> it completes with RunningJob.waitForCompletion(). I then want to get how
>> long the entire MR takes, which appears t
;t give me access to the data I'm
looking for. I'm specifically looking at org.apache.hadoop.mapreduce.JobStatus
and its getStartTime() and getFinishTime() methods. The only place I've seen
to get a JobStatus object is the JobClient getAllJobs(), getJobsFromQueue(),
and job
way I can see how to do it right
now is JobClient.getAllJobs(), which gives me an array of all the jobs that are
submitted (currently running? all previous?). Anyone know how I could go about
doing this?
--Aaron
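As a crude fallback that sidesteps JobStatus entirely, the elapsed time can be
measured client-side around the blocking call (a sketch using the old mapred API):

long start = System.currentTimeMillis();
RunningJob job = jobClient.submitJob(jobConf);
job.waitForCompletion();                       // blocks until the job finishes
long elapsedMs = System.currentTimeMillis() - start;
System.out.println("Job " + job.getID() + " took " + elapsedMs + " ms");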
etup announcement! Sign up at
http://www.meetup.com/hadoopsf/
Regards,
- Aaron Kimball
I think it wants you to type a capital Y, as silly as that may sound...
On Feb 4, 2011, at 7:38 AM, ahmednagy wrote:
>
> I have a cluster with a master and 7 nodes. When I try to start Hadoop it
> starts the mapreduce processes and the hdfs processes on all the nodes. I
> formatted the hdfs but
ed in multiple availability zones in the us-west and us-east
regions and the experience has been the same. For cc1.4xlarge instances I've
only tested in us-east.
On Tue, Feb 1, 2011 at 7:48 AM, Steve Loughran wrote:
> On 31/01/11 23:22, Aaron Eng wrote:
>
>> Hi all,
>>
>
Hi all,
I was wondering if any of you have had a similar experience working with
Hadoop in Amazon's environment. I've been running a few jobs over the last
few months and have noticed them taking more and more time. For instance, I
was running teragen/terasort/teravalidate as a benchmark and I'v
Start with the student's CS department's web server?
I believe the wikimedia foundation also makes the access logs to wikipedia
et al. available publicly. That is quite a lot of data though.
- Aaron
On Sun, Jan 30, 2011 at 10:54 AM, Bruce Williams
wrote:
> Does anyone know of a so
up(Context context) {
    logger.setLevel(Level.DEBUG);
}
- Aaron
On Wed, Dec 15, 2010 at 2:23 PM, W.P. McNeill wrote:
> I'm running on a cluster. I'm trying to write to the log files on the
> cluster machines, the ones that are visible through the jobtracker web
> interfa
"syslog" in the right-most column.
- Aaron
On Mon, Dec 13, 2010 at 10:05 AM, W.P. McNeill wrote:
> I would like to use Hadoop's Log4j infrastructure to do logging from my
> map/reduce application. I think I've got everything set up correctly, but
> I
> am still una
Pros:
- Easier to build out and tear down clusters vs. using physical machines in
a lab
- Easier to scale up and scale down a cluster as needed
Cons:
- Reliability. In my experience I've had machines die, had machines fail to
start up, had network outages between Amazon instances, etc. These pro
they need to terminate the worker
JVM's? Or is there a setting to reduce the time that the worker JVM's hang
around before terminating?
--Aaron
Can you send the mapred-site.xml config for reference? It could be a
formatting issue. I've seen that problem when there was a typo in the XML
after hand-editing.
On Tue, Nov 23, 2010 at 10:35 AM, Skye Berghel wrote:
> On 11/19/2010 10:07 PM, Harsh J wrote:
>
>> How are you starting your JobTr
Maybe try doing a "grep -R local " to see if it's picking it up
from somewhere in there. Also, maybe try specifying an actual IP instead of
myserver as a test to see if name resolution is an issue.
On Fri, Nov 19, 2010 at 5:56 PM, Skye Berghel wrote:
> I'm trying to set up a Hadoop cluster. Howe
>On Thu, Nov 11, 2010 at 4:29 PM, Aaron Baff wrote:
>
>> I'm having a problem with a custom WritableComparable that I created
>> to use as a Key object. I basically have a number of identifiers with
>> a timestamp, and I'm wanting to group the Identifier&
f the 2 Identifiers. Have I made a wrong assumption
somewhere about how it's supposed to work? Did I do something wrong?
--Aaron
public class IdentifierTimestampKey implements WritableComparable {
private String identifier = "";
private long timestamp = 0L;
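For reference, a complete sketch of a key along these lines (a generic illustration,
not the poster's actual class; note that grouping all timestamps for one identifier
into a single reduce() call would additionally need a grouping comparator, not shown):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class IdentifierTimestampKey implements WritableComparable<IdentifierTimestampKey> {
    private String identifier = "";
    private long timestamp = 0L;

    public void write(DataOutput out) throws IOException {
        out.writeUTF(identifier);
        out.writeLong(timestamp);
    }

    public void readFields(DataInput in) throws IOException {
        identifier = in.readUTF();
        timestamp = in.readLong();
    }

    public int compareTo(IdentifierTimestampKey other) {
        int cmp = identifier.compareTo(other.identifier);
        if (cmp != 0) {
            return cmp;
        }
        return timestamp < other.timestamp ? -1 : (timestamp == other.timestamp ? 0 : 1);
    }

    @Override
    public int hashCode() {
        // Hash on the identifier only so HashPartitioner sends all timestamps
        // for a given identifier to the same reducer.
        return identifier.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof IdentifierTimestampKey)) {
            return false;
        }
        IdentifierTimestampKey k = (IdentifierTimestampKey) o;
        return identifier.equals(k.identifier) && timestamp == k.timestamp;
    }
}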
>bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z]+'
Have you tried specifying the actual file name instead of using the '*'
wildcard?
On Tue, Nov 9, 2010 at 2:10 PM, Fabio A. Miranda
wrote:
> Given a fresh installation, I followed the Single Node Setup doc from
> hadoop websit
Did you set the namenode URI?
2010-11-09 15:38:38,255 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
java.lang.IllegalArgumentException: Invalid URI for NameNode address
(check fs.defaultFS): file:/// has no authority.
You should have some config defined in the core-site.xml file similar t
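A sketch of such an entry (the hostname and port are placeholders; very old releases
use fs.default.name instead of fs.defaultFS):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>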
Hi Fabio,
I found this site extremely helpful in explaining how to do a one node setup
for a first time user:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
On Tue, Nov 9, 2010 at 10:54 AM, Fabio A. Miranda wrote:
> Hello,
>
>
> > You don't need 4 mach
in joining us, please fill out the following:
* I've created a short survey to help understand days / times that would
work for the most people: http://bit.ly/ajK26U
* Please also join the meetup group at http://meetup.com/hadoopsf -- We'll
use this to plan the event, RSVP information, et
read from Cassandra and create a file should be OK.
You can then copy it onto the HDFS and read from there.
Hope that helps.
Aaron
On 20 Oct 2010, at 04:01, Mark wrote:
> As the subject implies I am trying to dump Cassandra rows into Hadoop. What
> is the easiest way for me to accomplis
. I'm pretty sure you can give it either a single file or a
directory, in which case it will show you the details for every file
under that directory.
Hope that helps.
Aaron
class that distinguishes
between the two 'types' based on an additional field?
Aaron Baff | Developer | Telescope, Inc.
email: aaron.b...@telescope.tv | office: 424 270 2913 | www.telescope.tv
b where you map the Count as the Key, and Item as the Value,
use 1 Reducer, and Identity Reduce it (i.e. don't do any reducing, just output
the Count,Item).
Aaron Baff | Developer | Telescope, Inc.
email: aaron.b...@telescope.tv | office: 424 270 2913 | www.telescope.tv
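A sketch of that second job's driver (class names are placeholders; with no reducer
class set, Hadoop's default Reducer is the identity reduce, so it just writes out
whatever the mapper emitted):

Job sortJob = new Job(conf, "sort-by-count");
sortJob.setJarByClass(SortByCountMapper.class);  // placeholder mapper that emits (count, item)
sortJob.setMapperClass(SortByCountMapper.class);
sortJob.setMapOutputKeyClass(LongWritable.class);
sortJob.setMapOutputValueClass(Text.class);
sortJob.setOutputKeyClass(LongWritable.class);
sortJob.setOutputValueClass(Text.class);
sortJob.setNumReduceTasks(1);                    // a single reducer yields one totally ordered file
// no setReducerClass() call, so the identity Reducer runs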
    /**
     * @return the oli
     */
    public byte getOli() {
        return oli;
    }

    /**
     * @param oli the oli to set
     */
    public void setOli(byte oli) {
        this.oli = oli;
    }
}
and /dataset2/part-(n) in your
mapper.
If you wanted to be more clever, it might be possible to subclass
MultiFileInputFormat to group together both datasets "file-number-wise" when
generating splits, but I don't have specific guidance here.
- Aaron
On Sat, Jul 3, 2010 at 9
Is there a reason you're using that particular interface? That's very
low-level.
See http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample for the proper
API to use.
- Aaron
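For reference, the usual shape of that API (paths here are placeholders):

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path path = new Path("/user/me/example.txt");  // placeholder path
FSDataOutputStream out = fs.create(path);
out.writeBytes("hello hdfs\n");
out.close();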
On Sat, Jul 3, 2010 at 1:36 AM, Vidur Goyal wrote:
> Hi,
>
> I am trying to create a file in
ight? For data at either "edge" of your
problem--either input or final output data--you might want the greater
ubiquity of text-based files.
- Aaron
On Fri, Jul 2, 2010 at 3:35 PM, Joe Stein wrote:
> David,
>
> You can also set compression to occur of your data between your map
estion is probably more "correct," but might incur additional
work on your part.
Cheers,
- Aaron
On Thu, Jun 10, 2010 at 3:54 PM, Allen Wittenauer
wrote:
>
> On Jun 10, 2010, at 3:25 AM, Ahmad Shahzad wrote:
> > Reason for doing that is that i want all the communication to happen
urce at http://github.com/cloudera/sqoop) :)
Cheers,
- Aaron
On Thu, May 6, 2010 at 7:32 AM, Zhenyu Zhong wrote:
> Hi,
>
> I tried to use CombineFileInputFormat in 0.20.2. It seems I need to extend
> it because it is an abstract class.
> However, I need to implement getRecordRe
f you have any questions about this move process, please ask me.
Regards,
- Aaron Kimball
Cloudera, Inc.
rate package to install Sqoop independent of the rest of CDH; thus no
extra download link on our site.
I hope this helps!
Good luck,
- Aaron
On Wed, Mar 17, 2010 at 4:30 AM, Reik Schatz wrote:
> At least for MRUnit, I was not able to find it outside of the Cloudera
> distribution (CDH). Wha
rsion of Hadoop and recompile, but that might be tricky since the
filenames will most likely not line up (due to the project split).
- Aaron
On Tue, Mar 16, 2010 at 8:11 AM, Aleksandar Stupar <
stupar.aleksan...@yahoo.com> wrote:
> Hi all,
>
> I want to use CombineFileInputFormat i
If it's terminating before you even run a job, then you're in luck -- it's
all still running on the local machine. Try running it in Eclipse and use
the debugger to trace its execution.
- Aaron
On Wed, Mar 3, 2010 at 4:13 AM, Rakhi Khatwani wrote:
> Hi,
>I am ru
We've already got a lot of mailing lists :) If you send questions to
mapreduce-user, are you not getting enough feedback?
- Aaron
On Wed, Mar 3, 2010 at 12:09 PM, Michael Kintzer
wrote:
> Hi,
>
> Was curious if anyone else thought it would be useful to have a separate
> mail li
Look at implementing your own Partitioner implementation to control which
records are sent to which reduce shards.
- Aaron
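A minimal sketch of such a Partitioner (key/value types and the routing rule are
only illustrative):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class SkewingPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        if (numPartitions == 1) {
            return 0;
        }
        // Example rule to create deliberate imbalance: dump a big slice of the
        // key space onto partition 0 and hash the rest over the remaining ones.
        if (key.toString().compareTo("m") < 0) {  // arbitrary split point
            return 0;
        }
        return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
    }
}

// in the driver: job.setPartitionerClass(SkewingPartitioner.class);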
On Wed, Mar 3, 2010 at 12:15 PM, Gang Luo wrote:
> Hi all,
> I want to generate some datasets with data skew to test my mapreduce jobs.
> I am using TPC-DS but i
(including the one
in the most recently-released CDH2: 0.20.1+169.56-1) include MAPREDUCE-1146
which eliminates that dependency.
- Aaron
On Tue, Feb 16, 2010 at 3:19 AM, Steve Loughran wrote:
> Thomas Koch wrote:
>
>> Hi,
>>
>> I'm working on the Debian package fo
pose of the config
file comment is to let you know that you're free to pick a path name like
"/system/mapred" here even though your local Linux machine doesn't have a
path named "/system"; this HDFS path is in a separate (HDFS-specific)
namespace from "/home",
job, that's another story, but you can accomplish that with:
$HADOOP_HOME/bin/hadoop dfsadmin -safemode wait
... which will block until HDFS is ready for user commands in read/write
mode.
- Aaron
On Fri, Feb 12, 2010 at 8:44 AM, Sonal Goyal wrote:
> Hi
>
> I had faced a similar i
Can you post the entire exception with its accompanying stack trace?
- Aaron
On Thu, Feb 11, 2010 at 5:26 PM, Prabhu Hari Dhanapal <
dragonzsn...@gmail.com> wrote:
> @ Jeff
> I seem to have used the Mapper you are pointing to ...
>
>
> import org.apache.hadoop.mapred.Ma
There's an older mechanism called MultipleOutputFormat which may do what you
need.
- Aaron
On Fri, Feb 5, 2010 at 10:13 AM, Udaya Lakshmi wrote:
> Hi,
> MultipleOutput class is not available in hadoop 0.18.3. Is there any
> alternative for this class? Please point me useful li
dvice.
- Aaron
On Fri, Jan 29, 2010 at 8:32 AM, Jones, Nick wrote:
> A single unity reducer should enforce a merge and sort to generate one
> file.
>
> Nick Jones
>
> -Original Message-
> From: Jeff Zhang [mailto:zjf...@gmail.com]
> Sent: Friday, January 29, 20
pretty good support for parallel imports, and uses this
InputFormat instead.
- Aaron
On Thu, Jan 28, 2010 at 11:39 AM, Nick Jones wrote:
> Hi all,
> I have a use case for collecting several rows from MySQL of
> compressed/unstructured data (n rows), expanding the data set, and storin
Brian, it looks like you missed a step in the instructions. You'll need to
format the hdfs filesystem instance before starting the NameNode server:
You need to run:
$ bin/hadoop namenode -format
.. then you can do bin/start-dfs.sh
Hope this helps,
- Aaron
On Sat, Jan 30, 2010 at 12:
See http://wiki.apache.org/hadoop/HowToContribute for more step-by-step
instructions.
- Aaron
On Fri, Jan 22, 2010 at 7:36 PM, Kay Kay wrote:
> Start with hadoop-common to start building .
>
> hadoop-hdfs / hadoop-mapred pull the dependencies from apache snapshot
> repository that
Note that org.apache.hadoop.mapreduce.lib.output.MultipleOutputs is
scheduled for the next CDH 0.20 release -- ready "soon."
- Aaron
2010/1/6 Amareshwari Sri Ramadasu
> No. It is part of branch 0.21 onwards. For 0.20*, people can use old api
> only, though JobCo
You'll need to configure mapred.fairscheduler.allocation.file to point to
your fairscheduler.xml file; this file must contain at least the following:
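Assuming no pools or per-pool limits are being defined yet, the minimal file is
just the empty root element:

<?xml version="1.0"?>
<allocations>
</allocations>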
- Aaron
On Thu, Dec 10, 2009 at 10:34 PM, Rekha Joshi wrote:
> What’s your Hadoop version/distribution? In any case, to eliminate th
You need to send a jar to the cluster so it can run your code there. Hadoop
doesn't magically know which jar is the one containing your main class, or
that of your mapper/reducer -- so you need to tell it via that call so it
knows which jar file to upload.
- Aaron
On Sun, Nov 29, 2009 at 7:
f course, by the time
you've got several hundred GB of data to work with, your current workload
imbalance issues should be moot anyway.
- Aaron
On Fri, Nov 27, 2009 at 4:33 PM, CubicDesign wrote:
>
>
> Aaron Kimball wrote:
>
>> (Note: this is a tasktracker setting, not a jo
ual records
require around a minute each to process as you claimed earlier, you're
nowhere near in danger of hitting that particular performance bottleneck.
- Aaron
On Thu, Nov 26, 2009 at 12:23 PM, CubicDesign wrote:
>
>
> Are the record processing steps bound by a local machine
fault.name and mapred.job.tracker; when the day comes that these
services are placed on different nodes, you'll then be able to just move one
of the hostnames over and not need to reconfigure all 20--40 other nodes.
- Aaron
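One way to read this advice (hostnames and ports are placeholders): give each
service its own name, even if both names resolve to the same machine today:

<!-- core-site.xml -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>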
On Thu, Nov 26, 2009 at 8:27 PM, Srigurunath Chakravarthi <
srig..
ppear in
plaintext if a human operator is inspecting the output for debugging.
- Aaron
On Thu, Nov 26, 2009 at 4:59 PM, Mark Kerzner wrote:
> It worked!
>
> But why is it "for testing?" I only have one job, so I need by related as
> text, can I use this fix all the time?
&
When you set up the Job object, do you call job.setJarByClass(Map.class)?
That will tell Hadoop which jar file to ship with the job and to use for
classloading in your code.
- Aaron
On Thu, Nov 26, 2009 at 11:56 PM, wrote:
> Hi,
> I am running the job from command line. The job runs f
(probably either 'root' or 'hadoop') will
need the ability to mkdir /home/hadoop/hadoop-root underneath of
/home/hadoop. If that directory doesn't exist, or is chown'd to someone
else, this will probably be the result.
- Aaron
On Thu, Nov 26, 2009 at 10:22
' --> directory not found. When you mkdir'd
'lol', you were actually effectively doing "mkdir -p /user/hadoop/lol", so
then it created your home directory underneath of that.
- Aaron
On Tue, Nov 10, 2009 at 1:30 PM, zenkalia wrote:
> ok, things are working.. i mu