IOPS are quite relevant. Just recall that they
are not the be-all, end-all of HDFS performance measurement. It's not the
primary number I would look for! Each install will have its own requirements.
Brian
On Oct 23, 2012, at 6:01 PM, Rita rmorgan...@gmail.com wrote:
I was curious because
hadoop
hadoop = 1.0.3-1
Normally, you would expect to see something like this (using the CDH4
distribution as an example) as it contains a shared library:
[bbockelm@brian-test ~]$ rpm -q --provides hadoop-libhdfs
libhdfs.so.0()(64bit)
hadoop-libhdfs = 2.0.0+88-1.cdh4.0.0.p0.30.osg.el5
libhdfs.so
Hi Ralph,
I admit - I've only been half-following the OpenMPI progress. Do you have a
technical write-up of what has been done?
Thanks,
Brian
On May 20, 2012, at 9:31 AM, Ralph Castain wrote:
FWIW: Open MPI now has an initial cut at MR+ that runs map-reduce under any
HPC environment. It doesn't attempt the
same level of data integration as Hadoop does, so it tackles a much simpler
problem (i.e., bring-your-own-data-management!).
/condor-geek
Brian
an estimated CPU-millennia per byte of data… they needed a general-purpose
cluster, for a certain value of "general purpose".
Brian
On Dec 14, 2011, at 7:29 AM, Michael Segel wrote:
Aw Tommy,
Actually no. You really don't want to do this.
If you actually ran a cluster and worked in the real world
Erm - actually, the heap size you are specifying for the child is 1TB if I'm
counting numbers correctly. Is it possible that Java is bombing because Linux
isn't allowing you to overcommit that much memory?
Brian
On Nov 28, 2011, at 6:21 AM, Harsh J wrote:
Hoot,
Your settings of 10 GB per
was changed, why it worked better on
the target platform, and how the optimization will affect your target platform.
Extremely hard difficulty.
Brian
On Nov 17, 2011, at 9:02 AM, Amir Sanjar wrote:
Are there any specific development, build, and packaging guidelines to add
support for a new
Hi Deepti,
That appears to crash deep in pthread, which would scare me a bit. Are you
using a strange/non-standard platform? What Java version? What HDFS version?
Brian
On Oct 14, 2011, at 3:59 AM, Banka, Deepti wrote:
Hi,
I am trying to run FUSE and it's crashing randomly
Normal operation is a function of hardware. Giving the version without the
underlying hardware means I get to make up any answer I feel like.
I can't imagine a rational set of hardware where 10 megabits (that is, roughly one
megabyte) a second is normal.
Brian
On Oct 8, 2011, at 3:04 AM, Bochun Bai
latency.
2) As Paul pointed out, you have to ask yourself whether the SAN is shared or
dedicated. Many SANs don't have the ability to strongly partition workloads
between users.
Brian
On Sep 27, 2011, at 9:24 AM, Vivek K wrote:
Hi Brian
Thanks for a prompt response.
The machines on cluster didn't have libhdfs.so.0 file. So I copied my
libhdfs.so (that came with cloudera vm - libhdfs0 and libhdfs0-dev) on the
cluster machine. So it should be 32-bit.
The wrong ELF
be difficult to achieve.
Brian
On Sep 27, 2011, at 9:33 AM, Vivek K wrote:
Thanks Brian.
A quick question: can we have both 32-bit and 64-bit JVMs on the cluster
machines?
Vivek
--
On Tue, Sep 27, 2011 at 10:28 AM, Brian Bockelman bbock...@cse.unl.edu wrote:
On Sep 27, 2011, at 9:24
workflows, it doesn't measure up against specialized systems.
You really want to make sure that Hadoop is the best tool for your job.
Brian
definitely don't want to shrug off data loss / downtime.
However, there are many people who simply don't need this.
If I'm told that I can buy a 10% larger cluster by accepting up to 15 minutes
of data loss, I'd do it in a heartbeat where I work.
Brian
On Sep 17, 2011, at 6:38 PM, Tom Deutsch wrote:
:) I think we can agree to that point. Hopefully a plethora of viewpoints is
good for the community!
(And when we run into something that needs higher availability, I'll drop by
and say hi!)
On Sep 17, 2011, at 8:32 PM, Tom Deutsch wrote:
Not trying to give you a hard time, Brian - we just
Hi Kuro,
A 100MB file should take 1 second to read; typically, MR jobs get scheduled on
the order of seconds. So, it's unlikely you'll see any benefit.
You'll probably want to have a look at Amdahl's law:
http://en.wikipedia.org/wiki/Amdahl%27s_law
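To make that concrete (illustrative numbers, mine): if scheduling costs a fixed
~5 seconds and the read itself takes ~1 second, the parallelizable fraction is
p = 1/6, so the best possible speedup is 1/(1-p) = 1.2x, no matter how many
mappers you throw at it.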
Brian
On Aug 31, 2011, at 3:48 AM
Hi Elena,
FUSE-DFS is extremely picky about hostnames. All of the following should have
the exact same string:
- Output of hostname on the namenode.
- fs.default.name
- Primary reverse-DNS of the namenode's IP.
localhost is almost certainly not what you want.
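For example, all three of these should agree (host names and addresses below are
purely illustrative):
[bbockelm@brian-test ~]$ hostname
namenode.example.com
[bbockelm@brian-test ~]$ grep -A1 fs.default.name $HADOOP_HOME/conf/core-site.xml
<name>fs.default.name</name>
<value>hdfs://namenode.example.com:8020</value>
[bbockelm@brian-test ~]$ host 192.0.2.10
10.2.0.192.in-addr.arpa domain name pointer namenode.example.com.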
Brian
On Jun 7, 2011, at 9:47
doing it. In all likelihood, if
you're going to be working with a piece of software (Hadoop-based or not!),
you'll re-install it a few times.
The install of HDFS should take roughly the same amount of time on 2, 20, or
200 nodes.
Brian
On Jun 3, 2011, at 6:47 AM, Andrew Purtell wrote:
CFQ scheduler is
inappropriate for batch workloads.
Finally, if you don't have enough host-level monitoring to indicate the current
bottleneck (CPU, memory, network, or I/O?), you likely won't ever be able to
solve this riddle.
Brian
multiple processes, but it beats any
api-overcomplications, imho.
Simple doesn't imply scalable, unfortunately.
Brian
Dieter
On Wed, 18 May 2011 11:39:36 -0500
Patrick Angeles patr...@cloudera.com wrote:
kinda clunky but you could do this via shell:
for FILE in $LIST_OF_FILES ; do
Hi Rita,
An open file in HDFS doesn't take up any resources in the NN, so there is no
corresponding close operation.
Probably you want to increase the logging in the datanodes, which will print
out activity per client.
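For example, you can bump the datanode log level on the fly (50075 is the stock
DN HTTP port; adjust host and port to your cluster):
hadoop daemonlog -setlevel datanode1:50075 org.apache.hadoop.hdfs.server.datanode.DataNode DEBUG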
Brian
On May 3, 2011, at 6:58 AM, Rita wrote:
I am trying to acquire
Check the overcommit VM settings on your kernel. These prevent swap from being
used on older JVMs, and cause out-of-memory errors to be given by Java even
when there is free memory.
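Roughly, something like this (these are the standard Linux sysctls; the values
shown are only examples):
sysctl vm.overcommit_memory vm.overcommit_ratio
# overcommit_memory=2 means strict accounting; large JVM reservations can fail
# overcommit_memory=0 (heuristic) usually unblocks the JVM
sysctl -w vm.overcommit_memory=0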
Brian
On May 2, 2011, at 11:51 AM, Steve Loughran wrote:
On 29/04/2011 03:37, stanley@emc.com wrote:
Hi
Hi Adarsh,
It appears you don't have the JVM libraries in your LD_LIBRARY_PATH. Try this:
export
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64:$JAVA_HOME/jre/lib/amd64/server
Brian
On Apr 27, 2011, at 11:31 PM, Adarsh Sharma wrote:
Dear all,
Today I am trying to run a simple
Much quicker, but less safe: data might become inaccessible between boots if
you simultaneously lose another node. Probably not an issue at 3 replicas, but
definitely an issue at 2.
Brian
On Apr 25, 2011, at 7:58 PM, James Seigel wrote:
Quicker:
Shut off power
Throw hard drive out put
Hi Chris,
One thing we've found helping in ext3 is examining your I/O scheduler. Make
sure it's set to deadline, not CFQ. This will help prevent nodes from
being overloaded; when du -sk is performed and the node is already
overloaded, things quickly roll downhill.
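Something along these lines (sda is just an example device):
cat /sys/block/sda/queue/scheduler
noop anticipatory [cfq] deadline
echo deadline > /sys/block/sda/queue/scheduler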
Brian
On Mar 29, 2011
In .20 and later, user and group information is taken from the NN's OS.
There is no useradd or groupadd.
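So account and group management happens with the ordinary OS tools on the NN
host; for instance (names made up):
groupadd analysts            # on the NameNode's OS, not an HDFS command
useradd -g analysts springring
hadoop fs -chgrp analysts /user/springring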
Brian
On Mar 23, 2011, at 1:19 AM, springring wrote:
Hi,
There are chmod, chown, and chgrp in HDFS;
is there some command like useradd -g to add a
user to a group? Even more
Hi W.P.,
Hadoop does apply permissions taken from the shell. So, if the directory is
owned by user brian and user ted does a rmr /user/brian, then you get a
permission denied error.
By default, this is not safeguarded against malicious users. A malicious user
will do whatever they want
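A hypothetical transcript of the example above (exact message text varies by
version):
[ted@node ~]$ hadoop fs -rmr /user/brian
rmr: org.apache.hadoop.security.AccessControlException: Permission denied:
user=ted, access=WRITE, inode="/user/brian":brian:supergroup:rwxr-xr-x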
Hi,
Sounds like an issue with your Hadoop runtime environment.
What does ldd /path/to/libhdfs.so say? What happens if you try one of the
libhdfs test applications?
Brian
On Mar 8, 2011, at 12:50 PM, yxxtdc wrote:
Hi,
I am a new user to HDFS and am trying to following the instruction
On Mar 8, 2011, at 1:04 PM, yxxtdc wrote:
Hi,
xsn95:/ # ldd /root/build_src/hadoop-0.20.2/lib/libhdfs.so.0
linux-vdso.so.1 => (0x7413c000)
libjvm.so => /root/build_src/hadoop-0.20.2/lib/libjvm.so
(0x2adbd5442000)
This is likely problematic. Why is libjvm.so
$JVM_LIB`:/usr/lib/
fi
Brian
On Mar 8, 2011, at 1:38 PM, yxxtdc wrote:
I copied the libjvm.so from
/usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so.
I added /usr/java/jdk1.6.0_24/jre/lib/amd64/server to the LD_LIBRARY_PATH
and it did not work so I made a manual copy to $HADOOP_HOME/lib
On Mar 8, 2011, at 3:41 PM, yxxtdc wrote:
Thanks Brian.
Got over that by fixing a typo in LD_LIBRARY_PATH and now fuse_dfs is
mounting albeit very very very slowly, like one inode a minute.
What do you mean by mounting? Do you mean listing?
Anything on the order of a minute is out
Try living in Nebraska... By the time the fun stuff gets here, it's COBOL.
:)
On Mar 4, 2011, at 7:59 AM, Habermaas, William wrote:
How come all the Hadoop jobs are in the Bay area? Doesn't anybody use Hadoop
in NY?
-Original Message-
From: Brady Banks
, but the point is that MR will feed you
whole records regardless of whether they are stored on one or two blocks.
Brian
On Mar 4, 2011, at 2:24 PM, Kelly Burkhart wrote:
On Fri, Mar 4, 2011 at 1:42 PM, Harsh J qwertyman...@gmail.com wrote:
HDFS does not operate with records in mind.
So does
Hi,
Check your kernel's overcommit settings. This will prevent the JVM from
allocating memory even when there's free RAM.
Brian
On Mar 4, 2011, at 3:55 PM, Ratner, Alan S (IS) wrote:
Aaron,
Thanks for the rapid responses.
* ulimit -u unlimited is in .bashrc
for.
Brian
On Mar 2, 2011, at 9:39 PM, Ted Dunning wrote:
It will be very difficult to do. If you have n machines running 4 different
things, you will probably get better results segregating tasks as much as
possible. Interactions can be very subtle and can have major impact on
performance
be silly to not consider
Hadoop. If you currently run a bag full of shell scripts and C++ code, it's a
tougher decision to make.
Brian
whatever you do for LZO and Gzip/Hadoop
has a large startup overhead?
Again, sounds like you'll be spending an hour or so with a profiler.
Brian
On Mar 2, 2011, at 2:16 PM, Niels Basjes wrote:
Question: Are you 100% sure that nothing else was running on that
system during the tests?
No cron jobs
Hi Mike,
You want to take things out of safemode before you can make these changes:
hadoop dfsadmin -safemode leave
Then you can do the fsck:
hadoop fsck / -delete
Brian
On Jan 21, 2011, at 2:12 PM, mike anderson wrote:
Also, here's the output of dfsadmin -report. What seems weird is that it's
at my gmail address and I'd be very
happy to help all I can.
kind regards and best of luck with the book!
Brian
On Fri, Jan 14, 2011 at 6:02 AM, Mark Kerzner markkerz...@gmail.com wrote:
Brian,
I read with fascination your thread on MySQL and Hadoop. I enjoyed your
polite answers to every
or not using
hadoop for this would make sense in order to parallelize the task if
it gets too slow.
Thanks again,
Brian
On 10 Jan 2011, at 13:21, Black, Michael (IS)
michael.bla...@ngc.com wrote:
I had no idea the kimono comment would be so applicable to your
problem...
Everything makes
Thanks Michael,
As you say, I'll give your suggestion a try and see how it performs.
thanks for all your help.
I really appreciate it,
Brian
On Mon, Jan 10, 2011 at 8:46 PM, Black, Michael (IS) michael.bla...@ngc.com
wrote:
You need to stop looking at this as an all-or-nothing...and look
Thanks Ted,
Good to know that hadoop can help. I'll look more into it also.
really appreciate it.
Brian
On Mon, Jan 10, 2011 at 9:51 PM, Ted Dunning tdunn...@maprtech.com wrote:
Yes. Hadoop can definitely help with this.
On Mon, Jan 10, 2011 at 12:00 PM, Brian brian.mcswee...@gmail.com
thanks Sonal,
I'll check it out
On Sun, Jan 9, 2011 at 2:57 AM, Sonal Goyal sonalgoy...@gmail.com wrote:
Hi Brian,
You can check HIHO at https://github.com/sonalgoyal/hiho which can help
you
load data from any JDBC database to the Hadoop file system. If your table
has a date or id field
that some of this functionality
is perhaps now in the main api. I suppose any experience people have is
welcome. I would want to run a batch job to export every day, perform my map
reduce, and then import the results back into mysql afterwards.
cheers,
Brian
On Sun, Jan 9, 2011 at 3:18 AM, Konstantin
of the values in the
rows have to be multiplied together, some have to be compared, some have to
have a function run against them etc.
cheers,
Brian
On Sun, Jan 9, 2011 at 8:55 AM, Ted Dunning tdunn...@maprtech.com wrote:
It is, of course, only quadratic, even if you compare all rows to all other
rows
processing way.
cheers,
Brian
On Sun, Jan 9, 2011 at 12:20 PM, Black, Michael (IS) michael.bla...@ngc.com
wrote:
What kind of compare do you have to do?
You should be able to compute a checksum or such for each row when you
insert them and only have to look at the subset that matches
and I hope I have opened up my kimono enough for you
to get a sense of what I'm talking about :)
thanks very much,
Brian
On Sun, Jan 9, 2011 at 1:51 PM, Black, Michael (IS)
michael.bla...@ngc.com wrote:
All you're doing is delaying the inevitable by going to hadoop. There's no
magic to hadoop
Hi Ted,
I agree about reducing the quadratic cost and hopefully my reply to Michael
will show what my idea has been in this regard.
I really appreciate the pointers on LSH and Mahout and I'll read up on it
and see if it helps out.
thanks very much for your help.
cheers,
Brian
On Sun, Jan 9
Thanks Jeff,
Great info and I really appreciate it.
cheers,
Brian
On Mon, Jan 10, 2011 at 12:00 AM, Jeff Hammerbacher ham...@cloudera.com wrote:
Hey Brian,
One final point about Sqoop: it's a part of Cloudera's Distribution for
Hadoop, so it's Apache 2.0 licensed and tightly integrated
Hi Arvind,
thanks very much for that. Very good to know. Sounds like Sqoop is just what
I'm looking for.
cheers,
Brian
On Sun, Jan 9, 2011 at 9:37 PM, arv...@cloudera.com arv...@cloudera.com wrote:
Hi Brian,
Sqoop supports incremental imports that can be run against a live database
system
://architects.dzone.com/articles/tools-moving-sql-database
any advice on what approach to use?
cheers,
Brian
in and figure out what's
wrong - or just keep that node dead.
Brian
On Jan 3, 2011, at 10:40 PM, Allen Wittenauer wrote:
On Jan 3, 2011, at 2:22 AM, Otis Gospodnetic wrote:
I see over on http://search-hadoop.com/?q=monit+daemontools that people *do*
use
tools like monit and daemontools
are
using.
Brian
On Dec 22, 2010, at 2:40 AM, Zhenhua Guo wrote:
I know there is a configuration parameter that can be used to specify
number of replicas.
I wonder whether I can specify different values for some files in my
program by using HDFS APIs.
Thanks
Gerald
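(For reference: per-file replication is exposed both in the shell and in the
Java API via FileSystem.setReplication(Path, short); the paths below are
examples.)
hadoop fs -setrep 5 /user/gerald/important.dat
hadoop fs -setrep -w 2 /user/gerald/scratch.dat   # -w waits for completion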
, and I typically remount
things after a month or two of *heavy* usage.
Across all the nodes in our cluster, we probably do a few billion HDFS
operations per day over FUSE.
Brian
On Dec 2, 2010, at 8:52 AM, Mark Kerzner wrote:
Thank you, Brian.
I found your paper "Using Hadoop as grid storage", and it was very useful.
One thing I did not understand in it is your file usage pattern - do you
deal with small or large files, and do you delete them often enough? My
On Dec 2, 2010, at 9:22 AM, Mark Kerzner wrote:
Brian,
that almost answers my question. Still, are you saying that the problem of
"Hadoop hates small files" does not exist?
Well, I'd say "hates" is too strong a word. Several of the costs (NN
memory, latency, efficiency) in HDFS
To be clear,
You only need to use SSH if you don't have any other way to start processes on
your worker nodes. Lots of larger production sites have ways to manage this
without SSH, but this really gets down to whatever the site prefers (and their
security team allows).
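For example, the stock helper scripts just wrap per-node commands that any init
system can run directly, no SSH involved:
hadoop-daemon.sh start namenode     # on the master
hadoop-daemon.sh start datanode     # on each worker, e.g. from an init script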
Brian
On Nov 16, 2010
GangliaContext31.
Brian
On Nov 8, 2010, at 8:34 AM, Shuja Rehman wrote:
Hi
I have cluster of 4 machines and want to configure ganglia for monitoring
purpose. I have read the wiki and add the following lines to
hadoop-metrics.properties on each machine.
dfs.class
is very sensitive to latency, so HDFS is likely
not ideal for your application. However, don't take my word for it; feel free
to explore for yourself.
Brian
On Nov 8, 2010, at 6:29 AM, ranga_balim...@dell.com wrote:
Hi,
I've MPI-BLAST application to run on HDFS and evaluate Parallel I/O. Can
are the incorrect solution for separating the services if
performance is the issue (they may be the solution if the issue is migration,
future growth, complicated deployment, or, to some extent, security). You're just
adding another layer that obfuscates what's happening on the hardware.
Brian
On Nov 3, 2010
.
However, let's say your cluster is corrupting data at a network level at a
large scale. Then, why would you see it only with the balancer running?
It's hard to see this as a plausible scenario, but, on the other hand,
something happened. It's possible it's just an outright coincidence.
Brian
On Oct
are limited by the latency of spinning disks and random reads, we aren't
particularly hurt by going only 60MB/s on our nodes. If we wanted to go
faster, we would use the native clients.
Of course, if anyone wants to donate a lowly university 1.5PB of SSDs, I'm all
ears :)
Brian
On Oct 26, 2010, at 12
of the NN classes?
Brian
On Oct 25, 2010, at 6:12 PM, phil young wrote:
Wow. I could use help quickly...
My name node is reporting a null BV. All the data nodes report the same
Build Version.
We were not upgrading the DFS, but did stop, restart, after adding a jar to
$HADOOP_HOME/lib.
So, we
Hi Gautam,
Yup - that's one possible way to configure Ganglia and is common at many sites.
That's why I usually recommend the telnet trick to determine what IP address
your configuration is using.
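The trick, for reference: gmond dumps its current XML state to anyone who
connects, so you can see exactly which hosts/IPs are reporting (8649 is gmond's
default port):
telnet ganglia-host 8649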
Brian
On Aug 25, 2010, at 5:53 AM, Gautam wrote:
Brian,
Works for me now... one should
configuration,
it is set up to listen on UDP and write on TCP of the same port.
A third thing to test is to switch the hadoop-metrics back to the file output,
and make sure something gets written to the log file. The issue might be
upstream.
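For reference, a minimal file-context stanza looks something like this (the
path is an example):
dfs.class=org.apache.hadoop.metrics.file.FileContext
dfs.period=10
dfs.fileName=/tmp/dfsmetrics.log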
Brian
This is what most of my hadoop-metrics looks
for minutes while it spent an
increasing amount of time in GC routines.
Brian
On Aug 16, 2010, at 4:49 AM, Steve Loughran wrote:
On 13/08/10 22:24, Allen Wittenauer wrote:
On Aug 13, 2010, at 11:41 AM, Jinsong Hu wrote:
and run the namenode with the following jvm config
-Xmx1000m -XX
.
This is useful in sites like ours where we have 24/7 usage and try to avoid any
unnecessary downtime.
Brian
On Aug 10, 2010, at 8:42 AM, Allen Wittenauer wrote:
On Aug 10, 2010, at 3:51 AM, Erik Forsberg wrote:
Hi!
Due to network reconfigurations, I need to change the hostnames of some
of my
by your operating
system or some other service management tool accepted by your organization (for
example, SmartFrog from HP Labs goes above and beyond Linux's somewhat
antiquated system). This statement does not change if X=Hadoop.
Brian
On Aug 10, 2010, at 1:13 PM, Gokulakannan M wrote:
Hi
sites that roughly follow the same rules.
We haven't discovered any fatal software bugs that cause data loss since the
various ones in 0.19 were ironed out.
Brian
On Jul 21, 2010, at 8:29 PM, Bobby Dennett wrote:
The team that manages our Hadoop clusters is currently being pressured
? And is HBase
a suitable storage mechanism for this type of data? I know it's a
total newbie question so any help is greatly appreciated.
Cheers,
Brian
Sent from my iPhone
On Jul 7, 2010, at 2:56 AM, Christian Baun wrote:
Hi Brian,
I wanted to test HDFS against several distributed filesystems.
Do you know any popular performance benchmarks that run with HDFS?
I can't think of anything off the top of my head. Any ideas out there on the
list?
The issue
of hard drives / network file systems / cluster file systems, you
will find it doesn't capture well the performance aspects of a distributed file
system.
In other words, you are performance testing an apple with a test suite designed
for oranges.
Brian
On Jul 6, 2010, at 1:14 PM, Christian
outsmarts you (and I don't know about you, but it often outsmarts
me...).
Brian
On Jun 14, 2010, at 9:35 AM, Owen O'Malley wrote:
Indeed. On the terasort benchmark, I had to run intermediate jobs that
were larger than ram on the cluster to ensure that the data was not
coming from the file cache
Uh...
So you want a batch system? Look up PBS (Torque/Maui), SGE, or Condor.
Brian
On May 29, 2010, at 8:17 PM, Michael Robinson wrote:
Thanks for your answers.
I have read hadoop streaming and I think it is great, however what I am
trying to do is to run a C program that I have
Hey Pierre,
These are not traditional filesystem blocks - if you save a file smaller than
64MB, you don't lose 64MB of file space.
Hadoop will use 32KB to store a 32KB file (ok, plus a KB of metadata or so),
not 64MB.
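You can check with du (paths and numbers are illustrative):
hadoop fs -put small-32k.txt /tmp/small-32k.txt
hadoop fs -du /tmp/small-32k.txt
Found 1 items
32768       hdfs://namenode:8020/tmp/small-32k.txt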
Brian
On May 18, 2010, at 7:06 AM, Pierre ANCELOT wrote:
Hi,
I'm
dfsadmin -report
to determine precisely which IP address your datanode is listening on.
Brian
On May 17, 2010, at 11:32 PM, Scott White wrote:
I followed the steps mentioned here:
http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to
decommission a data node. What I see from
reading the HDFS design document for background issues like this:
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_design.html
Brian
On Tue, May 18, 2010 at 2:34 PM, Brian Bockelman bbock...@cse.unl.edu wrote:
Hey Pierre,
These are not traditional filesystem blocks - if you save a file
heavily upon your
implementation and hardware. Our HDFS routinely serves 5-10 Gbps.
Brian
On May 18, 2010, at 10:29 AM, Nyamul Hassan wrote:
This is a very interesting thread to us, as we are thinking about deploying
HDFS as a massive online storage for a on online university, and then
serving
.
Brian
On May 18, 2010, at 12:02 PM, Scott White wrote:
dfsadmin -report reports the hostname for that machine and not the IP. That
machine happens to be the master node which is why I am trying to
decommission the data node there since I only want the data node running on
the slave nodes. Dfs
On May 17, 2010, at 5:25 AM, Steve Loughran wrote:
Brian Bockelman wrote:
On May 14, 2010, at 8:27 PM, Todd Lipcon wrote:
Hey Brian,
Yep, excessive GC definitely sounds like a likely culprit. I'm surprised you
didn't see OOMEs in the log, though.
We didn't until the third restart today
restarted this cluster a few hours ago and made the following changes:
1) Increased the number of datanode handlers from 10 to 40.
2) Increased ipc.server.listen.queue.size from 128 to 256.
If nothing else, I figure a deadlocked NN might be interesting to devs...
Brian
2010-05-14 17:11:30
Full thread
the built-in utilities is that it will give you better terminal
feedback.
Alternately, I find myself mounting things in debug mode to see the Hadoop
issues printed out to the terminal.
Brian
On Apr 23, 2010, at 8:30 AM, Christian Baun wrote:
Brian,
You got it!!! :-)
It works (partly)!
i
Hey Christian,
I've run into this before.
Make sure that the hostname/port you give to fuse is EXACTLY the same as listed
in hadoop-site.xml.
If these aren't the same text string (including the :8020), then you get
those sorts of issues.
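That is, if fs.default.name is hdfs://namenode.example.com:8020 (an
illustrative value), mount with exactly that string:
fuse_dfs_wrapper.sh dfs://namenode.example.com:8020 /mnt/hdfs -d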
Brian
On Apr 22, 2010, at 5:00 AM, Christian Baun
an iptables firewall on it.
Try to see if you can open the port manually (telnet server name-node-A 4600)
from the client node and namenode to see if there's any difference. This will
allow you to distinguish between two possible error cases.
Brian
everything. No idea if
this would work.
Brian
On Apr 6, 2010, at 9:51 AM, Patrick Donnelly wrote:
Hi,
I have a distributed file server front end to Hadoop that uses the
libhdfs C API to talk to Hadoop. Normally the file server will fork on
a new client connection but this does not work
what hardware or user we
throw at it. Our scientists love it. However, there's a damn good reason that
transactions were invented, especially for accounting/billing matters...
Brian
On Mar 23, 2010, at 11:30 AM, Allen Wittenauer wrote:
On 3/23/10 4:04 AM, Marcos Medrado Rubinelli marc
Alex Kozlov wrote:
Hi Brian,
Is your namenode running? Try 'hadoop fs -ls /'.
Alex
On Mar 12, 2010, at 5:20 PM, Brian Wolf brw...@gmail.com wrote:
Hi Alex,
I am back on this problem. Seems it works, but I have this issue
with connecting to server.
I can connect 'ssh localhost' ok
Hi Alex,
seems to:
$ bin/hadoop fs -ls /
Found 1 items
drwxr-xr-x - brian supergroup 0 2010-03-13 10:45 /tmp
However, I think this might be the source of the problems; whenever I
invoke any of the scripts, I always get these issues:
localhost: /usr/bin/bash: /usr/local
Hi Alex,
I am back on this problem. Seems it works, but I have this issue with
connecting to server.
I can connect 'ssh localhost' ok.
Thanks
Brian
$ bin/hadoop jar hadoop-*-examples.jar pi 2 2
Number of Maps = 2
Samples per Map = 2
10/03/12 17:16:17 INFO ipc.Client: Retrying connect
was
selected because it is common in our field, and we already have the certificate
infrastructure well set up.
GridFTP is fast too - many Gbps is not too hard.
YMMV
Brian
On Mar 2, 2010, at 1:30 AM, jiang licht wrote:
I am considering a basic task of loading data to hadoop cluster
for
that matter?)
Thanks,
Brian
as a
common protocol, (b) we have a long history with using GridFTP, and (c) we need
to transfer many TB on a daily basis.
Brian
On Mar 2, 2010, at 12:10 PM, jiang licht wrote:
Hi Brian,
Thanks a lot for sharing your experience. Here I have some questions to
bother you for more help :)
So
On Mar 2, 2010, at 3:51 PM, jiang licht wrote:
Thanks, Brian.
There is no certificate/grid infrastructure as of now yet for us. But I guess
I can still use gridftp by noticing the following from its FAQ page: GridFTP
can be run in a mode using standard
SSH security credentials. It can
Since I'm more or less in the same boat, this is the best I've seen, and the
2009
book is also very good:
http://developer.yahoo.com/hadoop/
Brian
On Thu, Feb 18, 2010 at 12:26 PM, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi,
The hadoop meet last year has some very interesting business
Hey Abhishek,
Why would you want to fully invert a matrix that large?
How is it preconditioned? What is the condition number of the matrix?
Why not just use ScaLAPACK? It's a hairy beast, but you should definitely
consider it.
Brian
On Feb 3, 2010, at 9:57 PM, aa...@buffalo.edu wrote:
Hi
Alex Kozlov wrote:
Live Nodes http://localhost:50070/dfshealth.jsp#LiveNodes : 0
Your datanode is dead. Look at the logs in the $HADOOP_HOME/logs directory
(or where your logs are) and check the errors.
Alex K
On Mon, Feb 1, 2010 at 1:59 PM, Brian Wolf brw...@gmail.com wrote
Now there's a deal! Thanks
Sirota, Peter wrote:
Hi Brian,
AWS has Elastic MapReduce service where you can run Hadoop starting at
10 cents per hour. Check it out at
http://aws.amazon.com/ank
Disclaimer: I work at AWS
Sent from my phone
On Feb 2, 2010, at 11:09 PM, Brian Wolf brw
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit:
ugi=brian,None,Administrators,Users ip=/127.0.0.1 cmd=create
src=/cygwin/tmp/hadoop-SYSTEM/mapred/system/job_201002011323_0001/job.jar
dst=null perm=brian:supergroup:rw-r--r--
2010-02-01 13:26:30,045 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 3
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2010-01-30 00:03:34,603 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
fsOwner=brian,None,Administrators,Users
2010-01-30 00:03
this interesting:
Karmasphere Studio for Hadoop. http://www.hadoopstudio.org/
although I haven't fully tested it myself
Brian