Re: Hbase with Hadoop

2011-10-11 Thread Matt Foley
Hi Jignesh,
0.20.204.0 does not have hflush/sync support, but 0.20.205.0 does.
Without HDFS hsync, HBase will still work, but is subject to data loss if
the datanode is restarted.  In 205, this deficiency is fixed.

0.20.205.0-rc2 is up for vote in common-dev@.  Please try it out with HBase
:-)
We've been using it with HBase 0.90.3 and it works, with some config
adjustments.

--Matt, RM for 0.20.205.0


On Tue, Oct 11, 2011 at 2:11 PM, Jignesh Patel  wrote:

> Can I integrate HBase 0.90.4 with hadoop 0.20.204.0 ?
>
> -Jignesh
>


Re: Hbase with Hadoop

2011-10-12 Thread Matt Foley
Hi Jignesh,
Not clear what's going on with your ZK, but as a starting point, the
hsync/flush feature in 205 was implemented with an on-off switch.  Make sure
you've turned it on by setting *dfs.support.append* to true in the
hdfs-site.xml config file.
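
For reference, here is a minimal sketch of that entry in hdfs-site.xml (the
property name is from this thread; the surrounding layout is just the standard
Hadoop config boilerplate, and the daemons need a restart to pick it up):

<property>
  <!-- enable the hsync/hflush (append) code path in 0.20.205 -->
  <name>dfs.support.append</name>
  <value>true</value>
</property>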

Also, are you installing Hadoop with security turned on or off?

I'll gather some other config info that should help.
--Matt


On Wed, Oct 12, 2011 at 1:47 PM, Jignesh Patel  wrote:

> When I tried to run HBase 0.90.4 with hadoop-0.20.205.0 I got the following
> error:
>
> Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase shell
> HBase Shell; enter 'help' for list of supported commands.
> Type "exit" to leave the HBase Shell
> Version 0.90.4, r1150278, Sun Jul 24 15:53:29 PDT 2011
>
> hbase(main):001:0> status
>
> ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able
> to connect to ZooKeeper but the connection closes immediately. This could be
> a sign that the server has too many connections (30 is the default).
> Consider inspecting your ZK server logs for that error and then make sure
> you are reusing HBaseConfiguration as often as you can. See HTable's javadoc
> for more information.
>
>
> And when I tried to stop HBase I continuously see dots being printed and no
> sign of it stopping. Not sure why it won't just simply stop.
>
> stopping hbase........
>
>
> On Oct 12, 2011, at 3:19 PM, Jignesh Patel wrote:
>
> > The new plugin works after deleting eclipse and reinstalling it.
> > On Oct 12, 2011, at 2:39 PM, Jignesh Patel wrote:
> >
> >> I have installed Hadoop-0.20.205.0 but when I replace the hadoop
> >> 0.20.204.0 eclipse plugin with the 0.20.205.0, eclipse is not recognizing
> >> it.
> >>
> >> -Jignesh
> >> On Oct 12, 2011, at 12:31 PM, Vinod Gupta Tankala wrote:
> >>
> >>> its free and open source too.. basically, their releases are ahead of public
> >>> releases of hadoop/hbase - from what i understand, major bug fixes and
> >>> enhancements are checked in to their branch first and then eventually make
> >>> it to public release branches.
> >>>
> >>> thanks
> >>>
> >>> On Wed, Oct 12, 2011 at 9:26 AM, Jignesh Patel  wrote:
> >>>
>  Sorry to hear that.
>  Is CDH3 open source or a paid version?
> 
>  -jignesh
>  On Oct 12, 2011, at 11:58 AM, Vinod Gupta Tankala wrote:
> 
> > for what its worth, i was in a similar situation/dilemma few days ago and
> > got frustrated figuring out what version combination of hadoop/hbase to use
> > and how to build hadoop manually to be compatible with hbase. the build
> > process didn't work for me either.
> > eventually, i ended up using cloudera distribution and i think it saved me a
> > lot of headache and time.
> >
> > thanks
> >
> > On Tue, Oct 11, 2011 at 8:29 PM, jigneshmpatel <jigneshmpa...@gmail.com> wrote:
> >
> >> Matt,
> >> Thanks a lot. Just wanted to have some more information. If hadoop
> >> 0.20.205.0 is voted in by the community members, will it then become a
> >> major release? And what if it is not approved by the community members?
> >>
> >> And as you said, I would like to use 0.90.3 if it works. If it is ok, can
> >> you share the details of those configuration changes?
> >>
> >> -Jignesh
> >>
>


Re: Hbase with Hadoop

2011-10-13 Thread Matt Foley
Hi Jignesh,
the option is --config (with a double dash) not -config (with a single
dash).  Please let me know if that works.
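
For example, with the conf directory shown in your output, the invocation would
look like this (a sketch; point --config at whichever directory holds your
hbase-site.xml):

  bin/hbase --config ./config shell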

--Matt


On Thu, Oct 13, 2011 at 8:30 AM, jigneshmpatel wrote:

> There is no command like -config; see below:
>
> Jignesh-MacBookPro:hadoop-hbase hadoop-user$ bin/hbase -config ./config
> shell
> Unrecognized option: -config
> Could not create the Java virtual machine.
>
>


[ANNOUNCEMENT] Hadoop 0.20.205.0 release

2011-10-18 Thread Matt Foley
On Friday 14 Oct, the Hadoop community voted ten to zero (including four PMC
members voting in favor) to accept the release of Hadoop 0.20.205.0.

The biggest feature of this release is that it merges the
append/hsync/hflush features of branch-0.20-append, and security features of
branch-0.20-security.  Therefore, this is the first official Apache Hadoop
release that supports HBase in secure mode!

Thanks to everyone who contributed bug fixes, merges, and improvements.  It
was truly a community effort.

Best regards,
--Matt (Release Manager)


Re: Why is hadoop build I generated from a release branch different from release build?

2012-03-08 Thread Matt Foley
Hi Pawan,
The complete way releases are built (for v0.20/v1.0) is documented at
http://wiki.apache.org/hadoop/HowToRelease#Building
However, that does a bunch of stuff you don't need, like generating the
documentation and doing a ton of cross-checks.

The full set of ant build targets is defined in build.xml at the top level
of the source code tree.
"binary" may be the target you want.

--Matt

On Thu, Mar 8, 2012 at 3:35 PM, Pawan Agarwal wrote:

> Hi,
>
> I am trying to generate hadoop binaries from source and execute hadoop from
> the build I generate. I am able to build; however, the *bin* folder which
> comes with the hadoop installation is not generated as part of my build. Can
> someone tell me how to do a build so that I can generate a build equivalent
> to the hadoop release build, one that can be used directly to run hadoop?
>
> Here's the details.
> Desktop: Ubuntu Server 11.10
> Hadoop version for installation: 0.20.203.0  (link:
> http://mirrors.gigenet.com/apache//hadoop/common/hadoop-0.20.203.0/)
> Hadoop Branch used build: branch-0.20-security-203
> Build Command used: "Ant maven-install"
>
> Here's the directory structures from build I generated vs hadoop official
> release build.
>
> *Hadoop directory which I generated:*
> pawan@ubuntu01:/hadoop0.20.203.0/hadoop-common/build$ ls -1
> ant
> c++
> classes
> contrib
> examples
> hadoop-0.20-security-203-pawan
> hadoop-ant-0.20-security-203-pawan.jar
> hadoop-core-0.20-security-203-pawan.jar
> hadoop-examples-0.20-security-203-pawan.jar
> hadoop-test-0.20-security-203-pawan.jar
> hadoop-tools-0.20-security-203-pawan.jar
> ivy
> jsvc
> src
> test
> tools
> webapps
>
> *Official Hadoop build installation*
> pawan@ubuntu01:/hadoop0.20.203.0/hadoop-common/build$ ls /hadoop -1
> bin
> build.xml
> c++
> CHANGES.txt
> conf
> contrib
> docs
> hadoop-ant-0.20.203.0.jar
> hadoop-core-0.20.203.0.jar
> hadoop-examples-0.20.203.0.jar
> hadoop-test-0.20.203.0.jar
> hadoop-tools-0.20.203.0.jar
> input
> ivy
> ivy.xml
> lib
> librecordio
> LICENSE.txt
> logs
> NOTICE.txt
> README.txt
> src
> webapps
>
>
>
> Any pointers for help are greatly appreciated.
>
> Also, if there are any other resources for understanding hadoop build
> system, pointers to that would be also helpful.
>
> Thanks
> Pawan
>


Re: Multiple cores vs multiple nodes

2012-07-02 Thread Matt Foley
This is actually a very complex question.  Without trying to answer
completely, the high points, as I see it, are:
a) [Most important] Different kinds of nodes require different Hadoop
configurations.  In particular, the number of simultaneous tasks per node
should presumably be set higher for a many-core node than for a few-core
node.
b) More nodes (potentially) give you more disk controllers, and more memory
bus bandwidth shared by the disk controllers and RAM and CPUs.
c) More nodes give you (potentially, in a flat network fabric) more network
bandwidth between cores.
d) You can't always assume the cores are equivalent.

Details:

a) If all other issues were indeed equal, you'd configure
"mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" four times larger on an 8-core
system than a 2-core system.  In the real world, you'll want to experiment
to optimize the settings for the actual hardware and actual job streams
you're running.
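
As a concrete illustration of (a), here is a sketch of the relevant
mapred-site.xml entries for an 8-core slave (the property names are the ones
above; the values are only an example starting point to experiment from, not a
recommendation):

<property>
  <!-- simultaneous map tasks per TaskTracker -->
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
</property>
<property>
  <!-- simultaneous reduce tasks per TaskTracker -->
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>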

b) If you're running modern server hardware, you've got DMA disk
controllers and a multi-GByte/sec memory bus, as well as bus controllers
that do a great job of multiplexing all the demands that share the FSB.
 However, as the disk count goes up and the budget goes down, you need to
look at whether you're going to saturate either the controller(s) or the
bus, given the local i/o access patterns of your particular workload.

c) Similarly, given the NIC cards in your servers and your rack/switch
topology, you need to ask whether your network i/o access patterns,
especially during shuffle/sort, will risk saturating your network bandwidth.

d) Make sure that the particular CPUs you are comparing actually have
comparable cores, because there's a world of difference between the
different cores included in dozens of different CPUs available!

Hope this helps.  Cheers,
--Matt

On Sun, Jul 1, 2012 at 4:13 AM, Safdar Kureishy wrote:

> Hi,
>
> I have a reasonably simple question that I thought I'd post to this list
> because I don't have enough experience with hardware to figure this out
> myself.
>
> Let's assume that I have 2 separate cluster setups for slave nodes. The
> master node is a separate machine *outside* these clusters:
> *Setup A*: 28 nodes, each with a 2-core CPU, 8 GB RAM and 1 SATA drive (1
> TB)
> *Setup B*: 7 nodes, each with an 8-core CPU, 32 GB RAM and 4 SATA drives (1
> TB each)
>
> Note that I have maintained the same *core:memory:spindle* ratio above. In
> essence, setup B has the same overall processing + memory + spindle
> capacity, but achieved with 4 times fewer nodes.
>
> Ignoring the *cost* of each node above, and assuming 10Gb Ethernet
> connectivity and the same speed-per-core across nodes in both the scenarios
> above, are Setup A and Setup B equivalent to each other in the context of
> setting up a Hadoop cluster? Or will the relative performance be different?
> Excluding the network connectivity between the nodes, what would be some
> other criteria that might give one setup an edge over the other, for
> regular Hadoop jobs?
>
> Also, assuming the same type of Hadoop jobs on both clusters, how different
> would the load experienced by the master node be for each setup above?
>
> Thanks in advance,
> Safdar
>


Re: Which hadoop version should I install in a production environment

2012-07-05 Thread Matt Foley
0.20.2 is really old code and not nearly as stable as Hadoop 1.0.  One
alternative, that would get you immediate access to stable 1.0 code and
which supports yum/rpm packaging, is HDP-1.0.0.12, available free from
Hortonworks.  See http://hortonworks.com/download/

Disclosure:  I'm a Hortonworks employee.
--Matt

On Thu, Jul 5, 2012 at 8:04 AM, Pablo Musa  wrote:

> Thank you all for your answers.
> I think I will keep 0.20.2 and wait for a new major release of cdh while I
> keep an eye on Ambari.
>
> Abs,
> Pablo
>
> -Original Message-
> From: Owen O'Malley [mailto:omal...@apache.org]
> Sent: Tuesday, July 3, 2012 20:19
> To: common-user@hadoop.apache.org
> Subject: Re: Which hadoop version should I install in a production
> environment
>
> On Tue, Jul 3, 2012 at 1:19 PM, Pablo Musa  wrote:
>
> > Which is the latest stable hadoop version to install in a production
> > environment with package manager support?
> >
>
>  The current stable version of Hadoop is 1.0.3. It is available as both
> source and rpms from here:
>
> http://hadoop.apache.org/common/releases.html#Download
>
>
> Which is the latest stable hadoop version to install in a production
> > environment with package manager support?
>
>
> Apache Ambari is a new project that can be used to install the Hadoop
> ecosystem including Hadoop 1.0.3 and HBase 0.92.1 using rpms and providing
> a web-based UI to control and monitor your cluster. They are in the process
> of making their first release and we would love to discuss it with you on
> ambari-u...@apache.org.
>
> -- Owen
>


[ANNOUNCE] Hadoop version 1.2.1 (stable) released

2013-08-04 Thread Matt Foley
I'm happy to announce that Hadoop version 1.2.1 has passed its release vote
and is now available.  It has 18 bug fixes and patches over the previous
1.2.0 release; please see Hadoop 1.2.1 Release
Notes <http://hadoop.apache.org/docs/r1.2.1/releasenotes.html> for
details.  This release of Hadoop-1.2 is now considered
*stable*.

The next set of release work will focus on producing Hadoop-1.3.0, which
will include Windows native compatibility.

Thank you,
--Matt
release manager, hadoop-1


Re: [ANNOUNCE] Hadoop version 1.2.1 (stable) released

2013-08-04 Thread Matt Foley
>> which will include Windows native compatibility.

My apologies, this was incorrect.  Windows has only been integrated to
trunk and branch-2.1.

Thanks,
--Matt


On Sun, Aug 4, 2013 at 2:08 PM, Matt Foley  wrote:

> I'm happy to announce that Hadoop version 1.2.1 has passed its release
> vote and is now available.  It has 18 bug fixes and patches over the
> previous 1.2.0 release; please see Hadoop 1.2.1 Release
> Notes <http://hadoop.apache.org/docs/r1.2.1/releasenotes.html> for details.
> This release of Hadoop-1.2 is now considered
> *stable*.
>
> The next set of release work will focus on producing Hadoop-1.3.0, which
> will include Windows native compatibility.
>
> Thank you,
> --Matt
> release manager, hadoop-1
>


Re: [ANNOUNCE] Hadoop version 1.2.1 (stable) released

2013-08-05 Thread Matt Foley
It's still available in archive at
http://archive.apache.org/dist/hadoop/core/.  I can put it back on the main
download site if desired, but the model is that the main download site is
for stuff we actively want people to download.  Here is the relevant quote
from http://wiki.apache.org/hadoop/HowToRelease , under the "Publishing"
section:

3. The release directory usually contains just two releases, the most
recent from two branches, with a link named 'stable' to the most recent
recommended version.

Quite a few more than two versions have accumulated, because more than two
branches are active.  But within the Hadoop-1 set I've tried to keep it
minimal.  Since 1.2.1 was accepted as a "stable" release, I saw no reason
to keep 1.1.2, since we'd rather people downloaded 1.2.1.  When 1.3.0 is
produced, since it will be "beta" quality, 1.2.1 will of course remain as
the "stable" version.

--Matt



On Mon, Aug 5, 2013 at 9:43 AM, Chris K Wensel  wrote:

> any particular reason the 1.1.2 releases were pulled from the mirrors (so
> quickly)?
>
> On Aug 4, 2013, at 2:08 PM, Matt Foley  wrote:
>
> > I'm happy to announce that Hadoop version 1.2.1 has passed its release
> vote
> > and is now available.  It has 18 bug fixes and patches over the previous
> > 1.2.0 release; please see Hadoop 1.2.1 Release
> > Notes <http://hadoop.apache.org/docs/r1.2.1/releasenotes.html> for
> > details.  This release of Hadoop-1.2 is now considered
> > *stable*.
> >
> > The next set of release work will focus on producing Hadoop-1.3.0, which
> > will include Windows native compatibility.
> >
> > Thank you,
> > --Matt
> > release manager, hadoop-1
>
> --
> Chris K Wensel
> ch...@concurrentinc.com
> http://concurrentinc.com
>
>


Re: [ANNOUNCE] Hadoop version 1.2.1 (stable) released

2013-08-05 Thread Matt Foley
Chris, there is a stable link for exactly this purpose:
http://www.apache.org/dist/hadoop/core/stable/
--Matt


On Mon, Aug 5, 2013 at 11:43 AM, Chris K Wensel  wrote:

> regardless of what was written in a wiki somewhere, it is a bit aggressive
> I think.
>
> there are a fair number of automated things that link to the former stable
> releases that are now broken as they weren't given a grace period to cut
> over.
>
> not the end of the world or anything. just a bit of a disconnect to
> reference something labeled stable where the reference itself isn't. too
> bad there isn't some 'stable' link that itself is updated to the actual
> stable release.
>
> we do this now with all of the cascading artifacts. it prevents automated
> things from breaking immediately.
>
> http://files.cascading.org/sdk/2.1/latest.txt
>
> ckw
>
> On Aug 5, 2013, at 11:20 AM, Matt Foley  wrote:
>
> It's still available in archive at
> http://archive.apache.org/dist/hadoop/core/.  I can put it back on the
> main download site if desired, but the model is that the main download site
> is for stuff we actively want people to download.  Here is the relevant
> quote from http://wiki.apache.org/hadoop/HowToRelease , under the
> "Publishing" section:
>
> 3. The release directory usually contains just two releases, the most
> recent from two branches, with a link named 'stable' to the most recent
> recommended version.
>
> Quite a few more than two versions have accumulated, because more than two
> branches are active.  But within the Hadoop-1 set I've tried to keep it
> minimal.  Since 1.2.1 was accepted as a "stable" release, I saw no reason
> to keep 1.1.2, since we'd rather people downloaded 1.2.1.  When 1.3.0 is
> produced, since it will be "beta" quality, 1.2.1 will of course remain as
> the "stable" version.
>
> --Matt
>
>
>
> On Mon, Aug 5, 2013 at 9:43 AM, Chris K Wensel  wrote:
>
>> any particular reason the 1.1.2 releases were pulled from the mirrors (so
>> quickly)?
>>
>> On Aug 4, 2013, at 2:08 PM, Matt Foley  wrote:
>>
>> > I'm happy to announce that Hadoop version 1.2.1 has passed its release
>> vote
>> > and is now available.  It has 18 bug fixes and patches over the previous
>> > 1.2.0 release; please see Hadoop 1.2.1 Release
>> > Notes <http://hadoop.apache.org/docs/r1.2.1/releasenotes.html> for
>> > details.  This release of Hadoop-1.2 is now considered
>> > *stable*.
>> >
>> > The next set of release work will focus on producing Hadoop-1.3.0, which
>> > will include Windows native compatibility.
>> >
>> > Thank you,
>> > --Matt
>> > release manager, hadoop-1
>>
>> --
>> Chris K Wensel
>> ch...@concurrentinc.com
>> http://concurrentinc.com
>>
>>
>
> --
> Chris K Wensel
> ch...@concurrentinc.com
> http://concurrentinc.com
>
>


Re: Can I safely set dfs.blockreport.intervalMsec to a very large value (1 year or more?)

2011-07-08 Thread Matt Foley
Hi Moon,
The periodic block report is constructed entirely from info in memory, so
there is no complete scan of the filesystem for this purpose.  The periodic
block report defaults to only sending once per hour from each datanode, and
each DN calculates a random start time for the hourly cycle (after initial
startup block report), to spread those hourly reports somewhat evenly across
the entire hour.  It is part of Hadoop's fault tolerance that the namenode
and datanodes perform this hourly check to assure that they both have the
same understanding of what replicas are available from each node.
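
For reference, the interval is set in hdfs-site.xml; here is a sketch showing
the default hourly value, in milliseconds (I would tune this modestly and
measure, rather than jumping straight to a year):

<property>
  <name>dfs.blockreport.intervalMsec</name>
  <!-- default is one hour -->
  <value>3600000</value>
</property>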

However, depending on your answers to the questions below, you may be having
memory management and/or garbage collection problems.  We may be able to
help diagnose it if you can provide more info:

First, please confirm that you said 50,000,000 blocks per datanode (not
50,000).  This is a lot.  The data centers I'm most familiar with run with
aprx 100,000 blocks per datanode, because they need a higher ratio of
compute power to data.

Second, please confirm whether it is the datanodes, or the namenode
services, that are being non-responsive for minutes at a time.  And when you
say "often", how often are you experiencing such non-responsiveness?  What
are you experiencing when it happens?

Regarding your environment:
* How many datanodes in the cluster?
* How many volumes (physical HDD's) per datanode?
* How much RAM per datanode?
* What OS on the datanodes, and is it 32-bit or 64-bit?  What max process
size is configured?
* Is the datanode services JVM running as 32-bit or 64-bit?

Hopefully these answers will help figure out what's going on.
--Matt


On Fri, Jul 8, 2011 at 7:21 AM, Robert Evans  wrote:

> Moon Soo Lee
>
> The full block report is used in error cases.  Currently when a datanode
> heartbeats into the namenode the namenode can send back a list of tasks to
> be preformed, this is mostly for deleting blocks.  The namenode just assumes
> that all of these tasks execute successfully.  If any of them fail then the
> namenode is unaware of it.  HDFS-395 adds in an ack to address this.
>  Creation of new blocks is reported to the namenode as it happens, so this is
> not really an issue. So if you set the period to 1 year then you will likely
> have several blocks in your cluster sitting around unused but taking up
> space.  It is also likely compensating for other error conditions or even
> bugs in HDFS that I am unaware of, just because of the nature of it.
>
> --Bobby Evans
>
> On 7/7/11 9:02 PM, "moon soo Lee"  wrote:
>
> I have many blocks. Around 50~90m on each datanode.
>
> They often do not respond for 1~3 min and I think this is because of the full
> scan for the block report.
>
> So if I set dfs.blockreport.intervalMsec to a very large value (1 year or
> more?), I expect the problem to clear.
>
> But if I really do that, what happens? Any side effects?
>
>


Re: Version Mismatch

2011-08-18 Thread Matt Foley
Hi Alan,
It seems your XXX application incorporates a DFSClient, which implies it is
compiled in the presence of certain Hadoop jar files.  If it grabs those jar
files and incorporates them in the XXX installable package (tarball, rpm,
whatever), then it's easy to get this kind of mis-match.  Evidently, the
Hadoop client jars incorporated in XXX are running an older version 60 of
the RPC protocol, while your Hadoop service is running a newer version 61.

Possible solutions:

1. You could get your XXX supplier to provide a new compilation using the
newer version of Hadoop client jars.  The benefit to this approach is that
they'll test with the new jars and assure compatibility.  But when some new
version of Hadoop comes along that uses RPC version 62, you'll have the same
problem again.

2. The better solution is to configure your CLASSPATH so that jars from your
Hadoop installation are loaded before jars from the XXX installation.  The
order of entries in the CLASSPATH is very important to achieve this.
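
Here is a sketch of what #2 can look like in the environment that launches XXX
(the install path is hypothetical; the point is only that the 0.20.203.0 jars
from your Hadoop installation come first on the CLASSPATH):

  export HADOOP_HOME=/opt/hadoop-0.20.203.0
  export CLASSPATH=$HADOOP_HOME/hadoop-core-0.20.203.0.jar:$HADOOP_HOME/lib/*:$CLASSPATH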

Is XXX installed on the same servers as Hadoop? Or on a separate server
where Hadoop is not installed?  If the latter, you will need to install
Hadoop on the XXX server, even though you won't run any Hadoop services
there, so that the up-to-date jars will be available for solution #2.

Hope this helps,
--Matt


On Thu, Aug 18, 2011 at 5:21 AM, Joey Echeverria  wrote:

> It means your HDFS client jars are using a different RPC version than
> your namenode and datanodes. Are you sure that XXX has $HADOOP_HOME in
> its classpath? It really looks like it's pointing to the wrong jars.
>
> -Joey
>
> On Thu, Aug 18, 2011 at 8:14 AM, Ratner, Alan S (IS)  wrote:
> > We have a version mismatch problem which may be Hadoop related but may be
> due to a third party product we are using that requires us to run Zookeeper
> and Hadoop.  This product is rumored to soon be an Apache incubator project.
>  As I am not sure what I can reveal about this third party program prior to
> its release to Apache I will refer to it as XXX.
> >
> > We are running Hadoop 0.20.203.0.  We have no problems running Hadoop at
> all.  It runs our Hadoop programs and our Hadoop fs commands without any
> version mismatch complaints.  Localhost:50030 and 50070 both report we are
> running 0.20.203.0, r1099333.
> >
> > But when we try to initialize XXX we get
> "org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client =
> 60, server = 61)
> > org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client =
> 60, server = 61)".  The developers of XXX tell me that this error is coming
> from HDFS and is unrelated to their program.  (XXX does not include any
> Hadoop or Zookeeper jar files - as HBase does - but simply grabs these from
> HADOOP_HOME which points to our 0.20.203.0 installation and ZOOKEEPER_HOME.)
> >
> >
> > 1. What exactly does "client = 60" mean?  Which Hadoop version is this
> > referring to?
> >
> > 2. What exactly does "server = 61" mean?  Which Hadoop version is this
> > referring to?
> >
> > 3. Any ideas on whether this is a problem with my Hadoop configuration
> > or whether this is a problem with XXX?
> >
> >
> > 17 15:20:56,564 [security.Groups] INFO : Group mapping
> impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
> cacheTimeout=30
> > 17 15:20:56,704 [conf.Configuration] WARN : mapred.task.id is
> deprecated. Instead, use mapreduce.task.attempt.id
> > 17 15:20:56,771 [util.Initialize] FATAL:
> org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client =
> 60, server = 61)
> > org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol
> org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client =
> 60, server = 61)
> > at
> org.apache.hadoop.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:231)
> > at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:224)
> > at
> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:156)
> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:255)
> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:222)
> > at
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:94)
> > at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1734)
> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:74)
> > at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1768)
> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1750)
> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:234)
> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:131)
> >
> > Alan
> >
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>