Re: Datanode registration, port number

2013-12-31 Thread Dhaivat Pandya
Hi Andrew,

Implementation:

The cache "layer" is written in Go and wraps around the DataNode so that
all traffic between the DataNode and NameNode as well as DataNode and
client flow through the cache layer. The layer currently employs a simple
LRU cache, where files that are requested are placed on the cache and
loaded into RAM. Requests for data first come through the cache layer which
will respond to them immediately if the given file is found in the cache
and pass it on to the DataNode otherwise.
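
To make the read path concrete, here is a rough sketch (in Go) of the
kind of LRU structure involved; all names and details here are
illustrative rather than the actual implementation:

    package cache

    import (
        "container/list"
        "sync"
    )

    // lruCache holds recently requested files in RAM. All identifiers
    // in this sketch are hypothetical.
    type lruCache struct {
        mu       sync.Mutex
        capacity int                      // max number of cached files
        order    *list.List               // front = most recently used
        entries  map[string]*list.Element // file name -> list element
    }

    type entry struct {
        name string
        data []byte // file contents held in RAM
    }

    func newLRUCache(capacity int) *lruCache {
        return &lruCache{
            capacity: capacity,
            order:    list.New(),
            entries:  make(map[string]*list.Element),
        }
    }

    // get is the cache-hit path: return the file from RAM and mark it
    // most recently used. On a miss, the proxy forwards the request to
    // the DataNode and adds the response with put.
    func (c *lruCache) get(name string) ([]byte, bool) {
        c.mu.Lock()
        defer c.mu.Unlock()
        if el, ok := c.entries[name]; ok {
            c.order.MoveToFront(el)
            return el.Value.(*entry).data, true
        }
        return nil, false
    }

    // put caches a file fetched from the DataNode, evicting the least
    // recently used file when over capacity.
    func (c *lruCache) put(name string, data []byte) {
        c.mu.Lock()
        defer c.mu.Unlock()
        if el, ok := c.entries[name]; ok {
            c.order.MoveToFront(el)
            el.Value.(*entry).data = data
            return
        }
        c.entries[name] = c.order.PushFront(&entry{name: name, data: data})
        if c.order.Len() > c.capacity {
            oldest := c.order.Back()
            c.order.Remove(oldest)
            delete(c.entries, oldest.Value.(*entry).name)
        }
    }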

One of the goals is to add caching while leaving the rest of the stack
(including Hadoop/HDFS) completely untouched; this is something where
HDFS-4949 and my cache layer differ (although it isn't really that big of a
deal).

Another portion of the cache layer runs on the client machine (e.g. a
worker node) and proxies all communication between the client and the
NameNode in order to do metadata caching (e.g. getFileListing calls are
cached and updated on a regular basis).
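
As a rough sketch of the metadata side (again in Go, with hypothetical
names), each listing is cached with a timestamp and refreshed once it
goes stale:

    package cache

    import (
        "sync"
        "time"
    )

    // metaCache sketches the client-side metadata cache: listing
    // results are cached and refreshed on a regular basis.
    type metaCache struct {
        mu       sync.RWMutex
        ttl      time.Duration            // how long a listing stays fresh
        listings map[string]cachedListing // path -> cached result
    }

    type cachedListing struct {
        fetched time.Time
        files   []string
    }

    func newMetaCache(ttl time.Duration) *metaCache {
        return &metaCache{ttl: ttl, listings: make(map[string]cachedListing)}
    }

    // getFileListing serves a listing from the cache while it is still
    // fresh; otherwise it falls through to the real NameNode
    // (represented here by the fetch callback) and caches the result.
    func (m *metaCache) getFileListing(path string, fetch func(string) []string) []string {
        m.mu.RLock()
        c, ok := m.listings[path]
        m.mu.RUnlock()
        if ok && time.Since(c.fetched) < m.ttl {
            return c.files // cache hit: no NameNode round trip
        }
        files := fetch(path) // cache miss: ask the NameNode
        m.mu.Lock()
        m.listings[path] = cachedListing{fetched: time.Now(), files: files}
        m.mu.Unlock()
        return files
    }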

I'm not sure what level of implementation detail you were looking for;
please let me know if anything is missing.

Use cases:

This is something I'm still trying to figure out, since I'm still
measuring how much latency the caching saves. Purely speculating, I have
a feeling that machine learning algorithms that run on Hadoop, make
multiple passes over the same data, and consist of multi-step procedures
will be able to extract significant gains from the caching.

On Mon, Dec 30, 2013 at 2:48 PM, Andrew Wang wrote:

> Hi Dhaivat,
>
> I did a good chunk of the design and implementation of HDFS-4949, so if you
> could post a longer writeup of your envisioned use cases and
> implementation, I'd definitely be interested in taking a look.
>
> It's also good to note that HDFS-4949 is only the foundation for a whole
> slew of potential enhancements. We're planning to add some form of
> automatic cache replacement, which as a first step could just be an
> external policy that manages your static caching directives. It should also
> already be possible to integrate a job scheduler with HDFS-4949, since it
> both exposes the cache state of the cluster and allows a scheduler to
> prefetch data into RAM. Finally, we're also thinking about caching at finer
> granularities, e.g. block or sub-block rather than file-level caching,
> which is nice for apps that only read regions of a file.
>
> Best,
> Andrew
>
>
> On Mon, Dec 23, 2013 at 9:57 PM, Dhaivat Pandya wrote:
>
> > Hi Harsh,
> >
> > Thanks a lot for the response. As it turns out, I figured out the
> > registration mechanism this evening and how the sourceId is relayed
> > to the NN.
> >
> > As for your question about the cache layer: it is similar in basic
> > concept to the plan mentioned, but the technical details differ
> > significantly. First of all, instead of having the user tell the
> > namenode to perform caching (as it seems from the proposal on JIRA),
> > there is a distributed caching algorithm that decides what files will
> > be cached. Secondly, I am implementing a hook-in with the job
> > scheduler that arranges jobs according to what files are cached at a
> > given point in time (and also allows files to be cached based on what
> > jobs are to be run).
> >
> > Also, the cache layer does a bit of metadata caching; the numbers on
> > it are not all in, but thus far, some of the *metadata* caching
> > surprisingly gives a pretty nice reduction in response time.
> >
> > Any thoughts on the cache layer would be greatly appreciated.
> >
> > Thanks,
> >
> > Dhaivat
> >
> >
> > On Mon, Dec 23, 2013 at 11:46 PM, Harsh J  wrote:
> >
> > > Hi,
> > >
> > > On Mon, Dec 23, 2013 at 9:41 AM, Dhaivat Pandya <dhaivatpan...@gmail.com>
> > > wrote:
> > > > Hi,
> > > >
> > > > I'm currently trying to build a cache layer that should sit "on
> > > > top" of the datanode. Essentially, the namenode should know the
> > > > port number of the cache layer instead of that of the datanode
> > > > (since the namenode then relays this information to the default
> > > > HDFS client). All of the communication between the datanode and
> > > > the namenode currently flows through my cache layer (including
> > > > heartbeats, etc.)
> > >
> > > Curious Q: What does your cache layer aim to do btw? If it's a data
> > > cache, have you checked out the design being implemented currently by
> > > https://issues.apache.org/jira/browse/HDFS-4949?
> > >
> > > > *First question*: is there a way to tell the namenode where a
> > > > datanode should be? Any way to trick it into thinking that the
> > > > datanode is on a port number where it actually isn't? As far as I
> > > > can tell, the port number is obtained from the DatanodeId object;
> > > > can this be set in the configuration so that the port number
> > > > derived is that of the cache layer?
> > >
> > > The NN receives a DN host and port from the DN directly. The DN
> > > sends it whatever it's running on. See

Re: Possible Length issue

2013-12-31 Thread Colin McCabe
There is a maximum length for message buffers that was introduced by
HADOOP-9676.  So messages with length 1752330339 should not be
accepted.
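
For what it's worth, my guess at where that particular value comes
from: a Hadoop IPC connection begins with the ASCII preamble "hrpc",
and a reader that treats those bytes as a big-endian 4-byte length
prefix decodes exactly 1752330339. A quick Go check (illustrative
only):

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    func main() {
        // "hrpc" is the magic at the start of a Hadoop IPC connection.
        // Misread as a big-endian uint32 length, it decodes to 1752330339.
        fmt.Println(binary.BigEndian.Uint32([]byte("hrpc"))) // 1752330339
    }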

best,
Colin

On Sat, Dec 28, 2013 at 11:06 AM, Dhaivat Pandya wrote:
> Hi,
>
> I've been working a lot with the Hadoop NameNode IPC protocol (while
> building a cache layer on top of Hadoop). I've noticed that for request
> packets coming from the default DFS client that do not have a method name,
> the length field is often *completely* off.
>
> For example, I've been looking at packets with length 1752330339. I thought
> this might have been an issue with my cache layer, so I checked with
> Wireshark, and found packets with such absurd length parameters (obviously,
> the packets themselves weren't actually that long; the length field was
> misrepresented).
>
> Unfortunately, I haven't had the opportunity to test this issue on other
> machines and setups (to reproduce: run "ls /" with the default DFS client
> and sniff the packets to find the length parameter, on release 1.2.1).
>
> Is this normal behavior, a bug or something I'm missing?
>
> Thank you,
>
> Dhaivat


Build failed in Jenkins: Hadoop-Common-trunk #998

2013-12-31 Thread Apache Jenkins Server
See 

Changes:

[vinodkv] YARN-1121. Addendum patch. Fixed AsyncDispatcher hang issue during 
stop due to a race condition caused by the previous patch. Contributed by Jian 
He.

[vinodkv] YARN-1522. Fixed a race condition in the test TestApplicationCleanup 
that was causing it to randomly fail. Contributed by Liyin Liang.

[vinodkv] MAPREDUCE-5685. Fixed a bug with JobContext getCacheFiles API inside 
the WrappedReducer class. Contributed by Yi Song.

[wang] HDFS-5701. Fix the CacheAdmin -addPool -maxTtl option name. Contributed 
by Stephen Chu.

[cmccabe] HDFS-5582. hdfs getconf -excludeFile or -includeFile always failed 
(satish via cmccabe)

[wang] Add updated editsStored files missing from initial HDFS-5636 commit.

--
[...truncated 60274 lines...]
Adding reference: maven.local.repository
[DEBUG] Initialize Maven Ant Tasks
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.jar!/org/apache/maven/ant/tasks/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.jar!/org/apache/maven/ant/tasks/antlib.xml
 from a zip file
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.2/ant-1.8.2.jar!/org/apache/tools/ant/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.2/ant-1.8.2.jar!/org/apache/tools/ant/antlib.xml
 from a zip file
Class org.apache.maven.ant.tasks.AttachArtifactTask loaded from parent loader 
(parentFirst)
 +Datatype attachartifact org.apache.maven.ant.tasks.AttachArtifactTask
Class org.apache.maven.ant.tasks.DependencyFilesetsTask loaded from parent 
loader (parentFirst)
 +Datatype dependencyfilesets org.apache.maven.ant.tasks.DependencyFilesetsTask
Setting project property: test.build.dir -> 

Setting project property: test.exclude.pattern -> _
Setting project property: hadoop.assemblies.version -> 3.0.0-SNAPSHOT
Setting project property: test.exclude -> _
Setting project property: distMgmtSnapshotsId -> apache.snapshots.https
Setting project property: project.build.sourceEncoding -> UTF-8
Setting project property: java.security.egd -> file:///dev/urandom
Setting project property: distMgmtSnapshotsUrl -> 
https://repository.apache.org/content/repositories/snapshots
Setting project property: distMgmtStagingUrl -> 
https://repository.apache.org/service/local/staging/deploy/maven2
Setting project property: avro.version -> 1.7.4
Setting project property: test.build.data -> 

Setting project property: commons-daemon.version -> 1.0.13
Setting project property: hadoop.common.build.dir -> 

Setting project property: testsThreadCount -> 4
Setting project property: maven.test.redirectTestOutputToFile -> true
Setting project property: jdiff.version -> 1.0.9
Setting project property: distMgmtStagingName -> Apache Release Distribution 
Repository
Setting project property: project.reporting.outputEncoding -> UTF-8
Setting project property: build.platform -> Linux-i386-32
Setting project property: protobuf.version -> 2.5.0
Setting project property: failIfNoTests -> false
Setting project property: protoc.path -> ${env.HADOOP_PROTOC_PATH}
Setting project property: jersey.version -> 1.9
Setting project property: distMgmtStagingId -> apache.staging.https
Setting project property: distMgmtSnapshotsName -> Apache Development Snapshot 
Repository
Setting project property: ant.file -> 

[DEBUG] Setting properties with prefix: 
Setting project property: project.groupId -> org.apache.hadoop
Setting project property: project.artifactId -> hadoop-common-project
Setting project property: project.name -> Apache Hadoop Common Project
Setting project property: project.description -> Apache Hadoop Common Project
Setting project property: project.version -> 3.0.0-SNAPSHOT
Setting project property: project.packaging -> pom
Setting project property: project.build.directory -> 

Setting project property: project.build.outputDirectory -> 

Setting project property: project.build.testOutputDirectory -> 

Setting project property: project.build.sourceDirectory ->