Re: Any related paper on how to resolve hadoop SPOF issue?

2011-10-08 Thread M. C. Srivas
Are there any performance benchmarks available for Ceph? (with Hadoop,
without, or both?)

On Thu, Aug 25, 2011 at 11:44 AM, Alex Nelson  wrote:

> Hi George,
>
> UC Santa Cruz contributed a ;login: article describing replacing HDFS with
> Ceph.  (I was one of the authors.)  One of the key architectural advantages
> of Ceph over HDFS is that Ceph distributes its metadata service over
> multiple metadata servers.
>
> I hope that helps.
>
> --Alex
>
>
> On Aug 25, 2011, at 02:54 , George Kousiouris wrote:
>
> >
> > Hi,
> >
> > many thanks!!
> >
> > I also found this one, which also covers related work from other
> > efforts, in case it is helpful for someone else:
> >
> > https://ritdml.rit.edu/bitstream/handle/1850/13321/ATalwalkarThesis1-2011.pdf?sequence=1
> >
> >
> > BR,
> > George
> >
> > On 8/25/2011 12:46 PM, Nan Zhu wrote:
> >> Hope it helps
> >>
> >> http://www.springerlink.com/content/h17r882710314147/
> >>
> >> Best,
> >>
> >> Nan
> >>
> >> On Thu, Aug 25, 2011 at 5:43 PM, George Kousiouris <gkous...@mail.ntua.gr> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> We are currently in the process of writing a paper regarding Hadoop,
> >>> and we would like to reference any attempt to remove the single point
> >>> of failure of the Namenode. We have found some efforts in various
> >>> presentations (like dividing the namespace between more than one
> >>> namenode, if I remember correctly), but the search for a concrete
> >>> paper on the issue came to nothing.
> >>>
> >>> Is anyone aware of, or has anyone participated in, such an effort?
> >>>
> >>> Thanks,
> >>> George
> >>>
> >>> --
> >>>
> >>> ---
> >>>
> >>> George Kousiouris
> >>> Electrical and Computer Engineer
> >>> Division of Communications,
> >>> Electronics and Information Engineering
> >>> School of Electrical and Computer Engineering
> >>> Tel: +30 210 772 2546
> >>> Mobile: +30 6939354121
> >>> Fax: +30 210 772 2569
> >>> Email: gkous...@mail.ntua.gr
> >>> Site: http://users.ntua.gr/gkousiou/
> >>>
> >>> National Technical University of Athens
> >>> 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
> >>>
> >>>
> >>
> >
> >
> > --
> >
> > ---
> >
> > George Kousiouris
> > Electrical and Computer Engineer
> > Division of Communications,
> > Electronics and Information Engineering
> > School of Electrical and Computer Engineering
> > Tel: +30 210 772 2546
> > Mobile: +30 6939354121
> > Fax: +30 210 772 2569
> > Email: gkous...@mail.ntua.gr
> > Site: http://users.ntua.gr/gkousiou/
> >
> > National Technical University of Athens
> > 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
> >
>
>


Re: performance normal?

2011-10-08 Thread Raj Vishwanathan
Really horrible performance.

Sent from my iPad
Please excuse the typos. 

On Oct 8, 2011, at 12:12 AM, tom uno  wrote:

> release 0.21.0
> Production System
> 20 nodes
> read 100 megabits per second
> write 10 megabits per second
> performance normal?


Re: Simple Hadoop program build with Maven

2011-10-08 Thread Periya.Data
Fantastic! Worked like a charm. Thanks much, Bochun.

For those who are facing similar issues, here is the command and output:

$ hadoop jar ../MyHadoopProgram.jar com.ABC.MyHadoopProgram -libjars
~/CDH3/extJars/json-rpc-1.0.jar /usr/PD/input/sample22.json /usr/PD/output
11/10/08 17:51:45 INFO mapred.FileInputFormat: Total input paths to process : 1
11/10/08 17:51:46 INFO mapred.JobClient: Running job: job_201110072230_0005
11/10/08 17:51:47 INFO mapred.JobClient:  map 0% reduce 0%
11/10/08 17:51:58 INFO mapred.JobClient:  map 50% reduce 0%
11/10/08 17:51:59 INFO mapred.JobClient:  map 100% reduce 0%
11/10/08 17:52:08 INFO mapred.JobClient:  map 100% reduce 100%
11/10/08 17:52:10 INFO mapred.JobClient: Job complete: job_201110072230_0005
11/10/08 17:52:10 INFO mapred.JobClient: Counters: 23
11/10/08 17:52:10 INFO mapred.JobClient:   Job Counters
11/10/08 17:52:10 INFO mapred.JobClient: Launched reduce tasks=1
11/10/08 17:52:10 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=17981
11/10/08 17:52:10 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
11/10/08 17:52:10 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
11/10/08 17:52:10 INFO mapred.JobClient: Launched map tasks=2
11/10/08 17:52:10 INFO mapred.JobClient: Data-local map tasks=2
11/10/08 17:52:10 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9421
11/10/08 17:52:10 INFO mapred.JobClient:   FileSystemCounters
11/10/08 17:52:10 INFO mapred.JobClient: FILE_BYTES_READ=606
11/10/08 17:52:10 INFO mapred.JobClient: HDFS_BYTES_READ=56375
11/10/08 17:52:10 INFO mapred.JobClient: FILE_BYTES_WRITTEN=157057
11/10/08 17:52:10 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=504
11/10/08 17:52:10 INFO mapred.JobClient:   Map-Reduce Framework
11/10/08 17:52:10 INFO mapred.JobClient: Reduce input groups=24
11/10/08 17:52:10 INFO mapred.JobClient: Combine output records=24
11/10/08 17:52:10 INFO mapred.JobClient: Map input records=24
11/10/08 17:52:10 INFO mapred.JobClient: Reduce shuffle bytes=306
11/10/08 17:52:10 INFO mapred.JobClient: Reduce output records=24
11/10/08 17:52:10 INFO mapred.JobClient: Spilled Records=48
11/10/08 17:52:10 INFO mapred.JobClient: Map output bytes=552
11/10/08 17:52:10 INFO mapred.JobClient: Map input bytes=54923
11/10/08 17:52:10 INFO mapred.JobClient: Combine input records=24
11/10/08 17:52:10 INFO mapred.JobClient: Map output records=24
11/10/08 17:52:10 INFO mapred.JobClient: SPLIT_RAW_BYTES=240
11/10/08 17:52:10 INFO mapred.JobClient: Reduce input records=24
$
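
For anyone curious why the ordering matters: -libjars is only honored
when the driver's main() goes through ToolRunner (which runs
GenericOptionsParser). A minimal driver sketch along those lines, using
the old mapred API that the log above shows (class name taken from the
command line; mapper/reducer setup omitted, so treat it as a sketch,
not my actual code):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  public class MyHadoopProgram extends Configured implements Tool {
    public int run(String[] args) throws Exception {
      // ToolRunner has already consumed generic options such as
      // -libjars by this point, so args[0] and args[1] are just the
      // input and output paths.
      JobConf conf = new JobConf(getConf(), MyHadoopProgram.class);
      // conf.setMapperClass(...) / conf.setReducerClass(...) go here
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));
      JobClient.runJob(conf);
      return 0;
    }

    public static void main(String[] args) throws Exception {
      // ToolRunner invokes GenericOptionsParser, which is what makes
      // -libjars (and -D, -files, ...) work on the command line.
      System.exit(ToolRunner.run(new Configuration(), new MyHadoopProgram(), args));
    }
  }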



Appreciate your help.
PD.

On Fri, Oct 7, 2011 at 11:31 PM, Bochun Bai  wrote:

> To build one big bundled ("fat") jar file using Maven, I suggest this plugin:
>http://anydoby.com/fatjar/usage.html
> But I prefer not to do so, because the classpath order differs
> between environments.
>
> I guess your old myHadoopProgram.jar contains Main-Class meta info,
> so the ***com.ABC.xxx*** part below is omitted. The full form looks like:
>
> hadoop jar jar/myHadoopProgram.jar ***com.ABC.xxx*** -libjars
> ../lib/json-rpc-1.0.jar
> /usr/PD/input/sample22.json /usr/PD/output/
>
> I suggest you add the Main-Class metadata following this:
>
> http://maven.apache.org/plugins/maven-assembly-plugin/usage.html#Advanced_Configuration
> or
>pay attention to the order of <MainClass> and <-libjars ...>, using:
>hadoop jar <jar> <MainClass> -libjars <jars> <args>
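>
> A minimal assembly-plugin snippet for the first option might look like
> this (a sketch; the mainClass value is assumed from your command line):
>
>  <plugin>
>    <groupId>org.apache.maven.plugins</groupId>
>    <artifactId>maven-assembly-plugin</artifactId>
>    <configuration>
>      <archive>
>        <manifest>
>          <mainClass>com.ABC.MyHadoopProgram</mainClass>
>        </manifest>
>      </archive>
>      <descriptorRefs>
>        <descriptorRef>jar-with-dependencies</descriptorRef>
>      </descriptorRefs>
>    </configuration>
>  </plugin>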
>
> On Sat, Oct 8, 2011 at 12:05 PM, Periya.Data 
> wrote:
> > Hi all,
> >I am migrating from Ant builds to Maven, so I am brand new to Maven
> > and do not yet understand many parts of it.
> >
> > Problem: I have a perfectly working map-reduce program (working by Ant
> > build). This program needs an external jar file (json-rpc-1.0.jar).
> > So, when I run the program, I do the following to get a nice output:
> >
> > $ hadoop jar jar/myHadoopProgram.jar -libjars ../lib/json-rpc-1.0.jar
> > /usr/PD/input/sample22.json /usr/PD/output/
> >
> > (note that I include the external jar file by the "-libjars" option as
> > mentioned in the "Hadoop: The Definitive Guide 2nd Edition" - page 253).
> > Everything is fine with my ant build.
> >
> > So, now, I move on to Maven. I had some trouble getting my pom.xml
> > right. I am still unsure if it is right, but it builds "successfully"
> > (the resulting jar file has the class files of my program). The
> > essential part of my pom.xml has the two following dependencies (a
> > complete pom.xml is at the end of this email).
> >
> >
> > <dependency>
> >   <groupId>com.metaparadigm</groupId>
> >   <artifactId>json-rpc</artifactId>
> >   <version>1.0</version>
> > </dependency>
> >
> > <dependency>
> >   <groupId>org.apache.hadoop</groupId>
> >   <artifactId>hadoop-core</artifactId>
> >   <version>0.20.2</version>
> >   <scope>provided</scope>
> > </dependency>
> >
> >
> > I try to run it like this:
> >
> > $ hadoop jar ../myHadoopProgram.jar -libjars ../json-rpc-1.0.jar
> > com.ABC.MyHadoopProgram /usr/PD/input/sample22.json /usr/PD/output
> > Exception in thread "main" java.lang.ClassNotFoundExcep

Re: performance normal?

2011-10-08 Thread Brian Bockelman
"Normal operation" is a function of hardware.  Giving the version without the 
underlying hardware means I get to make up any answer I feel like.

I can't imagine a rational set of hardware where 10 megabits (that is,
roughly one megabyte) a second is normal.
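
If you want a real baseline rather than a guess, the TestDFSIO benchmark
that ships in the Hadoop test jar is the usual starting point (a sketch;
the exact jar name varies by version and distribution):

  hadoop jar hadoop-*test*.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
  hadoop jar hadoop-*test*.jar TestDFSIO -read -nrFiles 10 -fileSize 1000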

Brian

On Oct 8, 2011, at 3:04 AM, Bochun Bai wrote:

> Yes, for a single-threaded HDFS client.
> No, for well-designed map-reduce jobs.
> 
> On Sat, Oct 8, 2011 at 3:12 PM, tom uno  wrote:
>> release 0.21.0
>> Production System
>> 20 nodes
>> read 100 megabits per second
>> write 10 megabits per second
>> performance normal?
>> 



nmap scripts

2011-10-08 Thread John Bond
Hello,

I have just posted a few nmap scripts to the nmap-dev list[1]; they are
also available on github[2]. It would be useful if some people were
willing to test them out and let me know of any errors or false
reports.

To use them, copy the hadoop-* and hbase-* scripts into ~/.nmap/scripts
and run a command like the following:

nmap  --script
hadoop-datanode-info,hbase-master-info,hadoop-jobtracker-info,hadoop-namenode-info,hadoop-tasktracker-info,hbase-region-info.nse
--script-args hadoop-jobtracker-info.userinfo,newtargets -p
60010,50030,50070,50075,50060,60030 master-hadoop-server.example.com

Obviously these ports should not be open to unauthorized users.

All feedback is welcome.

thanks
john

[1]http://seclists.org/nmap-dev/2011/q4/58
[2]https://github.com/b4ldr/nse-scripts


Re: DFSClient: Could not complete file

2011-10-08 Thread Todd Lipcon
On Fri, Oct 7, 2011 at 12:40 PM, Chris Curtin  wrote:
> hi Todd,
>
> Thanks for the reply.
>
> Yes I'm seeing > 30,000 ms a couple of times a day, though it looks like
> 4000 ms is average. Also see 150,000+ and lots of 50,000.
>
> Is there anything I can do about this? The bug is still open in JIRA.

Currently the following workarounds may be effective (concrete sketches
below):
- schedule a cron job to run once every couple of minutes that runs: find
/data/1/hdfs /data/2/hdfs/ ... > /dev/null (this will cause your inodes
and dentries to get paged into cache so the block report runs quickly)
- tune /proc/sys/vm/vfs_cache_pressure to a lower value (this will
encourage Linux to keep inodes and dentries in cache)
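
Concretely, something like this (the schedule, paths, and pressure value
are illustrative; list whichever directories are in your dfs.data.dir):

  # crontab entry: walk the data dirs so inodes/dentries stay paged in
  */5 * * * * find /data/1/hdfs /data/2/hdfs > /dev/null 2>&1

  # tell the kernel to favor keeping inode/dentry caches (default is 100)
  sysctl -w vm.vfs_cache_pressure=50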

Both have some associated costs, but at least one of our customers has
found the above set of workarounds to be effective. Currently I'm
waiting on review of HDFS-2379, though if you are adventurous you
could consider building your own copy of Hadoop with this patch
applied. I've tested it on a cluster and am fairly confident it is safe.

Thanks
-Todd

> On Fri, Oct 7, 2011 at 2:15 PM, Todd Lipcon  wrote:
>
>> Hi Chris,
>>
>> You may be hitting HDFS-2379.
>>
>> Can you grep your DN logs for the string "BlockReport" and see if you
>> see any taking more than 3s or so?
>>
>> -Todd
>>
>> On Fri, Oct 7, 2011 at 6:31 AM, Chris Curtin 
>> wrote:
>> > Sorry to bring this back from the dead, but we're having the issues
>> again.
>> >
>> > This is on a NEW cluster, using Cloudera 0.20.2-cdh3u0 (old was stock
>> Apache
>> > 0.20.2). Nothing carried over from the old cluster except data in HDFS
>> > (copied from old cluster). Bigger/more machines, more RAM, faster disks
>> etc.
>> > And it is back.
>> >
>> > Confirmed that all the disks set up for HDFS use the 'deadline' scheduler.
>> >
>> > Runs fine for a few days, then hangs again with the 'Could not complete'
>> error
>> > in the JobTracker log until we kill the cluster.
>> >
>> > 2011-09-09 08:04:32,429 INFO org.apache.hadoop.hdfs.DFSClient: Could not
>> > complete file
>> >
>> /log/hadoop/tmp/flow_BYVMTA_family_BYVMTA_72751_8284775/_logs/history/10.120.55.2_1311201333949_job_201107201835_13900_deliv_flow_BYVMTA%2Bflow_BYVMTA*family_B%5B%284%2F5%29+...UNCED%27%2C+
>> > retrying...
>> >
>> > Found HDFS-148 (https://issues.apache.org/jira/browse/HDFS-148) which
>> looks
>> > like what could be happening to us. Anyone found a good workaround?
>> > Any other ideas?
>> >
>> > Also, does the HDFS system try to do 'du' on disks not assigned to it?
>> The
>> > HDFS disks are separate from the root and OS disks. Those disks are NOT
>> > set up to be 'deadline'. Should that matter?
>> >
>> > Thanks,
>> >
>> > Chris
>> >
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: performance normal?

2011-10-08 Thread Bochun Bai
Yes, for a single-threaded HDFS client.
No, for well-designed map-reduce jobs.

On Sat, Oct 8, 2011 at 3:12 PM, tom uno  wrote:
> release 0.21.0
> Production System
> 20 nodes
> read 100 megabits per second
> write 10 megabits per second
> performance normal?
>


performance normal?

2011-10-08 Thread tom uno
release 0.21.0
Production System
20 nodes
read 100 megabits per second
write 10 megabits per second
performance normal?