Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Edward Capriolo
I tried your suggested setting and forced a GC from JConsole, and once the
heap crept up nothing was freed.

So just food for thought:

You said "average file name size is 32 bytes". Well most of my data sits in

/user/hive/warehouse/
Then I have a tables with partitions.

Does it make sense to just move this to "/u/h/w"?

Will I be saving 400,000,000 bytes of memory if I do?
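For reference, the 400,000,000 figure appears to come from arithmetic along
these lines, which treats the full path prefix as if it were stored once per
file (exactly the assumption being questioned here):

    len("/user/hive/warehouse/") - len("/u/h/w") = 21 - 6 = 15 bytes per path
    15 bytes * ~27,000,000 files ≈ 405,000,000 bytes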
On Thu, Dec 27, 2012 at 5:41 PM, Suresh Srinivas wrote:

> I do not follow what you mean here.
>
> > Even when I forced a GC it cleared 0% memory.
> Is this with the new younggen setting? Because earlier, based on the
> calculation I posted, you need ~11G in the old generation. With 6G as the
> default younggen size, you actually had just enough memory to fit the
> namespace in oldgen. Hence you might not have seen Full GC freeing up
> enough memory.
>
> Have you tried a Full GC with the 1G younggen size? I suspect you would
> see a lot more memory freeing up.
>
> > One would think that since the entire NameNode image is stored in memory
> that the heap would not need to grow beyond that
> The namenode image that you see during checkpointing is the size of the
> file written after serializing the file system namespace in memory. This is
> not what is directly stored in namenode memory. The namenode stores data
> structures that correspond to the file system directory tree and block
> locations. Of these, only the file system directory is serialized and
> written to fsimage; block locations are not.
>
>
>
>
> On Thu, Dec 27, 2012 at 2:22 PM, Edward Capriolo  >wrote:
>
> > I am not sure GC was a factor. Even when I forced a GC it cleared 0% of
> > the memory. One would think that since the entire NameNode image is stored
> > in memory, the heap would not need to grow beyond that, but that sure does
> > not seem to be the case. A 5GB image starts off using 10GB of memory and
> > after "burn in" it seems to use about 15GB of memory.
> >
> > So really we say "the name node data has to fit in memory", but what we
> > really mean is "the name node data must fit in memory 3x".
> >
> > On Thu, Dec 27, 2012 at 5:08 PM, Suresh Srinivas  > >wrote:
> >
> > > You did free up a lot of the old generation by reducing the young
> > > generation, right? The extra 5G of RAM for the old generation should
> > > have helped.
> > >
> > > Based on my calculation, for the current number of objects you have,
> you
> > > need roughly:
> > > 12G of total heap with young generation size of 1G. This assumes the
> > > average file name size is 32 bytes.
> > >
> > > In later releases (>= 0.20.204), several memory and startup
> > > optimizations have been made. They should help you as well.
> > >
> > >
> > >
> > > On Thu, Dec 27, 2012 at 1:48 PM, Edward Capriolo <
> edlinuxg...@gmail.com
> > > >wrote:
> > >
> > > > So it turns out the issue was just the size of the filesystem.
> > > > 2012-12-27 16:37:22,390 WARN
> > > > org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint
> > > done.
> > > > New Image Size: 4,354,340,042
> > > >
> > > > Basically if the NN image size hits ~5,000,000,000 you get f'ed. So you
> > > > need about 3x as much RAM as your FSImage size. If you do not have
> > > > enough you die a slow death.
> > > >
> > > > On Sun, Dec 23, 2012 at 9:40 PM, Suresh Srinivas <
> > sur...@hortonworks.com
> > > > >wrote:
> > > >
> > > > > Do not have access to my computer. Based on reading the previous
> > > email, I
> > > > > do not see any thing suspicious on the list of objects in the histo
> > > live
> > > > > dump.
> > > > >
> > > > > I would like to hear from you about if it continued to grow. One
> > > instance
> > > > > of this I had seen in the past was related to weak reference
> related
> > to
> > > > > socket objects.  I do not see that happening here though.
> > > > >
> > > > > Sent from phone
> > > > >
> > > > > On Dec 23, 2012, at 10:34 AM, Edward Capriolo <
> edlinuxg...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Tried this..
> > > > > >
> > > > > > NameNode is still Ruining my Xmas on its slow death march to OOM.
> > > > > >
> > > > > > http://imagebin.org/240453
> > > > > >
> > > > > >
> > > > > > On Sat, Dec 22, 2012 at 10:23 PM, Suresh Srinivas <
> > > > > sur...@hortonworks.com>wrote:
> > > > > >
> > > > > >> -XX:NewSize=1G -XX:MaxNewSize=1G
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > http://hortonworks.com/download/
> > >
> >
>
>
>
> --
> http://hortonworks.com/download/
>


Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Edward Capriolo
I am not sure GC was a factor. Even when I forced a GC it cleared 0% of the
memory. One would think that since the entire NameNode image is stored in
memory, the heap would not need to grow beyond that, but that sure does
not seem to be the case. A 5GB image starts off using 10GB of memory and
after "burn in" it seems to use about 15GB of memory.

So really we say "the name node data has to fit in memory", but what we
really mean is "the name node data must fit in memory 3x".

On Thu, Dec 27, 2012 at 5:08 PM, Suresh Srinivas wrote:

> You did free up a lot of the old generation by reducing the young
> generation, right? The extra 5G of RAM for the old generation should have
> helped.
>
> Based on my calculation, for the current number of objects you have, you
> need roughly:
> 12G of total heap with young generation size of 1G. This assumes the
> average file name size is 32 bytes.
>
> In later releases (>= 0.20.204), several memory and startup optimizations
> have been made. They should help you as well.
>
>
>
> On Thu, Dec 27, 2012 at 1:48 PM, Edward Capriolo  >wrote:
>
> > So it turns out the issue was just the size of the filesystem.
> > 2012-12-27 16:37:22,390 WARN
> > org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint
> done.
> > New Image Size: 4,354,340,042
> >
> > Basically if the NN image size hits ~5,000,000,000 you get f'ed. So you
> > need about 3x as much RAM as your FSImage size. If you do not have enough
> > you die a slow death.
> >
> > On Sun, Dec 23, 2012 at 9:40 PM, Suresh Srinivas  > >wrote:
> >
> > > Do not have access to my computer. Based on reading the previous
> email, I
> > > do not see any thing suspicious on the list of objects in the histo
> live
> > > dump.
> > >
> > > I would like to hear from you about if it continued to grow. One
> instance
> > > of this I had seen in the past was related to weak reference related to
> > > socket objects.  I do not see that happening here though.
> > >
> > > Sent from phone
> > >
> > > On Dec 23, 2012, at 10:34 AM, Edward Capriolo 
> > > wrote:
> > >
> > > > Tried this..
> > > >
> > > > NameNode is still Ruining my Xmas on its slow death march to OOM.
> > > >
> > > > http://imagebin.org/240453
> > > >
> > > >
> > > > On Sat, Dec 22, 2012 at 10:23 PM, Suresh Srinivas <
> > > sur...@hortonworks.com>wrote:
> > > >
> > > >> -XX:NewSize=1G -XX:MaxNewSize=1G
> > >
> >
>
>
>
> --
> http://hortonworks.com/download/
>


Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Edward Capriolo
So it turns out the issue was just the size of the filesystem.
2012-12-27 16:37:22,390 WARN
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint done.
New Image Size: 4,354,340,042

Basically if the NN image size hits ~5,000,000,000 you get f'ed. So you
need about 3x as much RAM as your FSImage size. If you do not have enough
you die a slow death.

On Sun, Dec 23, 2012 at 9:40 PM, Suresh Srinivas wrote:

> Do not have access to my computer. Based on reading the previous email, I
> do not see anything suspicious on the list of objects in the histo live
> dump.
>
> I would like to hear from you about whether it continued to grow. One
> instance of this I had seen in the past was related to weak references
> tied to socket objects. I do not see that happening here though.
>
> Sent from phone
>
> On Dec 23, 2012, at 10:34 AM, Edward Capriolo 
> wrote:
>
> > Tried this..
> >
> > NameNode is still Ruining my Xmas on its slow death march to OOM.
> >
> > http://imagebin.org/240453
> >
> >
> > On Sat, Dec 22, 2012 at 10:23 PM, Suresh Srinivas <
> sur...@hortonworks.com>wrote:
> >
> >> -XX:NewSize=1G -XX:MaxNewSize=1G
>


Re: NN Memory Jumps every 1 1/2 hours

2012-12-23 Thread Edward Capriolo
Tried this..

NameNode is still Ruining my Xmas on its slow death march to OOM.

http://imagebin.org/240453


On Sat, Dec 22, 2012 at 10:23 PM, Suresh Srinivas wrote:

> -XX:NewSize=1G -XX:MaxNewSize=1G
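For context, "Tried this.." refers to capping the young generation as
suggested above. A sketch of the full set of NameNode JVM options with that
cap combined with the CMS flags Edward posts further down the thread; the 17G
heap matches the size mentioned in the original report, but placing it in
HADOOP_NAMENODE_OPTS in hadoop-env.sh is an assumption about this setup:

export HADOOP_NAMENODE_OPTS="-Xms17g -Xmx17g \
  -XX:NewSize=1G -XX:MaxNewSize=1G \
  -XX:+UseCompressedOops -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
  -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 \
  -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly"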


Re: NN Memory Jumps every 1 1/2 hours

2012-12-22 Thread Edward Capriolo
Ok so here is the latest.

http://imagebin.org/240392

I took a jmap on startup and one an hour after.

http://pastebin.com/xEkWid4f

I think the biggest deal is [B (byte arrays), which may not be very helpful.

 num     #instances         #bytes  class name
------------------------------------------------------------------
   1:      25094067     2319943656  [B
   2:      23720125     1518088000  org.apache.hadoop.hdfs.server.namenode.INodeFile
   3:      24460244     1174091712  org.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo
   4:      25671649     1134707328  [Ljava.lang.Object;
   5:      31106937      995421984  java.util.HashMap$Entry
   6:      23725233      570829968  [Lorg.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo;
   7:          2934      322685952  [Ljava.util.HashMap$Entry;



 num     #instances         #bytes  class name
------------------------------------------------------------------
   1:      24739690     3727511000  [B
   2:      23280668     1489962752  org.apache.hadoop.hdfs.server.namenode.INodeFile
   3:      24850044     1192802112  org.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo
   4:      26124258     1157073272  [Ljava.lang.Object;
   5:      32142057     1028545824  java.util.HashMap$Entry
   6:      23307473      560625432  [Lorg.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo;
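A rough way to read these histograms: sum the listed classes and divide by
the INodeFile count to get an approximate live-heap cost per file. A minimal
sketch using the numbers from the first dump above (only the seven classes
shown are counted; everything else on the heap is ignored):

// Back-of-envelope NameNode heap estimate from the jmap histogram above.
public class NnHeapEstimate {
    public static void main(String[] args) {
        long files = 23720125L;          // INodeFile instances in the first dump
        long[] bytes = {
            2319943656L,  // [B - byte arrays; the count tracks the file count,
                          // consistent with per-INode name arrays
            1518088000L,  // INodeFile
            1174091712L,  // BlocksMap$BlockInfo
            1134707328L,  // [Ljava.lang.Object;
             995421984L,  // java.util.HashMap$Entry
             570829968L,  // [LBlocksMap$BlockInfo;
             322685952L   // [Ljava.util.HashMap$Entry;
        };
        long total = 0;
        for (long b : bytes) total += b;
        System.out.printf("listed classes: %.1f GB%n", total / 1e9);      // ~8.0 GB
        System.out.printf("per file:       ~%d bytes%n", total / files);  // ~340 bytes
    }
}

That works out to roughly 8 GB across the listed classes, on the order of 340
bytes per file, which is in the same ballpark as the ~11-12G sizing discussed
earlier in the thread once the classes not listed here and general JVM
overhead are added.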

GC starts like this:

3.204: [GC 102656K->9625K(372032K), 0.0150300 secs]
3.519: [GC 112281K->21180K(372032K), 0.0741210 secs]
3.883: [GC 123836K->30729K(372032K), 0.0208900 secs]
4.194: [GC 132724K->45785K(372032K), 0.0293860 secs]
4.522: [GC 148441K->59282K(372032K), 0.0341330 secs]
4.844: [GC 161938K->70071K(372032K), 0.0284850 secs]
5.139: [GC 172727K->80624K(372032K), 0.0171910 secs]
5.338: [GC 183280K->90661K(372032K), 0.0184200 secs]
5.549: [GC 193317K->103126K(372032K), 0.0430080 secs]
5.775: [GC 205782K->113534K(372032K), 0.0359480 secs]
5.995: [GC 216190K->122832K(372032K), 0.0192900 secs]
6.192: [GC 225488K->131777K(372032K), 0.0183870 secs]

Then steadily increases

453.808: [GC 7482139K->7384396K(11240624K), 0.0208170 secs]
455.605: [GC 7487052K->7384177K(11240624K), 0.0206360 secs]
457.942: [GC 7486831K->7384131K(11240624K), 0.0189600 secs]
459.924: [GC 7486787K->7384141K(11240624K), 0.0193560 secs]
462.887: [GC 7486797K->7384151K(11240624K), 0.0189290 secs]


Until I triggered this full gc a moment ago.

6266.988: [GC 11255823K->10373641K(17194656K), 0.0331910 secs]
6280.083: [GC 11259721K->10373499K(17194656K), 0.0324870 secs]
6293.706: [GC 11259579K->10376656K(17194656K), 0.0324330 secs]
6309.781: [GC 11262736K->10376110K(17194656K), 0.0310330 secs]
6333.790: [GC 11262190K->10374348K(17194656K), 0.0297670 secs]
6333.934: [Full GC 10391746K->9722532K(17194656K), 63.9812940 secs]
6418.466: [GC 10608612K->9725743K(17201024K), 0.0339610 secs]
6421.420: [GC 10611823K->9760611K(17201024K), 0.1501610 secs]
6428.221: [GC 10646691K->9767236K(17201024K), 0.1503170 secs]
6437.431: [GC 10653316K->9734750K(17201024K), 0.0344960 secs]

Essentially, GC sometimes clears some memory but not all, and then the line
keeps rising. The delta is about 10-17 hours until the heap is exhausted.

On Sat, Dec 22, 2012 at 7:03 PM, Edward Capriolo wrote:

> Blocks is ~26,000,000 Files is a bit higher ~27,000,000
>
> Currently running:
> [root@hnn217 ~]# java -version
> java version "1.7.0_09"
> Was running 1.6.0_23
>
> export JVM_OPTIONS="-XX:+UseCompressedOops -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly"
>
> I will grab the gc logs and the heap dump in a follow up.
>
>
>
> On Sat, Dec 22, 2012 at 1:32 PM, Suresh Srinivas 
> wrote:
>
>> Please take a histo live dump when the memory is full. Note that this
>> causes full gc.
>> http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
>>
>> What are the number of blocks you have on the system.
>>
>> Send the JVM options you are using. From earlier java versions which used
>> 1/8 of total heap for young gen, it has gone up to 1/3 of total heap. This
>> could also be the reason.
>>
>> Do you collect gc logs? Send that as well.
>>
>> Sent from a mobile device
>>
>> On Dec 22, 2012, at 9:51 AM, Edward Capriolo 
>> wrote:
>>
>> > Newer 1.6 releases are getting close to 1.7, so I am not going to fear a number
>> and
>> > fight the future.
>> >
>> > I have been at around 27 million files for a while, and been as high as 30
>> > million; I do not think that is related.
>> >
>> > I do not think it is related to checkpoints but I am considering
>> > raising/lowering the checkpoint triggers.
>> >
>> > O

Re: NN Memory Jumps every 1 1/2 hours

2012-12-22 Thread Edward Capriolo
Blocks is ~26,000,000 Files is a bit higher ~27,000,000

Currently running:
[root@hnn217 ~]# java -version
java version "1.7.0_09"
Was running 1.6.0_23

export JVM_OPTIONS="-XX:+UseCompressedOops -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly"

I will grab the gc logs and the heap dump in a follow up.
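As a sketch of how those are typically captured (the log path and pid are
placeholders, not values from this cluster):

# GC logging flags added to the NameNode JVM options
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/hadoop/nn-gc.log

# live-object histogram; note that -histo:live forces a full GC on the process
jmap -histo:live <namenode-pid> > nn-histo-live.txt

# full heap dump for offline analysis
jmap -dump:live,format=b,file=nn-heap.hprof <namenode-pid>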



On Sat, Dec 22, 2012 at 1:32 PM, Suresh Srinivas wrote:

> Please take a histo live dump when the memory is full. Note that this
> causes full gc.
> http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
>
> What are the number of blocks you have on the system.
>
> Send the JVM options you are using. From earlier java versions which used
> 1/8 of total heap for young gen, it has gone up to 1/3 of total heap. This
> could also be the reason.
>
> Do you collect gc logs? Send that as well.
>
> Sent from a mobile device
>
> On Dec 22, 2012, at 9:51 AM, Edward Capriolo 
> wrote:
>
> > Newer 1.6 releases are getting close to 1.7, so I am not going to fear a number and
> > fight the future.
> >
> > I have been at around 27 million files for a while, and been as high as 30
> > million; I do not think that is related.
> >
> > I do not think it is related to checkpoints but I am considering
> > raising/lowering the checkpoint triggers.
> >
> > On Saturday, December 22, 2012, Joep Rottinghuis  >
> > wrote:
> >> Do your OOMs correlate with the secondary checkpointing?
> >>
> >> Joep
> >>
> >> Sent from my iPhone
> >>
> >> On Dec 22, 2012, at 7:42 AM, Michael Segel 
> > wrote:
> >>
> >>> Hey Silly question...
> >>>
> >>> How long have you had 27 million files?
> >>>
> >>> I mean can you correlate the number of files to the spate of OOMs?
> >>>
> >>> Even without problems... I'd say it would be a good idea to upgrade due
> > to the probability of a lot of code fixes...
> >>>
> >>> If you're running anything pre 1.x, going to 1.7 java wouldn't be a
> good
> > idea.  Having said that... outside of MapR, have any of the distros
> > certified themselves on 1.7 yet?
> >>>
> >>> On Dec 22, 2012, at 6:54 AM, Edward Capriolo 
> > wrote:
> >>>
> >>>> I will give this a go. I have actually gone into JMX and manually
> >>>> triggered GC; no memory is returned. So I assumed something was
> >>>> leaking.
> >>>>
> >>>> On Fri, Dec 21, 2012 at 11:59 PM, Adam Faris 
> > wrote:
> >>>>
> >>>>> I know this will sound odd, but try reducing your heap size.   We had
> > an
> >>>>> issue like this where GC kept falling behind and we either ran out of
> > heap
> >>>>> or would be in full gc.  By reducing heap, we were forcing concurrent
> > mark
> >>>>> sweep to occur and avoided both full GC and running out of heap space
> > as
> >>>>> the JVM would collect objects more frequently.
> >>>>>
> >>>>> On Dec 21, 2012, at 8:24 PM, Edward Capriolo 
> >>>>> wrote:
> >>>>>
> >>>>>> I have an old hadoop 0.20.2 cluster. Have not had any issues for a
> > while.
> >>>>>> (which is why I never bothered an upgrade)
> >>>>>>
> >>>>>> Suddenly it OOMed last week. Now the OOMs happen periodically. We
> > have a
> >>>>>> fairly large NameNode heap Xmx 17GB. It is a fairly large FS about
> >>>>>> 27,000,000 files.
> >>>>>>
> >>>>>> So the strangest thing is that every 1 and 1/2 hour the NN memory
> > usage
> >>>>>> increases until the heap is full.
> >>>>>>
> >>>>>> http://imagebin.org/240287
> >>>>>>
> >>>>>> We tried failing over the NN to another machine. We changed the Java
> >>>>> version
> >>>>>> from 1.6_23 -> 1.7.0.
> >>>>>>
> >>>>>> I have set the NameNode logs to debug and ALL and I have done the
> same
> >>>>> with
> >>>>>> the data nodes.
> >>>>>> Secondary NN is running and shipping edits and making new images.
> >>>>>>
> >>>>>> I am thinking something has corrupted the NN MetaData and after
> enough
> >>>>> time
> >>>>>> it becomes a time bomb, but this is just a total shot in the dark.
> > Does
> >>>>>> anyone have any interesting troubleshooting ideas?
> >>
>


Re: NN Memory Jumps every 1 1/2 hours

2012-12-22 Thread Edward Capriolo
Newer 1.6 releases are getting close to 1.7, so I am not going to fear a number and
fight the future.

I have been at around 27 million files for a while, and been as high as 30
million; I do not think that is related.

I do not think it is related to checkpoints but I am considering
raising/lowering the checkpoint triggers.

On Saturday, December 22, 2012, Joep Rottinghuis 
wrote:
> Do your OOMs correlate with the secondary checkpointing?
>
> Joep
>
> Sent from my iPhone
>
> On Dec 22, 2012, at 7:42 AM, Michael Segel 
wrote:
>
>> Hey Silly question...
>>
>> How long have you had 27 million files?
>>
>> I mean can you correlate the number of files to the spate of OOMs?
>>
>> Even without problems... I'd say it would be a good idea to upgrade due
to the probability of a lot of code fixes...
>>
>> If you're running anything pre 1.x, going to 1.7 java wouldn't be a good
idea.  Having said that... outside of MapR, have any of the distros
certified themselves on 1.7 yet?
>>
>> On Dec 22, 2012, at 6:54 AM, Edward Capriolo 
wrote:
>>
>>> I will give this a go. I have actually gone into JMX and manually
>>> triggered GC; no memory is returned. So I assumed something was leaking.
>>>
>>> On Fri, Dec 21, 2012 at 11:59 PM, Adam Faris 
wrote:
>>>
>>>> I know this will sound odd, but try reducing your heap size.   We had
an
>>>> issue like this where GC kept falling behind and we either ran out of
heap
>>>> or would be in full gc.  By reducing heap, we were forcing concurrent
mark
>>>> sweep to occur and avoided both full GC and running out of heap space
as
>>>> the JVM would collect objects more frequently.
>>>>
>>>> On Dec 21, 2012, at 8:24 PM, Edward Capriolo 
>>>> wrote:
>>>>
>>>>> I have an old hadoop 0.20.2 cluster. Have not had any issues for a
while.
>>>>> (which is why I never bothered an upgrade)
>>>>>
>>>>> Suddenly it OOMed last week. Now the OOMs happen periodically. We
have a
>>>>> fairly large NameNode heap Xmx 17GB. It is a fairly large FS about
>>>>> 27,000,000 files.
>>>>>
>>>>> So the strangest thing is that every 1 and 1/2 hour the NN memory
usage
>>>>> increases until the heap is full.
>>>>>
>>>>> http://imagebin.org/240287
>>>>>
>>>>> We tried failing over the NN to another machine. We changed the Java
>>>> version
>>>>> from 1.6_23 -> 1.7.0.
>>>>>
>>>>> I have set the NameNode logs to debug and ALL and I have done the same
>>>> with
>>>>> the data nodes.
>>>>> Secondary NN is running and shipping edits and making new images.
>>>>>
>>>>> I am thinking something has corrupted the NN MetaData and after enough
>>>> time
>>>>> it becomes a time bomb, but this is just a total shot in the dark.
Does
>>>>> anyone have any interesting troubleshooting ideas?
>>
>


Re: NN Memory Jumps every 1 1/2 hours

2012-12-22 Thread Edward Capriolo
I will give this a go. I have actually gone into JMX and manually triggered
GC; no memory is returned. So I assumed something was leaking.

On Fri, Dec 21, 2012 at 11:59 PM, Adam Faris  wrote:

> I know this will sound odd, but try reducing your heap size.   We had an
> issue like this where GC kept falling behind and we either ran out of heap
> or would be in full gc.  By reducing heap, we were forcing concurrent mark
> sweep to occur and avoided both full GC and running out of heap space as
> the JVM would collect objects more frequently.
>
> On Dec 21, 2012, at 8:24 PM, Edward Capriolo 
> wrote:
>
> > I have an old hadoop 0.20.2 cluster. Have not had any issues for a while.
> > (which is why I never bothered an upgrade)
> >
> > Suddenly it OOMed last week. Now the OOMs happen periodically. We have a
> > fairly large NameNode heap Xmx 17GB. It is a fairly large FS about
> > 27,000,000 files.
> >
> > So the strangest thing is that every 1 and 1/2 hour the NN memory usage
> > increases until the heap is full.
> >
> > http://imagebin.org/240287
> >
> > We tried failing over the NN to another machine. We changed the Java
> version
> > from 1.6_23 -> 1.7.0.
> >
> > I have set the NameNode logs to debug and ALL and I have done the same
> with
> > the data nodes.
> > Secondary NN is running and shipping edits and making new images.
> >
> > I am thinking something has corrupted the NN MetaData and after enough
> time
> > it becomes a time bomb, but this is just a total shot in the dark. Does
> > anyone have any interesting troubleshooting ideas?
>
>


NN Memory Jumps every 1 1/2 hours

2012-12-21 Thread Edward Capriolo
I have an old hadoop 0.20.2 cluster. Have not had any issues for a while.
(which is why I never bothered an upgrade)

Suddenly it OOMed last week. Now the OOMs happen periodically. We have a
fairly large NameNode heap Xmx 17GB. It is a fairly large FS about
27,000,000 files.

So the strangest thing is that every 1 and 1/2 hour the NN memory usage
increases until the heap is full.

http://imagebin.org/240287

We tried failing over the NN to another machine. We changed the Java version
from 1.6_23 -> 1.7.0.

I have set the NameNode logs to debug and ALL and I have done the same with
the data nodes.
Secondary NN is running and shipping edits and making new images.

I am thinking something has corrupted the NN MetaData and after enough time
it becomes a time bomb, but this is just a total shot in the dark. Does
anyone have any interesting troubleshooting ideas?


Re: Regarding DataJoin contrib jar for 1.0.3

2012-07-25 Thread Edward Capriolo
DataJoin is an example. Most people doing joins use Hive or Pig rather
than code them up themselves.


On Tue, Jul 24, 2012 at 5:19 PM, Abhinav M Kulkarni
 wrote:
> Hi,
>
> Do we not have any info on this? Join must be such a common scenario for
> most of the people out on this list.
>
> Thanks.
>
> On 07/22/2012 10:22 PM, Abhinav M Kulkarni wrote:
>>
>> Hi,
>>
>> I was planning to use DataJoin jar (located in
>> $HADOOP_INSTALL/contrib/datajoin) for reduce-side join (version 1.0.3).
>>
>> It looks like DataJoinMapperBase implements the Mapper interface (from the
>> old API) rather than extending the Mapper class (from the new API). This is
>> a problem because I cannot write Map classes that extend DataJoinMapperBase.
>>
>> Do we have newer version of data join jar?
>>
>> Thanks.
>
>


Re: hadoop FileSystem.close()

2012-07-24 Thread Edward Capriolo
In all my experience, you let FileSystem instances close themselves.
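A minimal sketch of why that is usually the safer habit: FileSystem.get() and
Path.getFileSystem() hand back a cached instance that is shared across the JVM
for the same scheme/authority/user, so closing "your" handle closes it for
every other caller too (the path here is just for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path p = new Path("/tmp/example");   // hypothetical path

        FileSystem fs1 = p.getFileSystem(conf);
        FileSystem fs2 = p.getFileSystem(conf);

        // Both calls resolve to the same cached object.
        System.out.println("same instance: " + (fs1 == fs2));  // prints true

        fs1.close();
        // fs2 now refers to a closed FileSystem as well; further use fails,
        // which matches the "confusing read and write errors" described below.
    }
}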

On Tue, Jul 24, 2012 at 10:34 AM, Koert Kuipers  wrote:
> Since FileSystem is a Closeable i would expect code using it to be like
> this:
>
> FileSystem fs = path.getFileSystem(conf);
> try {
> // do something with fs, such as read from the path
> } finally {
> fs.close()
> }
>
> However i have repeatedly gotten into trouble with this approach. In one
> situation it turned out that when i closed a FileSystem other operations
> that were using their own FileSystems (pointing to the same real-world HDFS
> filesystem) also saw their FileSystems closed, leading to very confusing
> read and write errors. This led me to believe that FileSystem should never
> be closed since it seemed to act like some sort of Singleton. However now
> was just looking at some code (Hoop server, to be precise) and noticed that
> FileSystems were indeed closed, but they were always threadlocal. Is this
> the right approach?
>
> And if FileSystem is threadlocal, is this safe (assuming fs1 and fs2 could
> point to the same real-world filesystem)?
>
> FileSystem fs1 = path.getFileSystem(conf);
> try {
> FileSystem fs2 = path.getFileSystem(conf);
> try {
> // do something with fs2, such as read from the path
> } finally {
>   fs2.close()
> }
> // do something with fs1, such as read from the path (note, fs2 is
> closed here, and i wouldn't be surprised if fs1 by now is also closed given
> my experience)
> } finally {
>   fs1.close()
> }


Re: Avro vs Protocol Buffer

2012-07-20 Thread Edward Capriolo
We just open sourced our protobuf support for Hive. We built it out
because in our line of work protobuf is very common and it gave us the
ability to log protobufs directly to files and then query them.

https://github.com/edwardcapriolo/hive-protobuf

I did not do any heavy benchmarking vs avro. However I did a few
things, sorry that I do not have exact numbers here.

A compressed SequenceFile of Text versus a SequenceFile of protobufs
is maybe 5-10 percent smaller depending on the data. That is pretty
good compression, so space-wise you are not hurting there.

Speed-wise I have to do some more analysis. Our input format is doing
reflection, so that will have its cost (although we tried to cache
things where possible); protobuf has some DynamicObject components
which I need to explore to possibly avoid reflection. Also, you have to
consider that protobufs do more (than TextInputFormat), like validating
data, so if you are comparing raw speed you have to watch out for
apples-to-oranges type stuff.

I never put our ProtoBuf format head to head with the AvroFormat.
Generally I hate those type of benchmarks but I would be curious to
know.

Overall, if you have no global serialization format (company wide) you
have to look at what tools you have and what they support. Aka Hive
has Avro and protobuf, but maybe Pig only has one or the other. Are
you using Sqoop, and can it output files in the format that you want?
Are you using a language like Ruby, and what support do you have there?

In my mind speed is important but compatibility is more so. For
example, even if reading Avro were 2 times slower than reading Thrift
(which it is not), your jobs might be doing some very complex logic with a
long shuffle, sort, and reduce phase. Then the performance of physically
reading the file is not as important as it may seem.

On Thu, Jul 19, 2012 at 12:34 PM, Harsh J  wrote:
> +1 to what Bruno's pointed you at. I personally like Avro for its data
> files (schema's stored on file, and a good, splittable container for
> typed data records). I think speed for serde is on-par with Thrift, if
> not faster today. Thrift offers no optimized data container format
> AFAIK.
>
> On Thu, Jul 19, 2012 at 1:57 PM, Bruno Freudensprung
>  wrote:
>> Once new results will be available, you might be interested in:
>> https://github.com/eishay/jvm-serializers/wiki/
>> https://github.com/eishay/jvm-serializers/wiki/Staging-Results
>>
>> My2cts,
>>
>> Bruno.
>>
>> Le 16/07/2012 22:49, Mike S a écrit :
>>
>>> Strictly from speed and performance perspective, is Avro as fast as
>>> protocol buffer?
>>>
>>
>
>
>
> --
> Harsh J


Re: Group mismatches?

2012-07-16 Thread Edward Capriolo
In all the places I have checked, I have found it to be only the primary
group, not all of the user's supplemental groups.

On Mon, Jul 16, 2012 at 3:05 PM, Clay B.  wrote:
> Hi all,
>
> I have a Hadoop cluster which uses Samba to map an Active Directory domain
> to my CentOS 5.7 Hadoop cluster. However, I notice a strange mismatch with
> groups. Does anyone have any debugging advice, or how to refresh the DFS
> groups mapping? If not, should I file a bug at
> https://issues.apache.org/jira/browse/HADOOP?
>
> # I see the following error:
> [clayb@hamster ~]$ hadoop fs -ls /projects/foobarcommander
> log4j:ERROR Could not find value for key log4j.appender.NullAppender
> log4j:ERROR Could not instantiate appender named "NullAppender".
> ls: could not get get listing for
> 'hdfs://hamster:9000/projects/foobarcommander' :
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=clayb, access=READ_EXECUTE,
> inode="/projects/foobarcommander":hadmin:foobarcommander:drwxrwx---
>
> # I verify group membership -- look a mismatch!
> [clayb@hamster ~]$ which groups
> /usr/bin/groups
> [clayb@hamster ~]$ groups
> foobarcommander xxx_rec_eng domain users all all_north america batchlogon
> xxx-s xxx03-s xxx1-admins xxx-emr-users xxx-emr-admins xxx1-users
> BUILTIN\users
> [clayb@hamster ~]$ hadoop dfsgroups
> log4j:ERROR Could not find value for key log4j.appender.NullAppender
> log4j:ERROR Could not instantiate appender named "NullAppender".
> clayb : domain users xxx_rec_eng xxx-emr-users all xxx-emr-admins batchlogon
> all_north america xxx1-users xxx-s xxx03-s xxx1-admins BUILTIN\users
>
> Notice, in particular the foobarcommander group is only shown for my
> /usr/bin/groups output. It looks like the following from the HDFS
> Permissions Guide[1] is not correct, in my case:
> "The group list is the equivalent of `bash -c groups`."
>
> # I have tried the following to no useful effect:
> [admin@hamster ~]$ hadoop dfsadmin -refreshUserToGroupsMappings
> log4j:ERROR Could not find value for key log4j.appender.NullAppender
> log4j:ERROR Could not instantiate appender named "NullAppender".
>
> # I do, however, see other users with the foobarcommander group, so the
> group should be "visible" to Hadoop:
> [clayb@hamster ~]$ hadoop dfsgroups pat
> log4j:ERROR Could not find value for key log4j.appender.NullAppender
> log4j:ERROR Could not instantiate appender named "NullAppender".
> pat : domain users all_north america all_san diego all foobarcommander
> BUILTIN\users
> # And 'hadoop mrgroups' (like dfsgroups) returns the same bad data, for me:
> [clayb@hamster ~]$ hadoop mrgroups
> log4j:ERROR Could not find value for key log4j.appender.NullAppender
> log4j:ERROR Could not instantiate appender named "NullAppender".
> clayb : domain users xxx_rec_eng xxx-emr-users all xxx-emr-admins batchlogon
> all_north america xxx1-users xxx-s xxx03-s xxx1-admins BUILTIN\users
>
> # And I see that the system is getting the right data via getent(1):
> [clayb@hamster ~]$ getent group foobarcommander
> foobarcommander:*:16777316:pat,user1,user2,user3,clayb,user4,user5,user6,user7,user8,user9,user10,user12,user13,user14,user15
>
> # I am using Cloudera's CDH3u4 Hadoop:
> [clayb@hamster ~]$ hadoop version
> Hadoop 0.20.2-cdh3u4
> Subversion file:///data/1/tmp/topdir/BUILD/hadoop-0.20.2-cdh3u4 -r
> 214dd731e3bdb687cb55988d3f47dd9e248c5690
> Compiled by root on Mon May  7 14:03:02 PDT 2012
> From source with checksum a60c9795e41a3248b212344fb131c12c
>
> I also do not see any obviously useful errors in my namenode logs.
>
> -Clay
>
> [1]:
> http://hadoop.apache.org/common/docs/r0.20.2/hdfs_permissions_guide.html#User+Identity
>


Re: stuck in safe mode after restarting dfs after found dead node

2012-07-14 Thread Edward Capriolo
If the files are gone forever you should run:

hadoop fsck -delete /

to acknowledge that they have moved on from existence. Otherwise, things
that attempt to read these files will, to put it in a technical way,
BARF.
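Before deleting, it can be worth confirming which files are actually
affected; a sketch of the kind of commands that help (the path is a
placeholder):

hadoop fsck / -files -blocks -locations    # full report; look for files with missing blocks
hadoop fsck /some/path                     # narrow the check to one directory
hadoop fsck -delete /                      # then drop the unrecoverable files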

On Fri, Jul 13, 2012 at 12:22 PM, Juan Pino  wrote:
> Thank you for your reply. I ran that command before and it works fine but
> hadoop fs -ls diplays the list of files in the user's directory but then
> hangs for quite a while (~ 10 minutes) before
> handing the command line prompt back, then if I rerun the same command
> there is no problem. That is why I would like to be able to leave safe mode
> automatically (at least I think it's related).
> Also, in the hdfs web page, clicking on the Live Nodes or Dead Nodes links
> hangs forever but I am able to browse the file
> system without any problem with the browser.
> There is no error in the logs.
> Please let me know what sort of details I can provide to help resolve this
> issue.
>
> Best,
>
> Juan
>
> On Fri, Jul 13, 2012 at 4:10 PM, Edward Capriolo wrote:
>
>> If the datanode is not coming back you have to explicitly tell hadoop
>> to leave safemode.
>>
>> http://hadoop.apache.org/common/docs/r0.17.2/hdfs_user_guide.html#Safemode
>>
>> hadoop dfsadmin -safemode leave
>>
>>
>> On Fri, Jul 13, 2012 at 9:35 AM, Juan Pino 
>> wrote:
>> > Hi,
>> >
>> > I can't get HDFS to leave safe mode automatically. Here is what I did:
>> >
>> > -- there was a dead node
>> > -- I stopped dfs
>> > -- I restarted dfs
>> > -- Safe mode wouldn't leave automatically
>> >
>> > I am using hadoop-1.0.2
>> >
>> > Here are the logs:
>> >
>> > end of hadoop-hadoop-namenode.log (attached):
>> >
>> > 2012-07-13 13:22:29,372 INFO org.apache.hadoop.hdfs.StateChange: STATE*
>> Safe
>> > mode ON.
>> > The ratio of reported blocks 0.9795 has not reached the threshold 0.9990.
>> > Safe mode will be turned off automatically.
>> > 2012-07-13 13:22:29,375 INFO org.apache.hadoop.hdfs.StateChange: STATE*
>> Safe
>> > mode extension entered.
>> > The ratio of reported blocks 0.9990 has reached the threshold 0.9990.
>> Safe
>> > mode will be turned off automatically in 29 seconds.
>> > 2012-07-13 13:22:29,375 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK*
>> > NameSystem.processReport: from , blocks: 3128, processing time: 4 msecs
>> > 2012-07-13 13:31:29,201 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
>> > NameSystem.processReport: discarded non-initial block report from because
>> > namenode still in startup phase
>> >
>> > Any help would be greatly appreciated.
>> >
>> > Best,
>> >
>> > Juan
>> >
>>


Re: stuck in safe mode after restarting dfs after found dead node

2012-07-13 Thread Edward Capriolo
If the datanode is not coming back you have to explicitly tell hadoop
to leave safemode.

http://hadoop.apache.org/common/docs/r0.17.2/hdfs_user_guide.html#Safemode

hadoop dfsadmin -safemode leave
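A couple of related commands that are useful before forcing the issue (output
formats vary a little between versions):

hadoop dfsadmin -safemode get    # prints whether safe mode is ON or OFF
hadoop dfsadmin -report          # per-datanode view: is the dead node really gone?
hadoop fsck /                    # shows under-replicated and missing blocks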


On Fri, Jul 13, 2012 at 9:35 AM, Juan Pino  wrote:
> Hi,
>
> I can't get HDFS to leave safe mode automatically. Here is what I did:
>
> -- there was a dead node
> -- I stopped dfs
> -- I restarted dfs
> -- Safe mode wouldn't leave automatically
>
> I am using hadoop-1.0.2
>
> Here are the logs:
>
> end of hadoop-hadoop-namenode.log (attached):
>
> 2012-07-13 13:22:29,372 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe
> mode ON.
> The ratio of reported blocks 0.9795 has not reached the threshold 0.9990.
> Safe mode will be turned off automatically.
> 2012-07-13 13:22:29,375 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe
> mode extension entered.
> The ratio of reported blocks 0.9990 has reached the threshold 0.9990. Safe
> mode will be turned off automatically in 29 seconds.
> 2012-07-13 13:22:29,375 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK*
> NameSystem.processReport: from , blocks: 3128, processing time: 4 msecs
> 2012-07-13 13:31:29,201 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.processReport: discarded non-initial block report from because
> namenode still in startup phase
>
> Any help would be greatly appreciated.
>
> Best,
>
> Juan
>


Re: Setting number of mappers according to number of TextInput lines

2012-06-16 Thread Edward Capriolo
No. The number of lines is not known at split-planning time; all that is
known is the size of the blocks. You want to look at mapred.max.split.size.
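A sketch of both knobs using the old mapred API (the class and property names
are the stock 0.20/1.x ones; the surrounding job setup is hypothetical):

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class SmallInputJob {
    public static void main(String[] args) {
        JobConf job = new JobConf(SmallInputJob.class);

        // Option 1: one split per N input lines, regardless of block size.
        job.setInputFormat(NLineInputFormat.class);
        job.setInt("mapred.line.input.format.linespermap", 10);

        // Option 2: cap the split size so even a tiny file yields several maps
        // (only honored by input formats that respect this property).
        job.setLong("mapred.max.split.size", 4 * 1024);  // 4 KB, purely illustrative

        // ... set mapper class, input/output paths, then submit with JobClient.runJob(job)
    }
}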

On Sat, Jun 16, 2012 at 5:31 AM, Ondřej Klimpera  wrote:
> I tried this approach, but the job is not distributed among 10 mapper nodes.
> Seems Hadoop ignores this property :(
>
> My first thought is, that the small file size is the problem and Hadoop
> doesn't care about it's splitting in proper way.
>
> Thanks any ideas.
>
>
> On 06/16/2012 11:27 AM, Bejoy KS wrote:
>>
>> Hi Ondrej
>>
>> You can use NLineInputFormat with n set to 10.
>>
>> --Original Message--
>> From: Ondřej Klimpera
>> To: common-user@hadoop.apache.org
>> ReplyTo: common-user@hadoop.apache.org
>> Subject: Setting number of mappers according to number of TextInput lines
>> Sent: Jun 16, 2012 14:31
>>
>> Hello,
>>
>> I have very small input size (kB), but processing to produce some output
>> takes several minutes. Is there a way how to say, file has 100 lines, i
>> need 10 mappers, where each mapper node has to process 10 lines of input
>> file?
>>
>> Thanks for advice.
>> Ondrej Klimpera
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>


Re: Ideal file size

2012-06-06 Thread Edward Capriolo
It does not matter what the file size is, because the file is
split into blocks, which are what the NN tracks.

For larger deployments you can go with a large block size like 256MB
or even 512MB. Generally the bigger the file the better; split
calculation is very input-format dependent, however.
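A sketch of rolling a SequenceFile with a larger block size at write time; the
path and the 256MB value are illustrative, and dfs.block.size set on the
client Configuration applies to the files created through it:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class RolledWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Ask for 256MB blocks for files written by this client.
        conf.setLong("dfs.block.size", 256L * 1024 * 1024);

        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/data/incoming/rolled-00001.seq");  // hypothetical path

        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, out, Text.class, Text.class);
        writer.append(new Text("key"), new Text("value"));
        writer.close();
    }
}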

On Wed, Jun 6, 2012 at 10:00 AM, Mohit Anchlia  wrote:
> We have continuous flow of data into the sequence file. I am wondering what
> would be the ideal file size before file gets rolled over. I know too many
> small files are not good but could someone tell me what would be the ideal
> size such that it doesn't overload NameNode.


Re: Hadoop with Sharded MySql

2012-05-31 Thread Edward Capriolo
Maybe you can do some VIEWs or unions or MERGE tables on the MySQL
side to avoid launching so many sqoop jobs.

On Thu, May 31, 2012 at 6:02 PM, Srinivas Surasani
 wrote:
> All,
>
> We are trying to implement sqoop in our environment which has 30 mysql
> sharded databases and all the databases have around 30 databases with
> 150 tables in each of the database which are all sharded (horizontally
> sharded that means the data is divided into all the tables in mysql).
>
> The problem is that we have a total of around 70K tables which needed
> to be pulled from mysql into hdfs.
>
> So, my question is that generating 70K sqoop commands and running them
> parallel is feasible or not?
>
> Also, doing incremental updates is going to be like invoking 70K
> another sqoop jobs which intern kick of map-reduce jobs.
>
> The main problem is monitoring and managing this huge number of jobs?
>
> Can anyone suggest me the best way of doing it or is sqoop a good
> candidate for this type of scenario?
>
> Currently the same process is done by generating tsv files  mysql
> server and dumped into staging server and  from there we'll generate
> hdfs put statements..
>
> Appreciate your suggestions !!!
>
>
> Thanks,
> Srinivas Surasani


Re: Hadoop on physical Machines compared to Amazon Ec2 / virtual machines

2012-05-31 Thread Edward Capriolo
We were actually in an Amazon/host-it-yourself debate with someone,
which prompted us to do some calculations:

http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/myth_busters_ops_editition_is

We calculated the cost for storage alone of 300 TB on ec2 as 585K a month!

The cloud people hate hearing facts like this, with staggering $
values. They also do not like hearing how a $35-a-month
physical server at Joe's datacenter is much better than an equivalent
cloud machine.

http://blog.carlmercier.com/2012/01/05/ec2-is-basically-one-big-ripoff/

When you bring up these facts, the go-to move is to go buzzword, with
phrases like "cost of system admin", "elastic", and "up-front initial costs".

I will say that Amazon's EMR service is pretty cool and there is
something to it, but the cost of storage and good performance is off
the scale for me.


On 5/31/12, Mathias Herberts  wrote:
> Correct me if I'm wrong, but the sole cost of storing 300TB on AWS
> will account for roughly 300,000 GB * 0.10 USD * 12 = 360,000 USD per annum.
>
> We operate a cluster with 112 nodes offering 800+ TB of raw HDFS
> capacity and the CAPEX was less than 700k USD, if you ask me there is
> no comparison possible if you have the datacenter space to host your
> machines.
>
> Do you really need 10Gbe? We're quite happy with 1Gbe will no
> over-subscription.
>
> Mathias.
>


Re: Problems with block compression using native codecs (Snappy, LZO) and MapFile.Reader.get()

2012-05-22 Thread Edward Capriolo
If you are getting a SIGSEGV it never hurts to try a more recent JVM.
21 has many bug fixes at this point.

On Tue, May 22, 2012 at 11:45 AM, Jason B  wrote:
> JIRA entry created:
>
> https://issues.apache.org/jira/browse/HADOOP-8423
>
>
> On 5/21/12, Jason B  wrote:
>> Sorry about using attachment. The code is below for the reference.
>> (I will also file a jira as you suggesting)
>>
>> package codectest;
>>
>> import com.hadoop.compression.lzo.LzoCodec;
>> import java.io.IOException;
>> import java.util.Formatter;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.io.MapFile;
>> import org.apache.hadoop.io.SequenceFile.CompressionType;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.io.compress.CompressionCodec;
>> import org.apache.hadoop.io.compress.DefaultCodec;
>> import org.apache.hadoop.io.compress.SnappyCodec;
>> import org.apache.hadoop.util.Tool;
>> import org.apache.hadoop.util.ToolRunner;
>>
>> public class MapFileCodecTest implements Tool {
>>     private Configuration conf = new Configuration();
>>
>>     private void createMapFile(Configuration conf, FileSystem fs, String path,
>>             CompressionCodec codec, CompressionType type, int records)
>>             throws IOException {
>>         MapFile.Writer writer = new MapFile.Writer(conf, fs, path,
>> Text.class, Text.class,
>>                 type, codec, null);
>>         Text key = new Text();
>>         for (int j = 0; j < records; j++) {
>>             StringBuilder sb = new StringBuilder();
>>             Formatter formatter = new Formatter(sb);
>>             formatter.format("%03d", j);
>>             key.set(sb.toString());
>>             writer.append(key, key);
>>         }
>>         writer.close();
>>     }
>>
>>     private void testCodec(Configuration conf, Class<? extends CompressionCodec> clazz,
>>             CompressionType type, int records) throws IOException {
>>         FileSystem fs = FileSystem.getLocal(conf);
>>         try {
>>             System.out.println("Creating MapFiles with " + records  +
>>                     " records using codec " + clazz.getSimpleName());
>>             String path = clazz.getSimpleName() + records;
>>             createMapFile(conf, fs, path, clazz.newInstance(), type,
>> records);
>>             MapFile.Reader reader = new MapFile.Reader(fs, path, conf);
>>             Text key1 = new Text("002");
>>             if (reader.get(key1, new Text()) != null) {
>>                 System.out.println("1st key found");
>>             }
>>             Text key2 = new Text("004");
>>             if (reader.get(key2, new Text()) != null) {
>>                 System.out.println("2nd key found");
>>             }
>>         } catch (Throwable ex) {
>>             ex.printStackTrace();
>>         }
>>     }
>>
>>     @Override
>>     public int run(String[] strings) throws Exception {
>>         System.out.println("Using native library " +
>> System.getProperty("java.library.path"));
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 100);
>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 100);
>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 100);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 10);
>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 10);
>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 10);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 100);
>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 100);
>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 100);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 10);
>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 10);
>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 10);
>>         return 0;
>>     }
>>
>>     @Override
>>     public void setConf(Configuration c) {
>>         this.conf = c;
>>     }
>>
>>     @Override
>>     public Configuration getConf() {
>>         return conf;
>>     }
>>
>>     public static void main(String[] args) throws Exception {
>>         ToolRunner.run(new MapFileCodecTest(), args);
>>     }
>>
>> }
>>
>>
>> On 5/21/12, Todd Lipcon  wrote:
>>> Hi Jason,
>>>
>>> Sounds like a bug. Unfortunately the mailing list strips attachments.
>>>
>>> Can you file a jira in the HADOOP project, and attach your test case
>>> there?
>>>
>>> Thanks
>>> Todd
>>>
>>> On Mon, May 21, 2012 at 3:57 PM, Jason B  wrote:
 I am using Cloudera distribution cdh3u1.

 When trying to check native codecs for better decompression
 performance such as Snappy or LZO, I ran into issues with random
 access using MapFile.Reader.get(key, value) method.
 First call of MapFile.Reader.get() works but a second call fails.

 Also  I am getting different exceptions depending on number of entries
 in a map 

Re: Splunk + Hadoop

2012-05-22 Thread Edward Capriolo
So a while back there was an article:
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data

I recently did my own take on full text searching your logs with
solandra, though I have prototyped using solr inside datastax
enterprise as well.

http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/more_taco_bell_programming_with

Splunk has a graphical front end with a good deal of sophistication,
but I am quite happy just being able to Solr-search everything and
provide my own front ends built on Solr.

On Mon, May 21, 2012 at 5:13 PM, Abhishek Pratap Singh
 wrote:
> I have used Hadoop and Splunk both. Can you please let me know what is your
> requirement?
> Real time processing with hadoop depends upon What defines "Real time" in
> particular scenario. Based on requirement, Real time (near real time) can
> be achieved.
>
> ~Abhishek
>
> On Fri, May 18, 2012 at 3:58 PM, Russell Jurney 
> wrote:
>
>> Because that isn't Cube.
>>
>> Russell Jurney
>> twitter.com/rjurney
>> russell.jur...@gmail.com
>> datasyndrome.com
>>
>> On May 18, 2012, at 2:01 PM, Ravi Shankar Nair
>>  wrote:
>>
>> > Why not Hbase with Hadoop?
>> > It's a best bet.
>> > Rgds, Ravi
>> >
>> > Sent from my Beethoven
>> >
>> >
>> > On May 18, 2012, at 3:29 PM, Russell Jurney 
>> wrote:
>> >
>> >> I'm playing with using Hadoop and Pig to load MongoDB with data for
>> Cube to
>> >> consume. Cube  is a realtime
>> tool...
>> >> but we'll be replaying events from the past.  Does that count?  It is
>> nice
>> >> to batch backfill metrics into 'real-time' systems in bulk.
>> >>
>> >> On Fri, May 18, 2012 at 12:11 PM,  wrote:
>> >>
>> >>> Hi ,
>> >>>
>> >>> Has anyone used Hadoop and splunk, or any other real-time processing
>> tool
>> >>> over Hadoop?
>> >>>
>> >>> Regards,
>> >>> Shreya
>> >>>
>> >>>
>> >>>
>> >>> This e-mail and any files transmitted with it are for the sole use of
>> the
>> >>> intended recipient(s) and may contain confidential and privileged
>> >>> information. If you are not the intended recipient(s), please reply to
>> the
>> >>> sender and destroy all copies of the original message. Any unauthorized
>> >>> review, use, disclosure, dissemination, forwarding, printing or
>> copying of
>> >>> this email, and/or any action taken in reliance on the contents of this
>> >>> e-mail is strictly prohibited and may be unlawful.
>> >>>
>> >>
>> >> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com
>> datasyndrome.com
>>


Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3

2012-05-03 Thread Edward Capriolo
Honestly that is a hassle; going from 205 to cdh3u3 is probably more
of a cross-grade than an upgrade or downgrade. I would just stick it
out. But yes, like Michael said, two clusters on the same gear and
distcp. If you are using RF=3 you could also lower your replication to
RF=2 ('hadoop dfs -setrep -R 2 <path>') to clear headroom as you are
moving stuff.
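For the copy itself, the usual trick for moving data between
otherwise-incompatible Hadoop versions is to run distcp on the destination
(CDH3) cluster and read the source over HFTP; the hostnames, ports, and paths
below are placeholders:

hadoop distcp -i -log /tmp/distcp-logs \
    hftp://old-nn.example.com:50070/user/hive/warehouse \
    hdfs://new-nn.example.com:8020/user/hive/warehouse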


On Thu, May 3, 2012 at 7:25 AM, Michel Segel  wrote:
> Ok... When you get your new hardware...
>
> Set up one server as your new NN, JT, SN.
> Set up the others as a DN.
> (Cloudera CDH3u3)
>
> On your existing cluster...
> Remove your old log files, temp files on HDFS anything you don't need.
> This should give you some more space.
> Start copying some of the directories/files to the new cluster.
> As you gain space, decommission a node, rebalance, add node to new cluster...
>
> It's a slow process.
>
> Should I remind you to make sure you up your bandwidth setting, and to clean
> up the hdfs directories when you repurpose the nodes?
>
> Does this make sense?
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On May 3, 2012, at 5:46 AM, Austin Chungath  wrote:
>
>> Yeah I know :-)
>> and this is not a production cluster ;-) and yes there is more hardware
>> coming :-)
>>
>> On Thu, May 3, 2012 at 4:10 PM, Michel Segel 
>> wrote:
>>
>>> Well, you've kind of painted yourself in to a corner...
>>> Not sure why you didn't get a response from the Cloudera lists, but it's a
>>> generic question...
>>>
>>> 8 out of 10 TB. Are you talking effective storage or actual disks?
>>> And please tell me you've already ordered more hardware.. Right?
>>>
>>> And please tell me this isn't your production cluster...
>>>
>>> (Strong hint to Strata and Cloudea... You really want to accept my
>>> upcoming proposal talk... ;-)
>>>
>>>
>>> Sent from a remote device. Please excuse any typos...
>>>
>>> Mike Segel
>>>
>>> On May 3, 2012, at 5:25 AM, Austin Chungath  wrote:
>>>
 Yes. This was first posted on the cloudera mailing list. There were no
 responses.

 But this is not related to cloudera as such.

 cdh3 is based on apache hadoop 0.20 as the base. My data is in apache
 hadoop 0.20.205

 There is an upgrade namenode option when we are migrating to a higher
 version say from 0.20 to 0.20.205
 but here I am downgrading from 0.20.205 to 0.20 (cdh3)
 Is this possible?


 On Thu, May 3, 2012 at 3:25 PM, Prashant Kommireddi >>> wrote:

> Seems like a matter of upgrade. I am not a Cloudera user so would not
>>> know
> much, but you might find some help moving this to Cloudera mailing list.
>
> On Thu, May 3, 2012 at 2:51 AM, Austin Chungath 
> wrote:
>
>> There is only one cluster. I am not copying between clusters.
>>
>> Say I have a cluster running apache 0.20.205 with 10 TB storage
>>> capacity
>> and has about 8 TB of data.
>> Now how can I migrate the same cluster to use cdh3 and use that same 8
>>> TB
>> of data.
>>
>> I can't copy 8 TB of data using distcp because I have only 2 TB of free
>> space
>>
>>
>> On Thu, May 3, 2012 at 3:12 PM, Nitin Pawar 
>> wrote:
>>
>>> you can actually look at the distcp
>>>
>>> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>>>
>>> but this means that you have two different set of clusters available
>>> to
>> do
>>> the migration
>>>
>>> On Thu, May 3, 2012 at 12:51 PM, Austin Chungath 
>>> wrote:
>>>
 Thanks for the suggestions,
 My concerns are that I can't actually copyToLocal from the dfs
> because
>>> the
 data is huge.

 Say if my hadoop was 0.20 and I am upgrading to 0.20.205 I can do a
 namenode upgrade. I don't have to copy data out of dfs.

 But here I am having Apache hadoop 0.20.205 and I want to use CDH3
> now,
 which is based on 0.20
 Now it is actually a downgrade as 0.20.205's namenode info has to be
>> used
 by 0.20's namenode.

 Any idea how I can achieve what I am trying to do?

 Thanks.

 On Thu, May 3, 2012 at 12:23 PM, Nitin Pawar <
> nitinpawar...@gmail.com
> wrote:

> i can think of following options
>
> 1) write a simple get and put code which gets the data from DFS and
>>> loads
> it in dfs
> 2) see if the distcp  between both versions are compatible
> 3) this is what I had done (and my data was hardly few hundred GB)
> ..
 did a
> dfs -copyToLocal and then in the new grid did a copyFromLocal
>
> On Thu, May 3, 2012 at 11:41 AM, Austin Chungath <
> austi...@gmail.com
>>>
> wrote:
>
>> Hi,
>> I am migrating from Apache hadoop 0.20.205 to CDH3u3.
>> I don't want to lose the data that is in the HDFS of Apach

Re: hadoop.tmp.dir with multiple disks

2012-04-22 Thread Edward Capriolo
Since each hadoop task is isolated from the others, having more tmp
directories allows you to isolate that disk bandwidth as well. By
listing the disks you give more firepower to the shuffle-sort and
merge processes.
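A sketch of the relevant mapred-site.xml entry (the directory names are
placeholders; each entry should sit on a different physical disk):

<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>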

Edward

On Sun, Apr 22, 2012 at 10:02 AM, Jay Vyas  wrote:
> I don't understand why multiple disks would be particularly beneficial for
> a Map/Reduce job. would I/O for a map/reduce job be i/o *as well as CPU
> bound* ?   I would think that simply reading and parsing large files would
> still require dedicated CPU blocks. ?
>
> On Sun, Apr 22, 2012 at 3:14 AM, Harsh J  wrote:
>
>> You can use mapred.local.dir for this purpose. It accepts a list of
>> directories tasks may use, just like dfs.data.dir uses multiple disks
>> for block writes/reads.
>>
>> On Sun, Apr 22, 2012 at 12:50 PM, mete  wrote:
>> > Hello folks,
>> >
>> > I have a job that processes text files from hdfs on local fs (temp
>> > directory) and then copies those back to hdfs.
>> > I added another drive to each server to have better io performance, but
>> as
>> > far as i could see hadoop.tmp.dir will not benefit from multiple
>> disks,even
>> > if i setup two different folders on different disks. (dfs.data.dir works
>> > fine). As a result the disk with temp folder set is highy utilized, where
>> > the other one is a little bit idle.
>> > Does anyone have an idea on what to do? (i am using cdh3u3)
>> >
>> > Thanks in advance
>> > Mete
>>
>>
>>
>> --
>> Harsh J
>>
>
>
>
> --
> Jay Vyas
> MMSB/UCHC


Re: Feedback on real world production experience with Flume

2012-04-22 Thread Edward Capriolo
I think this is valid to talk about. For example, one does not need a
decentralized collector if they can just write logs directly to
files in a decentralized file system. In any case it was
not even a hard vendor pitch. It was someone describing how they
handle centralized logging. It stated facts and it was informative.

Let's face it: if fuse-mounting HDFS or directly soft-mounting NFS
performed well, many of the use cases for Flume- and Scribe-like
tools would be gone (not all, but many).

I never knew there was a rule against discussing alternative software on
a mailing list. It seems like a closed-minded thing. I also doubt the
ASF would back a rule like that. Are we not allowed to talk about EMR
or S3, or am I not even allowed to mention S3?

Can Flume run on EC2 and log to S3? (Oops, party foul, I guess I can't ask that.)

Edward

On Sun, Apr 22, 2012 at 12:59 AM, Alexander Lorenz
 wrote:
> no. That is the Flume Open Source Mailinglist. Not a vendor list.
>
> NFS logging has nothing to do with decentralized collectors like Flume, JMS 
> or Scribe.
>
> sent via my mobile device
>
> On Apr 22, 2012, at 12:23 AM, Edward Capriolo  wrote:
>
>> It seems pretty relevant. If you can directly log via NFS that is a
>> viable alternative.
>>
>> On Sat, Apr 21, 2012 at 11:42 AM, alo alt  wrote:
>>> We decided NO product and vendor advertising on apache mailing lists!
>>> I do not understand why you'll put that closed source stuff from your 
>>> employe in the room. It has nothing to do with flume or the use cases!
>>>
>>> --
>>> Alexander Lorenz
>>> http://mapredit.blogspot.com
>>>
>>> On Apr 21, 2012, at 4:06 PM, M. C. Srivas wrote:
>>>
>>>> Karl,
>>>>
>>>> since you did ask for alternatives,  people using MapR prefer to use the
>>>> NFS access to directly deposit data (or access it).  Works seamlessly from
>>>> all Linuxes, Solaris, Windows, AIX and a myriad of other legacy systems
>>>> without having to load any agents on those machines. And it is fully
>>>> automatic HA
>>>>
>>>> Since compression is built-in in MapR, the data gets compressed coming in
>>>> over NFS automatically without much fuss.
>>>>
>>>> Wrt to performance,  can get about 870 MB/s per node if you have 10GigE
>>>> attached (of course, with compression, the effective throughput will
>>>> surpass that based on how good the data can be squeezed).
>>>>
>>>>
>>>> On Fri, Apr 20, 2012 at 3:14 PM, Karl Hennig  wrote:
>>>>
>>>>> I am investigating automated methods of moving our data from the web tier
>>>>> into HDFS for processing, a process that's performed periodically.
>>>>>
>>>>> I am looking for feedback from anyone who has actually used Flume in a
>>>>> production setup (redundant, failover) successfully.  I understand it is
>>>>> now being largely rearchitected during its incubation as Apache Flume-NG,
>>>>> so I don't have full confidence in the old, stable releases.
>>>>>
>>>>> The other option would be to write our own tools.  What methods are you
>>>>> using for these kinds of tasks?  Did you write your own or does Flume (or
>>>>> something else) work for you?
>>>>>
>>>>> I'm also on the Flume mailing list, but I wanted to ask these questions
>>>>> here because I'm interested in Flume _and_ alternatives.
>>>>>
>>>>> Thank you!
>>>>>
>>>>>
>>>


Re: Feedback on real world production experience with Flume

2012-04-21 Thread Edward Capriolo
It seems pretty relevant. If you can directly log via NFS that is a
viable alternative.

On Sat, Apr 21, 2012 at 11:42 AM, alo alt  wrote:
> We decided NO product and vendor advertising on apache mailing lists!
> I do not understand why you'll put that closed source stuff from your employe 
> in the room. It has nothing to do with flume or the use cases!
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
>
> On Apr 21, 2012, at 4:06 PM, M. C. Srivas wrote:
>
>> Karl,
>>
>> since you did ask for alternatives,  people using MapR prefer to use the
>> NFS access to directly deposit data (or access it).  Works seamlessly from
>> all Linuxes, Solaris, Windows, AIX and a myriad of other legacy systems
>> without having to load any agents on those machines. And it is fully
>> automatic HA
>>
>> Since compression is built-in in MapR, the data gets compressed coming in
>> over NFS automatically without much fuss.
>>
>> Wrt to performance,  can get about 870 MB/s per node if you have 10GigE
>> attached (of course, with compression, the effective throughput will
>> surpass that based on how good the data can be squeezed).
>>
>>
>> On Fri, Apr 20, 2012 at 3:14 PM, Karl Hennig  wrote:
>>
>>> I am investigating automated methods of moving our data from the web tier
>>> into HDFS for processing, a process that's performed periodically.
>>>
>>> I am looking for feedback from anyone who has actually used Flume in a
>>> production setup (redundant, failover) successfully.  I understand it is
>>> now being largely rearchitected during its incubation as Apache Flume-NG,
>>> so I don't have full confidence in the old, stable releases.
>>>
>>> The other option would be to write our own tools.  What methods are you
>>> using for these kinds of tasks?  Did you write your own or does Flume (or
>>> something else) work for you?
>>>
>>> I'm also on the Flume mailing list, but I wanted to ask these questions
>>> here because I'm interested in Flume _and_ alternatives.
>>>
>>> Thank you!
>>>
>>>
>


Re: Multiple data centre in Hadoop

2012-04-19 Thread Edward Capriolo
Hive is beginning to implement Region support, where one metastore will
manage multiple filesystems and jobtrackers. When a query creates a
table it will then be copied to one or more datacenters. In addition,
the query planner will intelligently attempt to run queries only in
regions where all the tables exist.

While waiting for these awesome features I am doing a fair amount of
distcp work from groovy scripts.

Edward

On Thu, Apr 19, 2012 at 5:33 PM, Robert Evans  wrote:
> If you want to start an open source project for this I am sure that there are 
> others with the same problem that might be very wiling to help out. :)
>
> --Bobby Evans
>
> On 4/19/12 4:31 PM, "Michael Segel"  wrote:
>
> I don't know of any open source solution in doing this...
> And yeah its something one can't talk about  ;-)
>
>
> On Apr 19, 2012, at 4:28 PM, Robert Evans wrote:
>
>> Where I work  we have done some things like this, but none of them are open 
>> source, and I have not really been directly involved with the details of it. 
>>  I can guess about what it would take, but that is all it would be at this 
>> point.
>>
>> --Bobby
>>
>>
>> On 4/17/12 5:46 PM, "Abhishek Pratap Singh"  wrote:
>>
>> Thanks bobby, I m looking for something like this. Now the question is
>> what is the best strategy to do Hot/Hot or Hot/Warm.
>> I need to consider the CPU and Network bandwidth, also needs to decide from
>> which layer this replication should start.
>>
>> Regards,
>> Abhishek
>>
>> On Mon, Apr 16, 2012 at 7:08 AM, Robert Evans  wrote:
>>
>>> Hi Abhishek,
>>>
>>> Manu is correct about High Availability within a single colo.  I realize
>>> that in some cases you have to have fail over between colos.  I am not
>>> aware of any turn key solution for things like that, but generally what you
>>> want to do is to run two clusters, one in each colo, either hot/hot or
>>> hot/warm, and I have seen both depending on how quickly you need to fail
>>> over.  In hot/hot the input data is replicated to both clusters and the
>>> same software is run on both.  In this case though you have to be fairly
>>> sure that your processing is deterministic, or the results could be
>>> slightly different (i.e. No generating if random ids).  In hot/warm the
>>> data is replicated from one colo to the other at defined checkpoints.  The
>>> data is only processed on one of the grids, but if that colo goes down the
>>> other one can take up the processing from where ever the last checkpoint
>>> was.
>>>
>>> I hope that helps.
>>>
>>> --Bobby
>>>
>>> On 4/12/12 5:07 AM, "Manu S"  wrote:
>>>
>>> Hi Abhishek,
>>>
>>> 1. Use multiple directories for *dfs.name.dir* & *dfs.data.dir* etc
>>> * Recommendation: write to *two local directories on different
>>> physical volumes*, and to an *NFS-mounted* directory
>>> - Data will be preserved even in the event of a total failure of the
>>> NameNode machines
>>> * Recommendation: *soft-mount the NFS* directory
>>> - If the NFS mount goes offline, this will not cause the NameNode
>>> to fail
>>>
>>> 2. *Rack awareness*
>>>
>>> https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf
>>>
>>> On Thu, Apr 12, 2012 at 2:18 AM, Abhishek Pratap Singh
>>> wrote:
>>>
 Thanks Robert.
 Is there a best practice or design than can address the High Availability
 to certain extent?

 ~Abhishek

 On Wed, Apr 11, 2012 at 12:32 PM, Robert Evans 
 wrote:

> No it does not. Sorry
>
>
> On 4/11/12 1:44 PM, "Abhishek Pratap Singh" 
>>> wrote:
>
> Hi All,
>
> Just wanted if hadoop supports more than one data centre. This is
 basically
> for DR purposes and High Availability where one centre goes down other
 can
> bring up.
>
>
> Regards,
> Abhishek
>
>

>>>
>>>
>>>
>>> --
>>> Thanks & Regards
>>> 
>>> *Manu S*
>>> SI Engineer - OpenSource & HPC
>>> Wipro Infotech
>>> Mob: +91 8861302855                Skype: manuspkd
>>> www.opensourcetalk.co.in
>>>
>>>
>>
>
>


Re: Hive Thrift help

2012-04-16 Thread Edward Capriolo
You can NOT connect to the Hive Thrift port with a web browser to confirm its
status. Thrift is Thrift, not HTTP. But you are right to say HiveServer does
not produce any output by default.

If

netstat -nl | grep 1

shows the port listening, it is up.
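
Another quick check from code is a minimal JDBC probe through the HiveServer1
driver. This is just a sketch; the host and the default port 10000 are
assumptions about a stock setup:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveServerCheck {
  public static void main(String[] args) throws Exception {
    // Assumes the HiveServer1 JDBC driver is on the classpath.
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();
    stmt.executeQuery("show tables");   // throws if the server is not really up
    System.out.println("HiveServer is accepting Thrift connections");
    con.close();
  }
}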


On Mon, Apr 16, 2012 at 5:18 PM, Rahul Jain  wrote:
> I am assuming you read thru:
>
> https://cwiki.apache.org/Hive/hiveserver.html
>
> The server comes up on port 10,000 by default, did you verify that it is
> actually listening on the port ?  You can also connect to hive server using
> web browser to confirm its status.
>
> -Rahul
>
> On Mon, Apr 16, 2012 at 1:53 PM, Michael Wang 
> wrote:
>
>> we need to connect to HIVE from Microstrategy reports, and it requires the
>> Hive Thrift server. But I
>> tried to start it, and it just hangs as below.
>> # hive --service hiveserver
>> Starting Hive Thrift Server
>> Any ideas?
>> Thanks,
>> Michael
>>
>> This electronic message, including any attachments, may contain
>> proprietary, confidential or privileged information for the sole use of the
>> intended recipient(s). You are hereby notified that any unauthorized
>> disclosure, copying, distribution, or use of this message is prohibited. If
>> you have received this message in error, please immediately notify the
>> sender by reply e-mail and delete it.
>>


Re: Issue with loading the Snappy Codec

2012-04-15 Thread Edward Capriolo
You need two things. 1) Install the snappy native library in a place the
system can pick it up automatically, or add its location to your
java.library.path.

2) Add the full name of the codec to io.compression.codecs:

hive> set io.compression.codecs;
io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec
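
On a cluster this normally goes in the site configuration file, but the same
thing can be set programmatically. A minimal sketch, using the class list
shown above:

import org.apache.hadoop.conf.Configuration;

public class CodecConfig {
  public static Configuration withSnappy() {
    Configuration conf = new Configuration();
    // Full class names, comma separated; SnappyCodec added at the end.
    conf.set("io.compression.codecs",
        "org.apache.hadoop.io.compress.GzipCodec,"
      + "org.apache.hadoop.io.compress.DefaultCodec,"
      + "org.apache.hadoop.io.compress.BZip2Codec,"
      + "org.apache.hadoop.io.compress.SnappyCodec");
    return conf;
  }
}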

Edward


On Sun, Apr 15, 2012 at 8:36 AM, Bas Hickendorff
 wrote:
> Hello Jay,
>
> My input is just a csv file (created it myself), so I am sure it is
> not compressed in any way. Also, the same input works when I use the
> standalone example (using the hadoop executable in the bin folder).
> When I try to integrate it in a larger java program it fails  :(
>
> Regards,
>
> Bas
>
> On Sun, Apr 15, 2012 at 2:30 PM, JAX  wrote:
>> That is odd why would it crash when your m/r job did not rely on snappy?
>>
>> One possibility : Maybe because your input is snappy compressed, Hadoop is 
>> detecting that compression, and trying to use the snappy codec to 
>> decompress.?
>>
>> Jay Vyas
>> MMSB
>> UCHC
>>
>> On Apr 15, 2012, at 5:08 AM, Bas Hickendorff  
>> wrote:
>>
>>> Hello John,
>>>
>>> I did restart them (in fact, I did a full reboot of the machine). The
>>> error is still there.
>>>
>>> I guess my question is: is it expected that Hadoop needs to do
>>> something with the Snappycodec when mapred.compress.map.output is set
>>> to false?
>>>
>>> Regards,
>>>
>>> Bas
>>>
>>> On Sun, Apr 15, 2012 at 12:04 PM, john smith  wrote:
 Can you restart tasktrackers once and run the job again? It refreshes the
 class path.

 On Sun, Apr 15, 2012 at 11:58 AM, Bas Hickendorff
 wrote:

> Thanks.
>
> The native snappy libraries I have installed. However, I use the
> normal jars that you get when downloading Hadoop, I am not compiling
> Hadoop myself.
>
> I do not want to use the snappy codec (I don't care about compression
> at the moment), but it seems it is needed anyway? I added this to the
> mapred-site.xml:
>
> 
>        mapred.compress.map.output
>        false
> 
>
> But it still fails with the error of my previous email (SnappyCodec not
> found).
>
> Regards,
>
> Bas
>
>
> On Sat, Apr 14, 2012 at 6:30 PM, Vinod Kumar Vavilapalli
>  wrote:
>>
>> Hadoop has integrated snappy via installed native libraries instead of
> snappy-java.jar (ref https://issues.apache.org/jira/browse/HADOOP-7206)
>>  - You need to have the snappy system libraries (snappy and
> snappy-devel) installed before you compile hadoop. (RPMs are available on
> the web, http://pkgs.org/centos-5-rhel-5/epel-i386/21/ for example)
>>  - When you build hadoop, you will need to compile the native
> libraries(by passing -Dcompile.native=true to ant) to avail snappy 
> support.
>>  - You also need to make sure that snappy system library is available on
> the library path for all mapreduce tasks at runtime. Usually if you 
> install
> them on /usr/lib or /usr/local/lib, it should work.
>>
>> HTH,
>> +Vinod
>>
>> On Apr 14, 2012, at 4:36 AM, Bas Hickendorff wrote:
>>
>>> Hello,
>>>
>>> When I start a map-reduce job, it starts, and after a short while,
>>> fails with the error below (SnappyCodec not found).
>>>
>>> I am currently starting the job from other Java code (so the Hadoop
>>> executable in the bin directory is not used anymore), but in principle
>>> this seems to work (in the admin of the Jobtracker the job shows up
>>> when it starts). However after a short while the map task fails with:
>>>
>>>
>>> java.lang.IllegalArgumentException: Compression codec
>>> org.apache.hadoop.io.compress.SnappyCodec not found.
>>>       at
> org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
>>>       at
> org.apache.hadoop.io.compress.CompressionCodecFactory.(CompressionCodecFactory.java:134)
>>>       at
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
>>>       at
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
>>>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>       at java.security.AccessController.doPrivileged(Native Method)
>>>       at javax.security.auth.Subject.doAs(Subject.java:416)
>>>       at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>>>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.hadoop.io.compress.SnappyCo

Re: Accessing HDFS files from an servlet

2012-04-13 Thread Edward Capriolo
http://www.edwardcapriolo.com/wiki/en/Tomcat_Hadoop

Have all the hadoop jars and conf files in your classpath
--or-- construct your own conf and URI programmatically

Configuration conf = new Configuration();
URI i = URI.create("hdfs://192.168.220.200:54310");
FileSystem fs = FileSystem.get(i, conf);
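
A fuller sketch of the second approach, a servlet that streams an HDFS file
into the HTTP response. The namenode URI and the request parameter name are
just placeholders:

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsFileServlet extends HttpServlet {
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    String file = req.getParameter("path");   // e.g. ?path=/user/foo/report.txt
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.220.200:54310"), conf);
    resp.setContentType("application/octet-stream");
    InputStream in = fs.open(new Path(file));
    try {
      // Copy the HDFS stream straight to the servlet response.
      IOUtils.copyBytes(in, resp.getOutputStream(), 4096, false);
    } finally {
      in.close();
    }
  }
}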

On Fri, Apr 13, 2012 at 7:40 AM, Jessica Seastrom wrote:

> Have you considered using Hoop?
> http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
>
> On Fri, Apr 13, 2012 at 3:46 AM, sushil sontakke wrote:
>
>> I want to know if there is any way of reading a file from HDFS using a
>> servlet . Suppose I have filename of a valid file situated over HDFS . How
>> do I generate a URL to display that file on a jsp page using some servlet
>> code .
>>
>> Thank You .
>>
>
>
>
> --
>
> 
>  Jessica Seastrom
> Solutions Architect
> Email: jess...@cloudera.com
> Mobile: 443.622.6707
>
>
>


Re: Yahoo Hadoop Tutorial with new APIs?

2012-04-04 Thread Edward Capriolo
Nathan put together the steps on this blog.

http://blog.milford.io/2012/01/kicking-the-tires-on-hadoop-0-23-pseudo-distributed-mode/

Which fills out the missing "details" such as

 
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <description>the local directories used by the nodemanager</description>
</property>

in the official docs.

http://hadoop.apache.org/common/docs/r0.23.1/hadoop-yarn/hadoop-yarn-site/SingleCluster.html





On Wed, Apr 4, 2012 at 5:43 PM, Marcos Ortiz  wrote:
> Ok, Robert, I will be waiting for you then. There are many folks that use
> this tutorial, so I think this a good effort in favor of the Hadoop
> community.It would be nice
> if Yahoo! donate this work, because, I have some ideas behind this, for
> example: to release a Spanish version of the tutorial.
> Regards and best wishes
>
> On 04/04/2012 05:29 PM, Robert Evans wrote:
>>
>> I am dropping the cross posts and leaving this on common-user with the
>> others BCCed.
>>
>> Marcos,
>>
>> That is a great idea to be able to update the tutorial, especially if the
>> community is interested in helping to do so.  We are looking into the best
>> way to do this.  The idea right now is to donate this to the Hadoop project
>> so that the community can keep it up to date, but we need some time to jump
>> through all of the corporate hoops to get this to happen.  We have a lot
>> going on right now, so if you don't see any progress on this please feel
>> free to ping me and bug me about it.
>>
>> --
>> Bobby Evans
>>
>>
>> On 4/4/12 8:15 AM, "Jagat Singh"  wrote:
>>
>>    Hello Marcos
>>
>>     Yes , Yahoo tutorials are pretty old but still they explain the
>>    concepts of Map Reduce , HDFS beautifully. The way in which
>>    tutorials have been defined into sub sections , each builing on
>>    previous one is awesome. I remember when i started i was digged in
>>    there for many days. The tutorials are lagging now from new API
>>    point of view.
>>
>>     Lets have some documentation session one day , I would love to
>>    Volunteer to update those tutorials if people at Yahoo take input
>>    from outside world :)
>>
>>     Regards,
>>
>>     Jagat
>>
>>    - Original Message -
>>    From: Marcos Ortiz
>>    Sent: 04/04/12 08:32 AM
>>    To: common-user@hadoop.apache.org, 'hdfs-u...@hadoop.apache.org
>>    <%27hdfs-u...@hadoop.apache.org>', mapreduce-u...@hadoop.apache.org
>>    Subject: Yahoo Hadoop Tutorial with new APIs?
>>
>>    Regards to all the list.
>>     There are many people that use the Hadoop Tutorial released by
>>    Yahoo at http://developer.yahoo.com/hadoop/tutorial/
>>    http://developer.yahoo.com/hadoop/tutorial/module4.html#chaining
>>    The main issue here is that, this tutorial is written with the old
>>    APIs? (Hadoop 0.18 I think).
>>     Is there a project for update this tutorial to the new APIs? to
>>    Hadoop 1.0.2 or YARN (Hadoop 0.23)
>>
>>     Best wishes
>>     -- Marcos Luis Ortíz Valmaseda (@marcosluis2186) Data Engineer at
>>    UCI http://marcosluis2186.posterous.com
>>    http://www.uci.cu/
>>
>>
>>
>> 
>
>
> --
> Marcos Luis Ortíz Valmaseda (@marcosluis2186)
>  Data Engineer at UCI
>  http://marcosluis2186.posterous.com
>
>
>
> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS
> INFORMATICAS...
> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci


Re: activity on IRC .

2012-03-29 Thread Edward Capriolo
You are better off on the ML.

Hadoop is designed for high throughput, not low latency, operations.
This carries over to the IRC room :) JK

I feel most hadoop questions are harder to ask and answer on IRC
(large code segments, deep questions) and as a result the mailing list
is more natural for these types of problems.

Edward

On Wed, Mar 28, 2012 at 3:26 PM, Todd Lipcon  wrote:
> Hey Jay,
>
> That's the only one I know of. Not a lot of idle chatter, but when
> people have questions, discussions do start up. Much more active
> during PST working hours, of course :)
>
> -Todd
>
> On Wed, Mar 28, 2012 at 8:05 AM, Jay Vyas  wrote:
>> Hi guys : I notice the IRC activity is a little low.  Just wondering if
>> theres a better chat channel for hadoop other than the official one
>> (#hadoop on freenode)?
>> In any case... Im on there :)   come say hi.
>>
>> --
>> Jay Vyas
>> MMSB/UCHC
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


Re: state of HOD

2012-03-09 Thread Edward Capriolo
It has been in a quasi-defunct state for a while now. It seems like
hadoop.next and YARN help achieve a similar effect to HOD. Plus it
has this new hotness factor.

On Fri, Mar 9, 2012 at 2:41 AM, Stijn De Weirdt  wrote:
> (my apologies for those who have received this already. i posted this mail a
> few days back on the common-dev list, as this is more a development related
> mail; but one of the original authors/maintainers suggested to also post
> this here)
>
> hi all,
>
> i am a system administrator/user support person/... for the HPC team at
> Ghent University (Ghent, Flanders, Belgium).
>
> recently we have been asked to look into support for hadoop. for the moment
> we are holding off on a dedicated cluster (esp dedicated hdfs setup).
>
> but as all our systems are torque/pbs based, we looked into HOD to help out
> our users.
> we have started from the HOD code that was part of the hadoop 1.0.0 release
> (in the contrib part).
> at first it was not working, but we have been patching and cleaning up the
> code for a a few weeks and now have a version that works for us (we had to
> add some features besides fixing a few things).
> it looks sufficient for now, although we will add some more features soon to
> get the users started.
>
>
> my question is the following: what is the state of HOD atm? is it still
> maintained/supported? are there forks somewhere that have more up-to-date
> code?
> what we are now missing most is the documentation (eg
> http://hadoop.apache.org/common/docs/r0.16.4/hod.html) so we can update this
> with our extra features. is the source available somewhere?
>
> i could contribute back all patches, but a few of them are identation fixes
> (to use 4 space indentation throughout the code) and other cosmetic changes,
> so this messes up patches a lot.
> i have also shuffled a bit with the options (rename and/or move to other
> sections) so no 100% backwards compatibility with the current HOD code.
>
> current main improvements:
> - works with python 2.5 and up (we have been testing with 2.7.2)
> - set options through environment variables
> - better default values (we can now run with empty hodrc file)
> - support for mail and nodes:ppn for pbs
> - no deprecation warnings from hadoop (nearly finished)
> - host-mask to bind xrs addr on non-default ip (in case you have
> non-standard network on the compute nodes)
> - more debug statements
> - gradual code cleanup (using pylint)
>
> on the todo list:
> - further tuning of hadoop parameters (i'm not a hadoop user myself, so this
> will take some time)
> - 0.23.X support
>
>
>
> many thanks,
>
> stijn


Re: Should splittable Gzip be a "core" hadoop feature?

2012-02-29 Thread Edward Capriolo
Too bad we can not up the replication on the first few blocks of the
file, or put it in the distributed cache.

The contrib statement is arguable. I could make a case that the
majority of stuff should not be in hadoop-core. NLineInputFormat, for
example, is nice to have. It took a long time to get ported to the new map
reduce format. DBInputFormat and DataDrivenDBInputFormat are sexy for sure, but
they do not need to be part of core. I could see hadoop as just coming
with TextInputFormat and SequenceFileInputFormat, and everything else being
after-market from github.

On Wed, Feb 29, 2012 at 11:31 AM, Robert Evans  wrote:
> I can see a use for it, but I have two concerns about it.  My biggest concern 
> is maintainability.  We have had lots of things get thrown into contrib in 
> the past, very few people use them, and inevitably they start to suffer from 
> bit rot.  I am not saying that it will happen with this, but if you have to 
> ask if people will use it and there has been no overwhelming yes, it makes me 
> nervous about it.  My second concern is with knowing when to use this.  
> Anything that adds this in would have to come with plenty of documentation 
> about how it works, how it is different from the normal gzip format, 
> explanations about what type of a load it might put on data nodes that hold 
> the start of the file, etc.
>
> From both of these I would prefer to see this as a github project for a while 
> first, and one it shows that it has a significant following, or a community 
> with it, then we can pull it in.  But if others disagree I am not going to 
> block it.  I am a -0 on pulling this in now.
>
> --Bobby
>
> On 2/29/12 10:00 AM, "Niels Basjes"  wrote:
>
> Hi,
>
> On Wed, Feb 29, 2012 at 16:52, Edward Capriolo wrote:
> ...
>
>> But being able to generate split info for them and processing them
>> would be good as well. I remember that was a hot thing to do with lzo
>> back in the day. The pain of once overing the gz files to generate the
>> split info is detracting but it is nice to know it is there if you
>> want it.
>>
>
> Note that the solution I created (HADOOP-7076) does not require any
> preprocessing.
> It can split ANY gzipped file as-is.
> The downside is that this effectively costs some additional performance
> because the task has to decompress the first part of the file that is to be
> discarded.
>
> The other two ways of splitting gzipped files either require
> - creating come kind of "compression index" before actually using the file
> (HADOOP-6153)
> - creating a file in a format that is gerenated in such a way that it is
> really a set of concatenated gzipped files. (HADOOP-7909)
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>


Re: Should splittable Gzip be a "core" hadoop feature?

2012-02-29 Thread Edward Capriolo
Mike,

Snappy is cool and all, but I was not overly impressed with it.

GZ zips much better than Snappy. Last time I checked for our log files,
gzip took them down from 100MB -> 40MB, while snappy compressed them
from 100MB -> 55MB. That was only with sequence files. But still, that is
pretty significant if you are considering long term storage. Also,
because the delta in the file size was large, I could not actually
make the case that using sequence+snappy was faster than sequence+gz.
Sure, the MB/s rate was probably faster, but since I had more MB I was
not able to prove snappy a win. I use it for intermediate compression
only.

Actually the raw formats (gz vs sequence gz) are significantly smaller
and faster than their sequence file counterparts.

Believe it or not, I commonly use mapred.output.compress without
sequence files. As long as I have a large number of reducers I do not
have to worry about files being splittable, because N mappers process N
files. Generally I am happy with, say, N mappers, because the input
formats tend to create more mappers than I want, which makes more
overhead and more shuffle.
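
A minimal old-API sketch of that setup, compressing plain (non-sequence-file)
job output with gzip; the class name here is just a placeholder:

import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class GzipOutputSetup {
  public static void configure(JobConf conf) {
    // Each reducer then writes its own part-*.gz file, so with N reducers
    // you get N independently readable (though individually unsplittable) files.
    FileOutputFormat.setCompressOutput(conf, true);
    FileOutputFormat.setOutputCompressorClass(conf, GzipCodec.class);
  }
}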

But being able to generate split info for them and process them
would be good as well. I remember that was a hot thing to do with lzo
back in the day. The pain of going over the gz files once to generate the
split info is detracting, but it is nice to know it is there if you
want it.

Edward
On Wed, Feb 29, 2012 at 7:10 AM, Michel Segel  wrote:
> Let's play devil's advocate for a second?
>
> Why? Snappy exists.
> The only advantage is that you don't have to convert from gzip to snappy and 
> can process gzip files natively.
>
> Next question is how large are the gzip files in the first place?
>
> I don't disagree, I just want to have a solid argument in favor of it...
>
>
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Feb 28, 2012, at 9:50 AM, Niels Basjes  wrote:
>
>> Hi,
>>
>> Some time ago I had an idea and implemented it.
>>
>> Normally you can only run a single gzipped input file through a single
>> mapper and thus only on a single CPU core.
>> What I created makes it possible to process a Gzipped file in such a way
>> that it can run on several mappers in parallel.
>>
>> I've put the javadoc I created on my homepage so you can read more about
>> the details.
>> http://howto.basjes.nl/hadoop/javadoc-for-skipseeksplittablegzipcodec
>>
>> Now the question that was raised by one of the people reviewing this code
>> was: Should this implementation be part of the core Hadoop feature set?
>> The main reason that was given is that this needs a bit more understanding
>> on what is happening and as such cannot be enabled by default.
>>
>> I would like to hear from the Hadoop Core/Map reduce users what you think.
>>
>> Should this be
>> - a part of the default Hadoop feature set so that anyone can simply enable
>> it by setting the right configuration?
>> - a separate library?
>> - a nice idea I had fun building but that no one needs?
>> - ... ?
>>
>> --
>> Best regards / Met vriendelijke groeten,
>>
>> Niels Basjes


Re: LZO with sequenceFile

2012-02-26 Thread Edward Capriolo
On Sun, Feb 26, 2012 at 1:49 PM, Harsh J  wrote:
> Hi Mohit,
>
> On Sun, Feb 26, 2012 at 10:42 PM, Mohit Anchlia  
> wrote:
>> Thanks! Some questions I have is:
>> 1. Would it work with sequence files? I am using
>> SequenceFileAsTextInputStream
>
> Yes, you just need to set the right codec when you write the file.
> Reading is then normal as reading a non-compressed sequence-file.
>
> The codec classnames are stored as meta information into sequence
> files and are read back to load the right codec for the reader - thus
> you don't have to specify a 'reader' codec once you are done writing a
> file with any codec of choice.
>
>> 2. If I use SequenceFile.CompressionType.RECORD or BLOCK would it still
>> split the files?
>
> Yes SequenceFiles are a natively splittable file format, designed for
> HDFS and MapReduce. Compressed sequence files are thus splittable too.
>
> You mostly need block compression unless your records are large in
> size and you feel you'll benefit better with compression algorithms
> applied to a single, complete record instead of a bunch of records.
>
>> 3. I am also using CDH's 20.2 version of hadoop.
>
> http://www.cloudera.com/assets/images/diagrams/whats-in-a-version.png :)
>
> --
> Harsh J

LZO confuses most people because of how it was added and removed. Also there is
a system to make raw LZO files splittable by indexing them.

I have just patched google-snappy into 0.20.2. Snappy has a similar
performance profile to LZO: good compression, low processor overhead.
It does not have all the licence issues, and there are not thousands of
semi-contradictory/confusing pages of information about it, so it ends up
being easier to set up and use.

http://code.google.com/p/snappy/

Recent versions of hadoop have snappy built in, so it will just work out
of the box.
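
A small sketch of Harsh's point about choosing the codec at write time (the
path and key/value types here are arbitrary); the reader later picks the
codec up from the file header automatically:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.SnappyCodec;

public class CompressedSeqFileWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Block compression with the snappy codec; swap in another codec if desired.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/tmp/example.seq"), Text.class, Text.class,
        SequenceFile.CompressionType.BLOCK, new SnappyCodec());
    writer.append(new Text("key"), new Text("value"));
    writer.close();
  }
}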

Edward


Re: Writing small files to one big file in hdfs

2012-02-21 Thread Edward Capriolo
On Tue, Feb 21, 2012 at 7:50 PM, Mohit Anchlia  wrote:
> It looks like in mapper values are coming as binary instead of Text. Is
> this expected from sequence file? I initially wrote SequenceFile with Text
> values.
>
> On Tue, Feb 21, 2012 at 4:13 PM, Mohit Anchlia wrote:
>
>> Need some more help. I wrote sequence file using below code but now when I
>> run mapreduce job I get "file.*java.lang.ClassCastException*:
>> org.apache.hadoop.io.LongWritable cannot be cast to
>> org.apache.hadoop.io.Text" even though I didn't use LongWritable when I
>> originally wrote to the sequence
>>
>> //Code to write to the sequence file. There is no LongWritable here
>>
>> org.apache.hadoop.io.Text key = new org.apache.hadoop.io.Text();
>> BufferedReader buffer = new BufferedReader(new FileReader(filePath));
>> String line = null;
>> org.apache.hadoop.io.Text value = new org.apache.hadoop.io.Text();
>>
>> try {
>>   writer = SequenceFile.createWriter(fs, conf, path, key.getClass(),
>>       value.getClass(), SequenceFile.CompressionType.RECORD);
>>   int i = 1;
>>   long timestamp = System.currentTimeMillis();
>>   while ((line = buffer.readLine()) != null) {
>>     key.set(String.valueOf(timestamp));
>>     value.set(line);
>>     writer.append(key, value);
>>     i++;
>>   }
>>
>>
>>   On Tue, Feb 21, 2012 at 12:18 PM, Arko Provo Mukherjee <
>> arkoprovomukher...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I think the following link will help:
>>> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
>>>
>>> Cheers
>>> Arko
>>>
>>> On Tue, Feb 21, 2012 at 2:04 PM, Mohit Anchlia >> >wrote:
>>>
>>> > Sorry may be it's something obvious but I was wondering when map or
>>> reduce
>>> > gets called what would be the class used for key and value? If I used
>>> > "org.apache.hadoop.io.Text
>>> > value = *new* org.apache.hadoop.io.Text();" would the map be called with
>>>  > Text class?
>>> >
>>> > public void map(LongWritable key, Text value, Context context) throws
>>> > IOException, InterruptedException {
>>> >
>>> >
>>> > On Tue, Feb 21, 2012 at 11:59 AM, Arko Provo Mukherjee <
>>> > arkoprovomukher...@gmail.com> wrote:
>>> >
>>> > > Hi Mohit,
>>> > >
>>> > > I am not sure that I understand your question.
>>> > >
>>> > > But you can write into a file using:
>>> > > *BufferedWriter output = new BufferedWriter
>>> > > (new OutputStreamWriter(fs.create(my_path,true)));*
>>> > > *output.write(data);*
>>> > > *
>>> > > *
>>> > > Then you can pass that file as the input to your MapReduce program.
>>> > >
>>> > > *FileInputFormat.addInputPath(jobconf, new Path (my_path) );*
>>> > >
>>> > > From inside your Map/Reduce methods, I think you should NOT be
>>> tinkering
>>> > > with the input / output paths of that Map/Reduce job.
>>> > > Cheers
>>> > > Arko
>>> > >
>>> > >
>>> > > On Tue, Feb 21, 2012 at 1:38 PM, Mohit Anchlia <
>>> mohitanch...@gmail.com
>>> > > >wrote:
>>> > >
>>> > > > Thanks How does mapreduce work on sequence file? Is there an
>>> example I
>>> > > can
>>> > > > look at?
>>> > > >
>>> > > > On Tue, Feb 21, 2012 at 11:34 AM, Arko Provo Mukherjee <
>>> > > > arkoprovomukher...@gmail.com> wrote:
>>> > > >
>>> > > > > Hi,
>>> > > > >
>>> > > > > Let's say all the smaller files are in the same directory.
>>> > > > >
>>> > > > > Then u can do:
>>> > > > >
>>> > > > > BufferedWriter output = new BufferedWriter(
>>> > > > >     new OutputStreamWriter(fs.create(output_path, true)));  // Output path
>>> > > > >
>>> > > > > FileStatus[] output_files = fs.listStatus(new Path(input_path));  // Input directory
>>> > > > >
>>> > > > > for ( int i=0; i < output_files.length; i++ )
>>> > > > > {
>>> > > > >    BufferedReader reader = new BufferedReader(
>>> > > > >        new InputStreamReader(fs.open(output_files[i].getPath())));
>>> > > > >    String data;
>>> > > > >    data = reader.readLine();
>>> > > > >    while ( data != null )
>>> > > > >    {
>>> > > > >        output.write(data);
>>> > > > >        data = reader.readLine();
>>> > > > >    }
>>> > > > >    reader.close();
>>> > > > > }
>>> > > > > output.close();
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > In case you have the files in multiple directories, call the code
>>> for
>>> > > > each
>>> > > > > of them with different input paths.
>>> > > > >
>>> > > > > Hope this helps!
>>> > > > >
>>> > > > > Cheers
>>> > > > >
>>> > > > > Arko
>>> > > > >
>>> > > > > On Tue, Feb 21, 2012 at 1:27 PM, Mohit Anchlia <
>>> > mohitanch...@gmail.com
>>> > > > > >wrote:
>>> > > > >
>>> > > > > > I am trying to look for examples that demonstrates using
>>> sequence
>>> > > files
>>> > > > > > including writing to it and then running mapred on it, but
>>> unable
>>> > to
>>> > > > find
>>> > > > > > one. Could you please point me to some examples of sequence
>>> files?
>>> > > > >

Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)

2012-02-17 Thread Edward Capriolo
As your numbers show.

Dataset Size   Hypertable Queries/s   HBase Queries/s   Hypertable Latency (ms)   HBase Latency (ms)
0.5 TB         3256.42                2969.52           157.221                   172.351
5 TB           2450.01                2066.52           208.972                   247.680

Raw data goes up. Read performance goes down. Latency goes up.

You mentioned you loaded 1/2 trillion records of historical financial
data. The operative word is historical. You're not doing 1/2 trillion
writes every day.

Most of the systems that use structured log formats can write very fast
(I am guessing that is what hypertable uses, btw). DD writes very fast
as well, but if you want acceptable read latency you are going to need
a good RAM/disk ratio.

Even at 0.5 TB, 157.221ms is not a great read latency, so your ability
to write fast has already outstripped your ability to read at a rate
that could support, say, a web application. (I come from a world of 1-5ms
latency BTW).

What application can you support with numbers like that? An email
compliance system where you want to store a ton of data, but only plan
on doing 1 search a day to make an auditor happy? :) This is why I say
you're going to end up needing about the same # of nodes: when it
comes time to read this data, having a machine with 4TB of data and 24
GB of RAM is not going to cut it.

You are right on a couple of fronts
1) being able to load data fast is good (can't argue with that)
2) If hbase can't load X entries that is bad

I really can't imagine that hbase blows up and just stops accepting
inserts at one point. You seem to say it's happening and I don't have
time to verify. But if you are at the point where you are getting
175ms random and 85 zipfian latency, what are you proving? That is
already more data than a server can handle.

http://en.wikipedia.org/wiki/Network_performance

Users browsing the Internet feel that responses are "instant" when
delays are less than 100 ms from click to response[11]. Latency and
throughput together affect the perceived speed of a connection.
However, the perceived performance of a connection can still vary
widely, depending in part on the type of information transmitted and
how it is used.



On Fri, Feb 17, 2012 at 7:25 PM, Doug Judd  wrote:
> Hi Edward,
>
> The problem is that even if the workload is 5% write and 95% read, if you
> can't load the data, you need more machines.  In the 167 billion insert
> test, HBase failed with *Concurrent mode failure* after 20% of the data was
> loaded.  One of our customers has loaded 1/2 trillion records of historical
> financial market data on 16 machines.  If you do the back-of-the-envelope
> calculation, it would take about 180 machines for HBase to load 1/2
> trillion cells.  That makes HBase 10X more expensive in terms of hardware,
> power consumption, and data center real estate.
>
> - Doug
>
> On Fri, Feb 17, 2012 at 3:58 PM, Edward Capriolo wrote:
>
>> I would almost agree with prospective. But their is a problem with 'java is
>> slow' theory. The reason is that in a 100 percent write workload gc might
>> be a factor.
>>
>> But in the real world people have to read data and read becomes disk bound
>> as your data gets larger then memory.
>>
>> Unless C++ can make your disk spin faster then java It is a wash. Making a
>> claim that your going to need more servers for java/hbase is bogus. To put
>> it in prospective, if the workload is 5 % write and 95 % read you are
>> probably going to need just the same amount of hardware.
>>
>> You might get some win on the read size because your custom caching could
>> be more efficient in terms of object size in memory and other gc issues but
>> it is not 2 or 3 to one.
>>
>> If a million writes fall into a hypertable forest but it take a billion
>> years to read them back did the writes ever sync :)
>>
>>
>> On Monday, February 13, 2012, Doug Judd  wrote:
>> > Hey Todd,
>> >
>> > Bulk loading isn't always an option when data is streaming in from a live
>> > application.  Many big data use cases involve massive amounts of smaller
>> > items in the size range of 10-100 bytes, for example URLs, sensor
>> readings,
>> > genome sequence reads, network traffic logs, etc.  If HBase requires 2-3
>> > times the amount of hardware to avoid *Concurrent mode failures*, then
>> that
>> > makes HBase 2-3 times more expensive from the standpoint of hardware,
>> power
>> > consumption, and datacenter real estate.
>> >
>> > What takes the most time is getting the core database mechanics right
>> > (we're going on 5 years now).  Once the core database is stable,
>> > integration with applications such as Solr and others are 

Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)

2012-02-17 Thread Edward Capriolo
I would almost agree with that perspective. But there is a problem with the
'java is slow' theory. The reason is that in a 100 percent write workload GC
might be a factor.

But in the real world people have to read data, and reads become disk bound
as your data gets larger than memory.

Unless C++ can make your disk spin faster than Java, it is a wash. Making a
claim that you're going to need more servers for java/hbase is bogus. To put
it in perspective, if the workload is 5% write and 95% read you are
probably going to need just about the same amount of hardware.

You might get some win on the read side because your custom caching could
be more efficient in terms of object size in memory and other GC issues, but
it is not 2 or 3 to one.

If a million writes fall into a hypertable forest but it takes a billion
years to read them back, did the writes ever sync? :)


On Monday, February 13, 2012, Doug Judd  wrote:
> Hey Todd,
>
> Bulk loading isn't always an option when data is streaming in from a live
> application.  Many big data use cases involve massive amounts of smaller
> items in the size range of 10-100 bytes, for example URLs, sensor
readings,
> genome sequence reads, network traffic logs, etc.  If HBase requires 2-3
> times the amount of hardware to avoid *Concurrent mode failures*, then
that
> makes HBase 2-3 times more expensive from the standpoint of hardware,
power
> consumption, and datacenter real estate.
>
> What takes the most time is getting the core database mechanics right
> (we're going on 5 years now).  Once the core database is stable,
> integration with applications such as Solr and others are short term
> projects.  I believe that sooner or later, most engineers working in this
> space will come to the conclusion that Java is the wrong language for this
> kind of database application.  At that point, folks on the HBase project
> will realize that they are five years behind.
>
> - Doug
>
> On Mon, Feb 13, 2012 at 11:33 AM, Todd Lipcon  wrote:
>
>> Hey Doug,
>>
>> Want to also run a comparison test with inter-cluster replication
>> turned on? How about kerberos-based security on secure HDFS? How about
>> ACLs or other table permissions even without strong authentication?
>> Can you run a test comparing performance running on top of Hadoop
>> 0.23? How about running other ecosystem products like Solbase,
>> Havrobase, and Lily, or commercial products like Digital Reasoning's
>> Synthesys, etc?
>>
>> For those unfamiliar, the answer to all of the above is that those
>> comparisons can't be run because Hypertable is years behind HBase in
>> terms of features, adoption, etc. They've found a set of benchmarks
>> they win at, but bulk loading either database through the "put" API is
>> the wrong way to go about it anyway. Anyone loading 5T of data like
>> this would use the bulk load APIs which are one to two orders of
>> magnitude more efficient. Just ask the Yahoo crawl cache team, who has
>> ~1PB stored in HBase, or Facebook, or eBay, or many others who store
>> hundreds to thousands of TBs in HBase today.
>>
>> Thanks,
>> -Todd
>>
>> On Mon, Feb 13, 2012 at 9:07 AM, Doug Judd  wrote:
>> > In our original test, we mistakenly ran the HBase test with
>> > the hbase.hregion.memstore.mslab.enabled property set to false.  We
>> re-ran
>> > the test with the hbase.hregion.memstore.mslab.enabled property set to
>> true
>> > and have reported the results in the following addendum:
>> >
>> > Addendum to Hypertable vs. HBase Performance
>> > Test<
>> http://www.hypertable.com/why_hypertable/hypertable_vs_hbase_2/addendum/>
>> >
>> > Synopsis: It slowed performance on the 10KB and 1KB tests and still
>> failed
>> > the 100 byte and 10 byte tests with *Concurrent mode failure*
>> >
>> > - Doug
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>


Re: Addendum to Hypertable vs. HBase Performance Test (w/ mslab enabled)

2012-02-16 Thread Edward Capriolo
You ain't gotta like me, you just mad
Cause I tell it how it is, and you tell it how it might be

-Attributed to Puff Daddy
& Now apparently T. Lipcon

On Mon, Feb 13, 2012 at 2:33 PM, Todd Lipcon  wrote:
>
> Hey Doug,
>
> Want to also run a comparison test with inter-cluster replication
> turned on? How about kerberos-based security on secure HDFS? How about
> ACLs or other table permissions even without strong authentication?
> Can you run a test comparing performance running on top of Hadoop
> 0.23? How about running other ecosystem products like Solbase,
> Havrobase, and Lily, or commercial products like Digital Reasoning's
> Synthesys, etc?
>
> For those unfamiliar, the answer to all of the above is that those
> comparisons can't be run because Hypertable is years behind HBase in
> terms of features, adoption, etc. They've found a set of benchmarks
> they win at, but bulk loading either database through the "put" API is
> the wrong way to go about it anyway. Anyone loading 5T of data like
> this would use the bulk load APIs which are one to two orders of
> magnitude more efficient. Just ask the Yahoo crawl cache team, who has
> ~1PB stored in HBase, or Facebook, or eBay, or many others who store
> hundreds to thousands of TBs in HBase today.
>
> Thanks,
> -Todd
>
> On Mon, Feb 13, 2012 at 9:07 AM, Doug Judd  wrote:
> > In our original test, we mistakenly ran the HBase test with
> > the hbase.hregion.memstore.mslab.enabled property set to false.  We re-ran
> > the test with the hbase.hregion.memstore.mslab.enabled property set to true
> > and have reported the results in the following addendum:
> >
> > Addendum to Hypertable vs. HBase Performance
> > Test
> >
> > Synopsis: It slowed performance on the 10KB and 1KB tests and still failed
> > the 100 byte and 10 byte tests with *Concurrent mode failure*
> >
> > - Doug
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


Re: Brisk vs Cloudera Distribution

2012-02-08 Thread Edward Capriolo
Hadoop can work on a number of filesystems: HDFS, S3, local files. The Brisk
file system is known as CFS. CFS stores all block and meta data in
Cassandra, so it does not use a namenode. Brisk fires up a jobtracker
automatically as well. Brisk also has a Hive metastore backed by Cassandra,
so it takes away that SPOF.

Brisk snappy-compresses all data, so you may not need to use compression or
sequence files. Performance wise I have gotten comparable numbers with
terasort and teragen. But the systems work vastly differently and likely
scale differently.

The hive integration is solid. I am not sure what the biggest cluster is, and
I won't make other vague performance claims. Brisk is not active anymore; the
commercial product is DSE. There is a github fork of Brisk, however.

On Wednesday, February 8, 2012, rk vishu  wrote:
> Hello All,
>
> Could any one help me understand pros and cons of Brisk vs Cloudera Hadoop
> (DHFS + HBASE) in terms of functionality and performance?
> Wanted to keep aside the single point of failure (NN) issue while
comparing?
> Are there any big clusters in petabytes using brisk in production? How is
> the performance comparision CFS vs HDFS? How is Hive integration?
>
> Thanks and Regrds
> RK
>


Re: Checking Which Filesystem Being Used?

2012-02-07 Thread Edward Capriolo
On Tue, Feb 7, 2012 at 5:24 PM, Eli Finkelshteyn  wrote:

> Hi Folks,
> This might be a stupid question, but I'm new to Java and Hadoop, so...
>
> Anyway, if I want to check what FileSystem is currently being used at some
> point (i.e. evaluating FileSystem.get(conf)), what would be the most
> elegant way of doing that? Should I just do something like:
>if (FileSystem.get(conf) == "HDFS") {...}
> Or is there a better way?
>
> Eli
>

conf.get("fs.default.name") would return a URI such as hdfs://bla:8000 or
file:///this. Note that an application could have two Configurations which
could be used to connect to two separate FileSystems inside the same java
application.
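
If you want to branch on it in code, checking the scheme of the FileSystem's
URI is a bit more robust than comparing objects to strings. A small
illustrative sketch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Scheme is "hdfs", "file", "s3", ... depending on fs.default.name
    if ("hdfs".equals(fs.getUri().getScheme())) {
      System.out.println("Default filesystem is HDFS: " + fs.getUri());
    } else {
      System.out.println("Default filesystem is " + fs.getUri());
    }
  }
}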

Edward


Re: jobtracker url(Critical)

2012-01-27 Thread Edward Capriolo
Task trackers sometimes do not clean up their mapred temp directories well.
If that is the case, the TT on startup can spend many minutes deleting
files. I use find to delete files older than a couple of days.

On Friday, January 27, 2012, hadoop hive  wrote:
> Hey Harsh,
>
> but after sumtym they are available 1 by 1 in jobtracker URL.
>
> any idea how they add up slowly slowly.
>
> regards
> Vikas
>
> On Fri, Jan 27, 2012 at 5:05 PM, Harsh J  wrote:
>
>> Vikas,
>>
>> Have you ensured your non-appearing tasktracker services are
>> started/alive and carry no communication errors in their logs? Did you
>> perhaps bring up a firewall accidentally, that was not present before?
>>
>> On Fri, Jan 27, 2012 at 4:47 PM, hadoop hive 
wrote:
>> > Hey folks,
>> >
>> > i m facing a problem, with job Tracker URL, actually  i added a node to
>> the
>> > cluster and after sometime i restart the cluster,  then i found that my
>> job
>> > tracker is showing  recent added node in *nodes * but rest of nodes are
>> not
>> > available not even in *blacklist. *
>> > *
>> > *
>> > can any1 have any idea why its happening.
>> >
>> >
>> > Thanks and regards
>> > Vikas Srivastava
>>
>>
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera
>>
>


Re: NameNode per-block memory usage?

2012-01-17 Thread Edward Capriolo
On Tue, Jan 17, 2012 at 10:08 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hello,
>
> How much memory/JVM heap does NameNode use for each block?
>
> I've tried locating this in the FAQ and on search-hadoop.com, but
> couldn't find a ton of concrete numbers, just these two:
>
> http://search-hadoop.com/m/RmxWMVyVvK1 - 150 bytes/block?
> http://search-hadoop.com/m/O886P1VyVvK1 - 1 GB heap for every object?
>
> Thanks,
> Otis
>

Some real world statistics. From NN web Interface. replication factor=2

Cluster Summary
22,061,605 files and directories, 22,151,870 blocks = 44,213,475 total.
Heap Size is 10.85 GB / 16.58 GB (65%)

compressedOOps is enabled.
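
Rough back-of-the-envelope from those numbers: 10.85 GB of used heap over
44,213,475 namespace objects works out to roughly 250 bytes per object
(files, directories and blocks combined), which is in the same ballpark as
the ~150 bytes/block figure once the file and directory objects are counted
as well.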


Re: hadoop filesystem cache

2012-01-16 Thread Edward Capriolo
The challenge with this design is that accessing the same data over and
over again is the uncommon use case for hadoop. Hadoop's bread and butter is
all about streaming through large datasets that do not fit in memory. Also,
your shuffle-sort-spill is going to play havoc on any file system based
cache. The distributed cache roughly fits this role, except that it does not
persist after a job.
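
For completeness, a minimal sketch of using the distributed cache with the
old mapred API (the HDFS path is just an example), keeping in mind it only
lives for the duration of the job:

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSetup {
  public static void configure(JobConf conf) throws Exception {
    // Ships /shared/lookup.dat to every task's local disk as ./lookup.dat
    DistributedCache.addCacheFile(new URI("/shared/lookup.dat#lookup.dat"), conf);
    DistributedCache.createSymlink(conf);
  }
}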

Replicating content to N nodes also is not a hard problem to tackle (you
can hack up a content delivery system with ssh+rsync) and get similar
results.The approach often taken has been to keep data that is accessed
repeatedly and fits in memory in some other system
(hbase/cassandra/mysql/whatever).

Edward


On Mon, Jan 16, 2012 at 11:33 AM, Rita  wrote:

> Thanks. I believe this is a good feature to have for clients especially if
> you are reading the same large file over and over.
>
>
> On Sun, Jan 15, 2012 at 7:33 PM, Todd Lipcon  wrote:
>
> > There is some work being done in this area by some folks over at UC
> > Berkeley's AMP Lab in coordination with Facebook. I don't believe it
> > has been published quite yet, but the title of the project is "PACMan"
> > -- I expect it will be published soon.
> >
> > -Todd
> >
> > On Sat, Jan 14, 2012 at 5:30 PM, Rita  wrote:
> > > After reading this article,
> > > http://www.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/ , I
> was
> > > wondering if there was a filesystem cache for hdfs. For example, if a
> > large
> > > file (10gigabytes) was keep getting accessed on the cluster instead of
> > keep
> > > getting it from the network why not storage the content of the file
> > locally
> > > on the client itself.  A use case on the client would be like this:
> > >
> > >
> > >
> > > 
> > >  dfs.client.cachedirectory
> > >  /var/cache/hdfs
> > > 
> > >
> > >
> > > 
> > > dfs.client.cachesize
> > > in megabytes
> > > 10
> > > 
> > >
> > >
> > > Any thoughts of a feature like this?
> > >
> > >
> > > --
> > > --- Get your facts first, then you can distort them as you please.--
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>


Re: desperate question about NameNode startup sequence

2011-12-17 Thread Edward Capriolo
The problem with checkpoint /2nn is that it happily "runs" and has no
outward indication that it is unable to connect.

Because you have a large edits file your startup will complete, however with
that size it could take hours. It logs nothing while this is going on, but
as long as the CPU is working that means it is progressing.

We have a nagios check on the size of this directory so if the edit rolling
stops we know about it.

On Saturday, December 17, 2011, Brock Noland  wrote:
> Hi,
>
> Since your using CDH2, I am moving this to CDH-USER. You can subscribe
here:
>
> http://groups.google.com/a/cloudera.org/group/cdh-user
>
> BCC'd common-user
>
> On Sat, Dec 17, 2011 at 2:01 AM, Meng Mao  wrote:
>> Maybe this is a bad sign -- the edits.new was created before the master
>> node crashed, and is huge:
>>
>> -bash-3.2$ ls -lh /hadoop/hadoop-metadata/cache/dfs/name/current
>> total 41G
>> -rw-r--r-- 1 hadoop hadoop 3.8K Jan 27  2011 edits
>> -rw-r--r-- 1 hadoop hadoop  39G Dec 17 00:44 edits.new
>> -rw-r--r-- 1 hadoop hadoop 2.5G Jan 27  2011 fsimage
>> -rw-r--r-- 1 hadoop hadoop8 Jan 27  2011 fstime
>> -rw-r--r-- 1 hadoop hadoop  101 Jan 27  2011 VERSION
>>
>> could this mean something was up with our SecondaryNameNode and rolling
the
>> edits file?
>
> Yes it looks like a checkpoint never completed. It's a good idea to
> monitor the mtime on fsimage to ensure it never gets too old.
>
> Has a checkpoint completed since you restarted?
>
> Brock
>


Re: Analysing Completed Job info programmatically apart from Jobtracker GUI

2011-12-14 Thread Edward Capriolo
I would check out hitune. I have a github project that connects to the
JobTracker and stores counters, job times and other stats into Cassandra.

https://github.com/edwardcapriolo/hadoop_cluster_profiler

Worth checking out as discovering how to connect and mine information from
the JobTracker was quite fun.
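
A bare-bones sketch of that kind of JobTracker mining with the old mapred
client API; which counters you pull out and where you store them is up to you:

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;
import org.apache.hadoop.mapred.RunningJob;

public class JobStats {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();        // picks up mapred.job.tracker from the config
    JobClient client = new JobClient(conf);
    for (JobStatus status : client.getAllJobs()) {
      RunningJob job = client.getJob(status.getJobID());
      if (job != null && job.isComplete()) {
        Counters counters = job.getCounters();
        System.out.println(job.getJobName());
        System.out.println(counters);    // dumps all counter groups and values
      }
    }
  }
}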

Edward



On Wed, Dec 14, 2011 at 9:40 AM, ArunKumar  wrote:

> Hi Guys !
>
> I want to analyse the completed Job counters like FILE/HDFS BYTES
> READ/WRITTEN along with other values like average map/reduce task run time.
> I see that Jobtracker GUI has this info but i want to programmatically
> retrieve these values instead of manually noting down these values and do
> some analysis. Can i do it in a simple/easier way ?
> I also see that Cloudera's HUE is good for this but is there anything
> equivalent in Hadoop.
>
> Can anyone guide me in this regard ?
>
> Arun
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Analysing-Completed-Job-info-programmatically-apart-from-Jobtracker-GUI-tp3585629p3585629.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>


Re: Matrix multiplication in Hadoop

2011-11-19 Thread Edward Capriolo
Sounds like a job for next gen map reduce, native libraries and GPUs. A
modern day Dr. Frankenstein for sure.

On Saturday, November 19, 2011, Tim Broberg  wrote:
> Perhaps this is a good candidate for a native library, then?
>
> 
> From: Mike Davis [xmikeda...@gmail.com]
> Sent: Friday, November 18, 2011 7:39 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Matrix multiplication in Hadoop
>
> On Friday, November 18, 2011, Mike Spreitzer  wrote:
>>  Why is matrix multiplication ill-suited for Hadoop?
>
> IMHO, a huge issue here is the JVM's inability to fully support cpu vendor
> specific SIMD instructions and, by extension, optimized BLAS routines.
> Running a large MM task using intel's MKL rather than relying on generic
> compiler optimization is orders of magnitude faster on a single multicore
> processor. I see almost no way that Hadoop could win such a CPU intensive
> task against an mpi cluster with even a tenth of the nodes running with a
> decently tuned BLAS library. Racing even against a single CPU might be
> difficult, given the i/o overhead.
>
> Still, it's a reasonably common problem and we shouldn't murder the good
in
> favor of the best. I'm certain a MM/LinAlg Hadoop library with even
> mediocre performance, wrt C, would get used.
>
> --
> Mike Davis
>
> The information and any attached documents contained in this message
> may be confidential and/or legally privileged.  The message is
> intended solely for the addressee(s).  If you are not the intended
> recipient, you are hereby notified that any use, dissemination, or
> reproduction is strictly prohibited and may be unlawful.  If you are
> not the intended recipient, please contact the sender immediately by
> return e-mail and destroy all copies of the original message.
>


Re: Matrix multiplication in Hadoop

2011-11-18 Thread Edward Capriolo
A problem with matrix multiplication in hadoop is that hadoop is row
oriented for the most part. I have thought about this use case, however, and
you can theoretically turn a 2D matrix into a 1D matrix, and then that fits
into the row oriented nature of hadoop. Also, given that the typical mapper
can have fairly large chunks of memory, like 1024MB, I have done work like
this before where I loaded such datasets into memory to process them. That
usage does not really fit the map reduce model.
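
For the 2D-to-1D flattening, the usual trick is just a row-major index
(illustrative only):

public class RowMajor {
  // Element (row, col) of a matrix with numCols columns maps to one flat offset,
  // so each matrix cell (or each whole row) can be emitted as a single key/value.
  public static long offset(long row, long col, long numCols) {
    return row * numCols + col;
  }
}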

I have been wanting to look at:
http://www.scidb.org/

Edward
On Fri, Nov 18, 2011 at 1:48 PM, Ayon Sinha  wrote:

> I'd really be interested in a comparison of Numpy/Octave/Matlab kind of
> tools with a Hadoop (lets say 4-10 large cloud servers) implementation with
> growing size of the matrix. I want to know the scale at which Hadoop really
> starts to pull away.
>
> -Ayon
> See My Photos on Flickr
> Also check out my Blog for answers to commonly asked questions.
>
>
>
> 
> From: Michel Segel 
> To: "common-user@hadoop.apache.org" 
> Sent: Friday, November 18, 2011 9:33 AM
> Subject: Re: Matrix multiplication in Hadoop
>
> Is Hadoop the best tool for doing large matrix math.
> Sure you can do it, but, aren't there better tools for these types of
> problems?
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Nov 18, 2011, at 10:59 AM, Mike Spreitzer  wrote:
>
> > Who is doing multiplication of large dense matrices using Hadoop?  What
> is
> > a good way to do that computation using Hadoop?
> >
> > Thanks,
> > Mike


Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Edward Capriolo
This directory can get very large; in many cases I doubt it would fit on a
RAM disk.

Also, RAM disks tend to help most with random read/write. Since Hadoop does
mostly linear IO, you may not see a great benefit from the RAM disk.



On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <
vino...@hortonworks.com> wrote:

> Must be related to some kind of permissions problems.
>
> It will help if you can paste the corresponding source code for
> FileUtil.copy(). Hard to track it with different versions, so.
>
> Thanks,
> +Vinod
>
>
> On Mon, Oct 3, 2011 at 9:28 PM, Raj V  wrote:
>
> > Eric
> >
> > Yes. The owner is hdfs and group is hadoop and the directory is group
> > writable(775).  This is tehe exact same configuration I have when I use
> real
> > disks.But let me give it a try again to see if I overlooked something.
> > Thanks
> >
> > Raj
> >
> > >
> > >From: Eric Caspole 
> > >To: common-user@hadoop.apache.org
> > >Sent: Monday, October 3, 2011 8:44 AM
> > >Subject: Re: pointing mapred.local.dir to a ramdisk
> > >
> > >Are you sure you have chown'd/chmod'd the ramdisk directory to be
> > writeable by your hadoop user? I have played with this in the past and it
> > should basically work.
> > >
> > >
> > >On Oct 3, 2011, at 10:37 AM, Raj V wrote:
> > >
> > >> Sending it to the hadoop mailing list - I think this is a hadoop
> related
> > problem and not related to Cloudera distribution.
> > >>
> > >> Raj
> > >>
> > >>
> > >> - Forwarded Message -
> > >>> From: Raj V 
> > >>> To: CDH Users 
> > >>> Sent: Friday, September 30, 2011 5:21 PM
> > >>> Subject: pointing mapred.local.dir to a ramdisk
> > >>>
> > >>>
> > >>> Hi all
> > >>>
> > >>>
> > >>> I have been trying some experiments to improve performance. One of
> the
> > experiments involved pointing mapred.local.dir to a RAM disk. To this end
> I
> > created a 128MB RAM disk ( each of my map outputs are smaller than this)
> but
> > I have not been able to get the task tracker to start.
> > >>>
> > >>>
> > >>> I am running CDH3B3 ( hadoop-0.20.2+737) and here the error message
> > from the task tracker log.
> > >>>
> > >>>
> > >>> Tasktracker logs
> > >>>
> > >>>
> > >>> 2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to
> > org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> > org.mortbay.log.Slf4jLog
> > >>> 2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added
> > global filtersafety
> > (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> > >>> 2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port
> > returned by webServer.getConnectors()[0].getLocalPort() before open() is
> -1.
> > Opening the listener on 50060
> > >>> 2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer:
> > listener.getLocalPort() returned 50060
> > webServer.getConnectors()[0].getLocalPort() returned 50060
> > >>> 2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty
> > bound to port 50060
> > >>> 2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
> > >>> 2011-09-30 16:50:02,388 INFO org.mortbay.log: Started
> > SelectChannelConnector@0.0.0.0:50060
> > >>> 2011-09-30 16:50:02,400 INFO
> > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater
> > with mapRetainSize=-1 and reduceRetainSize=-1
> > >>> 2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker:
> > Starting tasktracker with owner as mapred
> > >>> 2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker:
> Can
> > not start task tracker because java.lang.NullPointerException
> > >>> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
> > >>> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
> > >>> at
> >
> org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
> > >>> at
> >
> org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
> > >>> at
> >
> org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
> > >>> at
> >
> org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
> > >>> at
> > org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
> > >>> at
> > org.apache.hadoop.mapred.TaskTracker.(TaskTracker.java:1351)
> > >>> at
> > org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
> > >>>
> > >>>
> > >>> 2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker:
> > SHUTDOWN_MSG:
> > >>> /
> > >>> SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
> > >>>
> > >>>
> > >>> and here is my mapred-site.xml file
> > >>>
> > >>>
> > >>> 
> > >>> mapred.local.dir
> > >>> /ramdisk1
> > >>>   
> > >>>
> > >>>
> > >>> If I have a regular directory on a regular drive such as below - it
> > works. If I don't mount the ramdisk - it works.
> > >>>
> > 

Re: linux containers with Hadoop

2011-09-30 Thread Edward Capriolo
On Fri, Sep 30, 2011 at 9:03 AM, bikash sharma wrote:

> Hi,
> Does anyone knows if Linux containers (which are like kernel supported
> virtualization technique for providing resource isolation across
> process/appication) have ever been used with Hadoop to provide resource
> isolation for map/reduce tasks?
> If yes, what could be the up/down sides of such approach and how feasible
> it
> is in the context of Hadoop?
> Any pointers if any in terms of papers, etc would be useful.
>
> Thanks,
> Bikash
>

Previously Hadoop launched map/reduce tasks as a single user; now, with
security, tasks can launch as different users in the same OS/VM. I would say
the closest you can get to that kind of isolation is the work done with Mesos:
http://www.mesosproject.org/


Re: formatting hdfs without user interaction

2011-09-23 Thread Edward Capriolo
On Fri, Sep 23, 2011 at 11:52 AM,  wrote:

> Hi Harsh,
>
> On 9/22/11 8:48 PM, "Harsh J"  wrote:
>
> >Ivan,
> >
> >Writing your own program was overkill.
> >
> >The 'yes' coreutil is pretty silly, but nifty at the same time. It
> >accepts an argument, which it would repeat infinitely.
> >
> >So:
> >
> >$ yes Y | hadoop namenode -format
> >
> >Would do it for you.
>
> Nice!  I read the man page for yes too quickly and did not see that
> option.  Thanks!
>
>
> >(Note that in the future release, saner answers will be acceptable,
> >i.e. y instead of strictly Y, etc.)
>
> Y/y/yes/YES would all seem like good things to accept :)
>
>
> >Also, two other things:
> >
> >- What do you mean by 'Yeah I have a secondary namenode as well so 2
> >directories.'? A secondary namenode uses different directories than
> >dfs.name.dir.
>
> Which parameter are you referring to? I am planning on using 2 directories
> in dfs.name.dir, one is local and the other is an NFS mount of a 2nd
> machine running the secondary namenode.
>
>
> >- The prompt only appears when it detects a 'reformat' being happening
> >- which is very dangerous to do non-interactively. If you do the
> >-format the first time, on clean dfs.name.dir setups, you will never
> >receive a prompt.
>
> Yeah I am creating some automation, so it needs to be able to wipe out an
> existing filesystem and start over..
>
>
> Cheers,
> Ivan
>
>
>
> >
>
>
You might want to try expect scripting. You open a stream to the process and
can then wait for prompts and print replies. expect also has a feature,
autoexpect, which is a shell that records the streams and turns your terminal
interaction into an expect script.
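
A minimal expect sketch (untested; the prompt pattern below is an assumption
and the exact wording varies by Hadoop version):

#!/usr/bin/expect -f
# Answer the re-format confirmation prompt automatically.
spawn hadoop namenode -format
expect {
    -re "Re-format.*\\?" { send "Y\r"; exp_continue }
    eof
}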


Re: Hadoop doesnt use Replication Level of Namenode

2011-09-13 Thread Edward Capriolo
On Tue, Sep 13, 2011 at 5:53 AM, Steve Loughran  wrote:

> On 13/09/11 05:02, Harsh J wrote:
>
>> Ralf,
>>
>> There is no current way to 'fetch' a config at the moment. You have
>> the NameNode's config available at NNHOST:WEBPORT/conf page which you
>> can perhaps save as a resource (dynamically) and load into your
>> Configuration instance, but apart from this hack the only other ways
>> are the ones Bharath mentioned. This might lead to slow start ups of
>> your clients, but would give you the result you want.
>>
>
> I've done it a modified version of Hadoop, all it takes is a servlet in the
> NN. It even served up the live data of the addresses and ports a NN was
> running on, even if it didn't know in advance.
>
>
Another technique: if you are using a single replication factor on all files,
you can mark the property as final in the configuration of the NameNode and
DataNode. This will always override the client settings. However, in general
it is best to manage client configurations as carefully as you manage the
server ones, and to push the configuration clients MUST use with
puppet/cfengine etc. Essentially, do not count on a client to get these right,
because the risk is too high if they are set wrong, i.e. your situation:
"I thought everything was replicated 3 times."


Re: do HDFS files starting with _ (underscore) have special properties?

2011-09-02 Thread Edward Capriolo
On Fri, Sep 2, 2011 at 4:04 PM, Meng Mao  wrote:

> We have a compression utility that tries to grab all subdirs to a directory
> on HDFS. It makes a call like this:
> FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*"));
>
> and handles files vs dirs accordingly.
>
> We tried to run our utility against a dir containing a computed SOLR shard,
> which has files that look like this:
> -rw-r--r--   2 hadoopuser visible 8538430603 2011-09-01 18:58
> /test/output/solr-20110901165238/part-0/data/index/_ox.fdt
> -rw-r--r--   2 hadoopuser visible  233396596 2011-09-01 18:57
> /test/output/solr-20110901165238/part-0/data/index/_ox.fdx
> -rw-r--r--   2 hadoopuser visible130 2011-09-01 18:57
> /test/output/solr-20110901165238/part-0/data/index/_ox.fnm
> -rw-r--r--   2 hadoopuser visible 2147948283 2011-09-01 18:55
> /test/output/solr-20110901165238/part-0/data/index/_ox.frq
> -rw-r--r--   2 hadoopuser visible   87523726 2011-09-01 18:57
> /test/output/solr-20110901165238/part-0/data/index/_ox.nrm
> -rw-r--r--   2 hadoopuser visible  920936168 2011-09-01 18:57
> /test/output/solr-20110901165238/part-0/data/index/_ox.prx
> -rw-r--r--   2 hadoopuser visible   22619542 2011-09-01 18:58
> /test/output/solr-20110901165238/part-0/data/index/_ox.tii
> -rw-r--r--   2 hadoopuser visible 2070214402 2011-09-01 18:51
> /test/output/solr-20110901165238/part-0/data/index/_ox.tis
> -rw-r--r--   2 hadoopuser visible 20 2011-09-01 18:51
> /test/output/solr-20110901165238/part-0/data/index/segments.gen
> -rw-r--r--   2 hadoopuser visible282 2011-09-01 18:55
> /test/output/solr-20110901165238/part-0/data/index/segments_2
>
>
> The globStatus call seems only able to pick up those last 2 files; the
> several files that start with _ don't register.
>
> I've skimmed the FileSystem and GlobExpander source to see if there's
> anything related to this, but didn't see it. Google didn't turn up anything
> about underscores. Am I misunderstanding something about the regex patterns
> needed to pick these up or unaware of some filename convention in HDFS?
>

Files starting with '_' are considered 'hidden', like Unix files starting
with '.'. I did not know that for a very long time, because not everyone
follows this convention or even knows about it.
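
If you need to pick those files up anyway, one option is to list with an
explicit PathFilter so nothing is skipped. A minimal sketch against the
FileSystem API of that era (whether globStatus itself hides '_' files depends
on the version, but FileInputFormat definitely skips '_' and '.' prefixes):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class ListEverything {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path inputDir = new Path(args[0]);  // directory to list
        // Accept every name, including "hidden" ones starting with '_' or '.'
        PathFilter acceptAll = new PathFilter() {
            public boolean accept(Path p) {
                return true;
            }
        };
        for (FileStatus st : fs.globStatus(new Path(inputDir, "*"), acceptAll)) {
            System.out.println((st.isDir() ? "dir  " : "file ") + st.getPath());
        }
    }
}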


Re: Help - Rack Topology Script - Hadoop 0.20 (CDH3u1)

2011-08-21 Thread Edward Capriolo
On Sun, Aug 21, 2011 at 10:22 AM, Joey Echeverria  wrote:

> Not that I know of.
>
> -Joey
>
> On Fri, Aug 19, 2011 at 1:16 PM, modemide  wrote:
> > Ha, what a silly mistake.
> >
> > Thank you Joey.
> >
> > Do you also happen to know of an easier way to tell which racks the
> > jobtracker/namenode think each node is in?
> >
> >
> >
> > On 8/19/11, Joey Echeverria  wrote:
> >> Did you restart the JobTracker?
> >>
> >> -Joey
> >>
> >> On Fri, Aug 19, 2011 at 12:45 PM, modemide  wrote:
> >>> Hi all,
> >>> I've tried to make a rack topology script.  I've written it in python
> >>> and it works if I call it with the following arguments:
> >>> 10.2.0.1 10.2.0.11 10.2.0.11 10.2.0.12 10.2.0.21 10.2.0.26  10.2.0.31
> >>> 10.2.0.33
> >>>
> >>> The output is:
> >>> /rack0 /rack1 /rack1 /rack1 /rack2 /rack2 /rack3 /rack3
> >>> Should the output be on newlines or is any whitespace sufficient?
> >>>
> >>> Additionally, my cluster's datanodes have DNS names such as:
> >>> r1dn02
> >>> r2dn05
> >>> etc...
> >>>
> >>> I restarted the namenode in my running cluster (after configuring the
> >>> topology script setting in core-site.xml).
> >>> I ran a job and checked what the job tracker thinks the rack id's are
> >>> and it showed default-rack.
> >>> Can anyone tell me what I'm doing wrong?
> >>>
> >>> Thanks,
> >>> tim
> >>>
> >>
> >>
> >>
> >> --
> >> Joseph Echeverria
> >> Cloudera, Inc.
> >> 443.305.9434
> >>
> >
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>

If you run the HDFS balancer application, it displays the topology it learns
from the topology script. Assuming your JobTracker started with the same
configuration, you have your answer.


Re: Why hadoop should be built on JAVA?

2011-08-16 Thread Edward Capriolo
This should explain it http://jz10.java.no/java-4-ever-trailer.html .

On Tue, Aug 16, 2011 at 1:17 PM, Adi  wrote:

> >
> >
> >  > On Mon, Aug 15, 2011 at 9:00 PM, Chris Song  wrote:
> > >
> > > > Why hadoop should be built in JAVA?
> > > >
> > > > For integrity and stability, it is good for hadoop to be implemented
> in
> > > > Java
> > > >
> > > > But, when it comes to speed issue, I have a question...
> > > >
> > > > How will it be if HADOOP is implemented in C or Phython?
> > > >
> >
>
> I haven't used anything besides hadoop but in case you are interested in
> alternate (some of them non-java) M/R frameworks this list is a decent
> compilation of those
>
> https://sites.google.com/site/cloudcomputingsystem/research/programming-model
>
> Erlang/Python - http://discoproject.org/
> Ruby - http://skynet.rubyforge.org/
>
> -Adi
>


Re: YCSB Benchmarking for HBase

2011-08-03 Thread Edward Capriolo
On Wed, Aug 3, 2011 at 6:10 AM, praveenesh kumar wrote:

> Hi,
>
> Anyone working on YCSB (Yahoo Cloud Service Benchmarking) for HBase ??
>
> I am trying to run it, its giving me error:
>
> $ java -cp build/ycsb.jar com.yahoo.ycsb.CommandLine -db
> com.yahoo.ycsb.db.HBaseClient
>
> YCSB Command Line client
> Type "help" for command line help
> Start with "-help" for usage info
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/conf/Configuration
>at java.lang.Class.getDeclaredConstructors0(Native Method)
>at java.lang.Class.privateGetDeclaredConstructors(Class.java:2406)
>at java.lang.Class.getConstructor0(Class.java:2716)
>at java.lang.Class.newInstance0(Class.java:343)
>at java.lang.Class.newInstance(Class.java:325)
>at com.yahoo.ycsb.CommandLine.main(Unknown Source)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.conf.Configuration
>at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>at java.security.AccessController.doPrivileged(Native Method)
>at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>... 6 more
>
> By the error, it seems like its not able to get Hadoop-core.jar file, but
> its already in the class path.
> Has anyone worked on YCSB with hbase ?
>
> Thanks,
> Praveenesh
>


I just did
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/ycsb_cassandra_0_7_6.

For hbase I followed the steps here:
http://blog.lars-francke.de/2010/08/16/performance-testing-hbase-using-ycsb/

I also followed the comment at the bottom to make sure the hbase-site.xml
was on the classpath.

Startup script looks like this:
CP=build/ycsb.jar:db/hbase/conf/
for i in db/hbase/lib/* ; do
  CP=$CP:${i}
done
# -load  loads the workload
# -t     runs the workload
java -cp $CP com.yahoo.ycsb.Client -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloadb \


Re: One file per mapper

2011-07-06 Thread Edward Capriolo
On Tue, Jul 5, 2011 at 5:28 PM, Jim Falgout wrote:

> I've done this before by placing the name of each file to process into a
> single file (newline separated) and using the NLineInputFormat class as the
> input format. Run your job with the single file with all of the file names
> to process as the input. Each mapper will then be handed one line (this is
> tunable) from the single input file. The line will contain the name of the
> file to process.
>
> You can also write your own InputFormat class that creates a split for each
> file.
>
> Both of these options have scalability issues which begs the question: why
> one file per mapper?
>
> -Original Message-
> From: Govind Kothari [mailto:govindkoth...@gmail.com]
> Sent: Tuesday, July 05, 2011 3:04 PM
> To: common-user@hadoop.apache.org
> Subject: One file per mapper
>
> Hi,
>
> I am new to hadoop. I have a set of files and I want to assign each file to
> a mapper. Also in mapper there should be a way to know the complete path of
> the file. Can you please tell me how to do that ?
>
> Thanks,
> Govind
>
> --
> Govind Kothari
> Graduate Student
> Dept. of Computer Science
> University of Maryland College Park
>
> <---Seek Excellence, Success will Follow --->
>
>
You can also do this with MultipleInputs and MultipleOutputs classes. Each
source file can have a different mapper.


Re: Jobs are still in running state after executing "hadoop job -kill jobId"

2011-07-05 Thread Edward Capriolo
On Tue, Jul 5, 2011 at 11:45 AM, Juwei Shi  wrote:

> We sometimes have hundreds of map or reduce tasks for a job. I think it is
> hard to find all of them and kill the corresponding jvm processes. If we do
> not want to restart hadoop, is there any automatic methods?
>
> 2011/7/5 
>
> > Um kill  -9 "pid" ?
> >
> > -Original Message-
> > From: Juwei Shi [mailto:shiju...@gmail.com]
> > Sent: Friday, July 01, 2011 10:53 AM
> > To: common-user@hadoop.apache.org; mapreduce-u...@hadoop.apache.org
> > Subject: Jobs are still in running state after executing "hadoop job
> > -kill jobId"
> >
> > Hi,
> >
> > I faced a problem that the jobs are still running after executing
> > "hadoop
> > job -kill jobId". I rebooted the cluster but the job still can not be
> > killed.
> >
> > The hadoop version is 0.20.2.
> >
> > Any idea?
> >
> > Thanks in advance!
> >
> > --
> > - Juwei
> >
> >
>

I do not think they pop up very often, but after days and months of running,
orphans can be left alive. The way I would handle it is to write a check that
runs via Nagios (NRPE) and uses ps to look for Hadoop task processes older
than a certain age, such as 1 day or 1 week. Then you can decide whether you
want Nagios to terminate these orphans or do it by hand.
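
A minimal sketch of such a check (assumptions: the task JVMs run
org.apache.hadoop.mapred.Child and your procps supports the etimes column;
the one-day threshold is illustrative):

#!/bin/sh
# Warn on Hadoop task JVMs older than one day (86400 seconds).
MAX_AGE=86400
ps -eo pid,etimes,args | \
  awk -v max="$MAX_AGE" '/org\.apache\.hadoop\.mapred\.Child/ && $2 > max {
      printf "WARNING: task pid %s running for %s seconds\n", $1, $2
      found = 1
  }
  END { exit found ? 1 : 0 }'   # exit 1 so NRPE reports a WARNING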

Edward


Re: Jobs are still in running state after executing "hadoop job -kill jobId"

2011-07-05 Thread Edward Capriolo
On Tue, Jul 5, 2011 at 10:05 AM,  wrote:

> Um kill  -9 "pid" ?
>
> -Original Message-
> From: Juwei Shi [mailto:shiju...@gmail.com]
> Sent: Friday, July 01, 2011 10:53 AM
> To: common-user@hadoop.apache.org; mapreduce-u...@hadoop.apache.org
> Subject: Jobs are still in running state after executing "hadoop job
> -kill jobId"
>
> Hi,
>
> I faced a problem that the jobs are still running after executing
> "hadoop
> job -kill jobId". I rebooted the cluster but the job still can not be
> killed.
>
> The hadoop version is 0.20.2.
>
> Any idea?
>
> Thanks in advance!
>
> --
> - Juwei
>
>
This happens sometimes. A task gets orphaned from the TaskTracker and never
goes away. It is a good idea to have a Nagios check for very old tasks,
because the orphans slowly suck your memory away, especially if the task
launches with a big Xmx. You really *should not* need to be nuking tasks
like this, but occasionally it happens.

Edward


Re: hadoop 0.20.203.0 Java Runtime Environment Error

2011-07-01 Thread Edward Capriolo
That looks like an ancient version of Java. Get 1.6.0_24 or _25 from Oracle.

Upgrade to a recent Java and possibly update your C libraries.

Edward

On Fri, Jul 1, 2011 at 7:24 PM, Shi Yu  wrote:

> I had difficulty upgrading applications from Hadoop 0.20.2 to Hadoop
> 0.20.203.0.
>
> The standalone mode runs without problem.  In real cluster mode, the
> program freeze at map 0% reduce 0% and there is only one attempt  file in
> the log directory. The only information is contained in stdout file :
>
> #
> # An unexpected error has been detected by Java Runtime Environment:
> #
> #  SIGFPE (0x8) at pc=0x2ae751a87b83, pid=5801, tid=1076017504
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.6.0-b105 mixed mode)
> # Problematic frame:
> # C  [ld-linux-x86-64.so.2+0x7b83]
> #
> # An error report file with more information is saved as hs_err_pid5801.log
> #
> # If you would like to submit a bug report, please visit:
> #   
> http://java.sun.com/webapps/**bugreport/crash.jsp
>
>
> (stderr and syslog are empty)
>
> So what is the problem in ld-linux-x86-64.so.2+0x7b83 ?
>
> The program I was testing uses identity Mapper and Reducer, and the input
> file is a single 1M plain text file.   Then I found several hs_err logs
> under the default directory of hadoop, I attach the log file in this email.
>
> The reason I upgrade from 0.20.2 is I had lots of disk check error when
> processing TB data when the disks still have plenty of space. But then I was
> stuck at getting a simple toy problem to work in 0.20.203.0.
>
> Shi
>
>
>


Re: extremely imbalance in the hdfs cluster

2011-06-29 Thread Edward Capriolo
We have run into this issue as well. Since Hadoop writes round-robin,
different-size disks really screw things up royally, especially if you are
running at high capacity. We have found that decommissioning hosts for
stretches of time is more effective than the balancer in extreme situations.
Another hokey trick: the node that launches a job always uses itself as the
first replica, so you can launch jobs from your bigger machines, which makes
data more likely to be saved there. The super hokey solution is moving blocks
around with rsync! Block reports later happen and deal with this (I do not
suggest this).

Hadoop really does need a more intelligent system than round-robin writing
for heterogeneous systems; there might be a JIRA open on this somewhere. But
if you are on 0.20.X you have to work with it.

Edward

On Wed, Jun 29, 2011 at 9:06 AM, 茅旭峰  wrote:

> Hi,
>
> I'm running a 37 DN hdfs cluster. There are 12 nodes have 20TB capacity
> each
> node, and the other 25 nodes have 24TB each node.Unfortunately, there are
> several nodes that contain much more data than others, and I can still see
> the data increasing crazy. The 'dstat' shows
>
> dstat -ta 2
> -time- total-cpu-usage -dsk/total- -net/total- ---paging--
> ---system--
>  date/time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out |
> int   csw
> 24-06 00:42:43|  1   1  95   2   0   0|  25M   62M|   0 0 |   0   0.1
> |3532  5644
> 24-06 00:42:45|  7   1  91   0   0   0|  16k  176k|8346B 1447k|   0 0
> |1201   365
> 24-06 00:42:47|  7   1  91   0   0   0|  12k  172k|9577B 1493k|   0 0
> |1223   334
> 24-06 00:42:49| 11   3  83   1   0   1|  26M   11M|  78M   66M|   0 0 |
>  12k   18k
> 24-06 00:42:51|  4   3  90   1   0   2|  17M  181M| 117M   53M|   0 0 |
>  15k   26k
> 24-06 00:42:53|  4   3  87   4   0   2|  15M  375M| 117M   55M|   0 0 |
>  16k   26k
> 24-06 00:42:55|  3   2  94   1   0   1|  15M   37M|  80M   17M|   0 0 |
>  10k   15k
> 24-06 00:42:57|  0   0  98   1   0   0|  18M   23M|7259k 5988k|   0 0
> |1932  1066
> 24-06 00:42:59|  0   0  98   1   0   0|  16M  132M| 708k  106k|   0 0
> |1484   491
> 24-06 00:43:01|  4   2  91   2   0   1|  23M   64M|  76M   41M|   0 0
> |844113k
> 24-06 00:43:03|  4   3  88   3   0   1|  17M  207M|  91M   48M|   0 0 |
>  11k   16k
>
> From the result of dstat, we can see that the throughput of write is much
> more than read.
> I've started a balancer processor, with dfs.balance.bandwidthPerSec set to
> bytes. From
> the balancer log, I can see the balancer works well. But the balance
> operation can not
> catch up with the write operation.
>
> Now I can only stop the mad increase of data size by stopping the datanode,
> and setting
> dfs.datanode.du.reserved 300GB, then starting the datanode again. Until the
> total size
> reaches the 300GB reservation line, the increase stopped.
>
> The output of 'hadoop dfsadmin -report' shows for the crazy nodes,
>
> Name: 10.150.161.88:50010
> Decommission Status : Normal
> Configured Capacity: 20027709382656 (18.22 TB)
> DFS Used: 14515387866480 (13.2 TB)
> Non DFS Used: 0 (0 KB)
> DFS Remaining: 5512321516176(5.01 TB)
> DFS Used%: 72.48%
> DFS Remaining%: 27.52%
> Last contact: Wed Jun 29 21:03:01 CST 2011
>
>
> Name: 10.150.161.76:50010
> Decommission Status : Normal
> Configured Capacity: 20027709382656 (18.22 TB)
> DFS Used: 16554450730194 (15.06 TB)
> Non DFS Used: 0 (0 KB)
> DFS Remaining: 3473258652462(3.16 TB)
> DFS Used%: 82.66%
> DFS Remaining%: 17.34%
> Last contact: Wed Jun 29 21:03:02 CST 2011
>
> while the other normal datanode, it just like
>
> Name: 10.150.161.65:50010
> Decommission Status : Normal
> Configured Capacity: 23627709382656 (21.49 TB)
> DFS Used: 5953984552236 (5.42 TB)
> Non DFS Used: 1200643810004 (1.09 TB)
> DFS Remaining: 16473081020416(14.98 TB)
> DFS Used%: 25.2%
> DFS Remaining%: 69.72%
> Last contact: Wed Jun 29 21:03:01 CST 2011
>
>
> Name: 10.150.161.80:50010
> Decommission Status : Normal
> Configured Capacity: 23627709382656 (21.49 TB)
> DFS Used: 5982565373592 (5.44 TB)
> Non DFS Used: 1202701691240 (1.09 TB)
> DFS Remaining: 16442442317824(14.95 TB)
> DFS Used%: 25.32%
> DFS Remaining%: 69.59%
> Last contact: Wed Jun 29 21:03:02 CST 2011
>
> Any hint on this issue? We are using 0.20.2-cdh3u0.
>
> Thanks and regards,
>
> Mao Xu-Feng
>


Re: NameNode heapsize

2011-06-10 Thread Edward Capriolo
On Fri, Jun 10, 2011 at 8:22 AM, Brian Bockelman wrote:

>
> On Jun 10, 2011, at 6:32 AM, si...@ugcv.com wrote:
>
> > Dear all,
> >
> > I'm looking for ways to improve the namenode heap size usage of a
> 800-node 10PB testing Hadoop cluster that stores
> > around 30 million files.
> >
> > Here's some info:
> >
> > 1 x namenode: 32GB RAM, 24GB heap size
> > 800 x datanode:   8GB RAM, 13TB hdd
> >
> > *33050825 files and directories, 47708724 blocks = 80759549 total. Heap
> Size is 22.93 GB / 22.93 GB (100%) *
> >
> > From the cluster summary report, it seems the heap size usage is always
> full but couldn't drop, do you guys know of any ways
> > to reduce it ? So far I don't see any namenode OOM errors so it looks
> memory assigned for the namenode process is (just)
> > enough. But i'm curious which factors would account for the full use of
> heap size ?
> >
>
> The advice I give to folks is to plan on 1GB heap for every million
> objects.  It's an over-estimate, but I prefer to be on the safe side.  Why
> not increase the heap-size to 28GB?  Should buy you some time.
>
> You can turn on compressed pointers, but your best bet is really going to
> be spending some more money on RAM.
>
> Brian


The problem with the "buy more RAM" philosophy is that JVMs tend to have
trouble operating without long pauses on large heaps, and NameNode JVM pauses
are not a good thing. The number of files and the number of blocks drive heap
usage, so larger block sizes (and thus fewer blocks) help reduce NN memory
usage.

Also, your setup does not mention a secondary namenode. Do you have one?
It needs slightly more RAM than the NN.
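
If you do go down the bigger-heap road, the JVM knobs live in hadoop-env.sh.
A rough sketch (the sizes and GC flags are illustrative assumptions, not
tuned advice):

# Compressed oops shrink pointer overhead on 64-bit heaps under 32GB,
# and CMS reduces long stop-the-world pauses on a big NameNode heap.
export HADOOP_NAMENODE_OPTS="-Xmx24g -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx26g -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC"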


Re: Verbose screen logging on hadoop-0.20.203.0

2011-06-05 Thread Edward Capriolo
On Sun, Jun 5, 2011 at 1:04 PM, Shi Yu  wrote:

> We just upgraded from 0.20.2 to hadoop-0.20.203.0
>
> Running the same code ends up a massive amount of debug
> information on the screen output. Normally this type of
> information is written to logs/userlogs directory. However,
> nothing is written there now and seems everything is outputted
> to screen.
>
> We did set the HADOOP_LOG_DIR path in hadoop-env.sh
> Other settings are almost the same.
>
> What is the most likely reason triggering this verbose log
> output? How should it be turned off?
>
>
> Shi
>

I think what is happening is that you are using an older log4j.properties
file with a newer Hadoop. Since many classes have been renamed or removed,
you tend to get lots of output. The FSNamesystem audit logger (or something
similarly named) is always a good culprit for this. You can use the log-level
servlet to change these at runtime without restarting, then make the related
changes to log4j.properties.
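
For example, the log-level servlet can be driven with the daemonlog client.
The host, port, and logger name here are illustrative, and the audit logger's
exact name varies by version:

# Check, then quiet, a chatty logger at runtime with no restart.
hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit
hadoop daemonlog -setlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit WARN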


Hadoop Filecrusher! V2 Released!

2011-06-01 Thread Edward Capriolo
All,

You know the story:
You have data files that are created every 5 minutes.
You have hundreds of servers.
You want to put those files in hadoop.

Eventually:
You get lots of files and blocks.
Your namenode and secondary name node need more memory (BTW JVM's have
issues at large Xmx values).
Your map reduce jobs start launching too many tasks.

A solution:
Hadoop File Crusher
http://www.jointhegrid.com/hadoop_filecrush/index.jsp

How does it work?
Hadoop filecrusher uses map reduce to combine multiple smaller files into a
single larger one.

What was the deal with v1?
V1 was great. It happily crushed files, although some datasets presented
some challenges.
For example, the case where one partition of a hive table was very large and
others were smaller. V1 would allocate a reducer per folder and this job
would run as long as the biggest folder.
Also V1 ALWAYS created one file per directory, which is not optimal if a
directory already had maybe 2 largish files and crushing was not necessary.

How does v2 deal with this better?
V2 is more intelligent in its job planning. It has tunable parameters that
define which files are too small to crush, such as:

--threshold
  Percent threshold relative to the dfs block size over which a file
becomes eligible for crushing. Must be in (0, 1]. Default is 0.75,
which means files smaller than or equal to 75% of a dfs block will be
eligible for crushing. Files greater than 75% of a dfs block will be
left untouched.

--max-file-blocks
  The maximum number of dfs blocks per output file. Must be a positive
integer. Small input files are associated with an output file under
the assumption that input and output compression codecs have similar
efficiency. Also, a directory containing a lot of data in many small
files will be converted into a directory containing a smaller number of
large files rather than one super-massive file. With the default value of
8, 80 small files, each being 1/10th of a dfs block, will be grouped
into a single output file, since 80 * 1/10 = 8 dfs blocks. If there
are 81 small files, each being 1/10th of a dfs block, two output files
will be created: one will contain the combined contents of 41
files and the second will contain the combined contents of the other
40. A directory of many small files will be converted into a smaller
number of larger files, where each output file is roughly the same
size.

Why is file crushing optimal?
You cannot always control how many files are generated by upstream processes.
The namenode has file and block count constraints.
Jobs have less overhead with fewer files and run MUCH faster.

Usage documentation is found here:
http://www.jointhegrid.com/svn/filecrush/trunk/src/main/resources/help.txt

Enjoy!


Re: Why don't my jobs get preempted?

2011-05-31 Thread Edward Capriolo
On Tue, May 31, 2011 at 2:50 PM, W.P. McNeill  wrote:

> I'm launching long-running tasks on a cluster running the Fair Scheduler.
>  As I understand it, the Fair Scheduler is preemptive. What I expect to see
> is that my long-running jobs sometimes get killed to make room for other
> people's jobs. This never happens instead my long-running jobs hog mapper
> and reducer slots and starve other people out.
>
> Am I misunderstanding how the Fair Scheduler works?
>

Try adding preemption timeouts of 120 and 180 seconds to one of your pools
and see if that pool preempts other pools.
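
A sketch of what those two values might look like in a Fair Scheduler
allocations file. The element names are my assumption about which timeouts
were meant, and preemption itself also has to be enabled (e.g.
mapred.fairscheduler.preemption set to true in mapred-site.xml):

<?xml version="1.0"?>
<allocations>
  <pool name="mypool">
    <!-- seconds a pool may sit below its minimum share before preempting -->
    <minSharePreemptionTimeout>120</minSharePreemptionTimeout>
  </pool>
  <!-- seconds a pool may sit below half its fair share before preempting -->
  <fairSharePreemptionTimeout>180</fairSharePreemptionTimeout>
</allocations>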


Re: Hadoop and WikiLeaks

2011-05-22 Thread Edward Capriolo
On Sun, May 22, 2011 at 8:44 PM, Todd Lipcon  wrote:

> On Sun, May 22, 2011 at 5:10 PM, Edward Capriolo 
> wrote:
> >
> > Correct. But it is a place to discuss changing the content of
> > http://hadoop.apache.org which is what I am advocating.
> >
>
> Fair enough. Is anyone -1 on rephrasing the news item to "had the
> potential as a greater catalyst for innovation than other nominees..."
> (ie cutting out the mention of iPad/wikileaks?)
>
> If not, I will change it tomorrow.
>
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

This is a nice slug:

Described by the judging panel as a "Swiss army knife of the 21st century",
Apache Hadoop picked up the innovator of the year award for having the
potential to change the face of media innovations.


Re: Hadoop and WikiLeaks

2011-05-22 Thread Edward Capriolo
On Sun, May 22, 2011 at 7:48 PM, Konstantin Boudnik  wrote:

> On Sun, May 22, 2011 at 15:30, Edward Capriolo 
> wrote:
> but for the
> > reasons I outlined above I would not want to be associated with them at
> all.
>
> "I give no damn about your opinion, but I will defend your right to
> express it with my blood..."
>
> That said, please express such opinions not in the hadoop user list
> simply because common-user@hadoop.apache.org isn't a place to debate
> balances and checks.
>

Correct. But it is a place to discuss changing the content of
http://hadoop.apache.org which is what I am advocating.


Re: Hadoop and WikiLeaks

2011-05-22 Thread Edward Capriolo
On Sun, May 22, 2011 at 7:29 PM, Todd Lipcon  wrote:

> C'mon guys -- while this is of course an interesting debate, can we
> please keep it off common-user?
>
> -Todd
>
> On Sun, May 22, 2011 at 3:30 PM, Edward Capriolo 
> wrote:
> > On Sat, May 21, 2011 at 4:13 PM, highpointe 
> wrote:
> >
> >> >>> Does this copy text bother anyone else? Sure winning any award is
> great
> >> >>> but
> >> >>> does hadoop want to be associated with "innovation" like WikiLeaks?
> >> >>>
> >> >
> >>
> >> [Only] through the free distribution of information, the guaranteed
> >> integrity of said information and an aggressive system of checks and
> >> balances can man truly be free and hold the winning card.
> >>
> >> So...  YES. Hadoop should be considered an innovation that promotes the
> >> free flow of information and a statistical whistle blower.
> >>
> >> Take off your damn aluminum hat. If it doesn't work for you, it will
> work
> >> against you.
> >>
> >> On May 19, 2011, at 8:54 AM, James Seigel  wrote:
> >>
> >> >>>> Does this copy text bother anyone else? Sure winning any award is
> >> great
> >> >>>> but
> >> >>>> does hadoop want to be associated with "innovation" like WikiLeaks?
> >> >>>>
> >> >>>
> >>
> >
> > I do not know how to interpret your lame "aluminum hat" insult.
> >
> > As far as I am concerned WikiLeaks helped reveal classified US
> information
> > across the the internet. We can go back and forth about governments
> having
> > too much secret/classified information and what the public should know,
> > ...BUT... I believe that stealing and broadcasting secret documents is
> not
> > "innovation" and it surely put many lives at risk.
> >
> > I also believe that Wikileaks is tainted with Julian Assange's actions.
> >
> > *Dec 1 : The International Criminal Police Organisation or INTERPOL on
> > Wednesday said it has issued look out notice for arrest of WikiLeaks'
> owner
> > Julian Assange on suspicion of rape charges on the basis of the Swedish
> > Government's arrest warrant.*
> >
> > http://www.newkerala.com/news/world/fullnews-95693.html
> >
> > Those outside the US see wikileaks a different way they I do, but for the
> > reasons I outlined above I would not want to be associated with them at
> all.
> > Moreover, I believe there already is an aggressive system of checks and
> > balances in the US (it could be better of course) and we do not need
> > innovation like wikileaks offers to stay free, like open source the US is
> > always changing and innovating.
> >
> > Wikileaks represents irresponsible use of technology that should be
> avoided.
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Where should it go?


Re: Hadoop and WikiLeaks

2011-05-22 Thread Edward Capriolo
On Sat, May 21, 2011 at 4:13 PM, highpointe  wrote:

> >>> Does this copy text bother anyone else? Sure winning any award is great
> >>> but
> >>> does hadoop want to be associated with "innovation" like WikiLeaks?
> >>>
> >
>
> [Only] through the free distribution of information, the guaranteed
> integrity of said information and an aggressive system of checks and
> balances can man truly be free and hold the winning card.
>
> So...  YES. Hadoop should be considered an innovation that promotes the
> free flow of information and a statistical whistle blower.
>
> Take off your damn aluminum hat. If it doesn't work for you, it will work
> against you.
>
> On May 19, 2011, at 8:54 AM, James Seigel  wrote:
>
>  Does this copy text bother anyone else? Sure winning any award is
> great
>  but
>  does hadoop want to be associated with "innovation" like WikiLeaks?
> 
> >>>
>

I do not know how to interpret your lame "aluminum hat" insult.

As far as I am concerned, WikiLeaks helped reveal classified US information
across the internet. We can go back and forth about governments having
too much secret/classified information and what the public should know,
...BUT... I believe that stealing and broadcasting secret documents is not
"innovation" and it surely put many lives at risk.

I also believe that Wikileaks is tainted with Julian Assange's actions.

*Dec 1 : The International Criminal Police Organisation or INTERPOL on
Wednesday said it has issued look out notice for arrest of WikiLeaks' owner
Julian Assange on suspicion of rape charges on the basis of the Swedish
Government's arrest warrant.*

http://www.newkerala.com/news/world/fullnews-95693.html

Those outside the US see WikiLeaks a different way than I do, but for the
reasons I outlined above I would not want to be associated with them at all.
Moreover, I believe there already is an aggressive system of checks and
balances in the US (it could be better of course) and we do not need
innovation like wikileaks offers to stay free, like open source the US is
always changing and innovating.

Wikileaks represents irresponsible use of technology that should be avoided.


Re: Using df instead of du to calculate datanode space

2011-05-21 Thread Edward Capriolo
Good job. I brought this up in another thread, but was told it was not a
problem. Good thing I'm not crazy.

On Sat, May 21, 2011 at 12:42 AM, Joe Stein
wrote:

> I came up with a nice little hack to trick hadoop into calculating disk
> usage with df instead of du
>
>
> http://allthingshadoop.com/2011/05/20/faster-datanodes-with-less-wait-io-using-df-instead-of-du/
>
> I am running this in production, works like a charm and already
> seeing benefit, woot!
>
> I hope it works well for others too.
>
> /*
> Joe Stein
> http://www.twitter.com/allthingshadoop
> */
>


Re: Hadoop and WikiLeaks

2011-05-19 Thread Edward Capriolo
On Thu, May 19, 2011 at 11:54 AM, Ted Dunning  wrote:

> ZK started as sub-project of Hadoop.
>
> On Thu, May 19, 2011 at 7:27 AM, M. C. Srivas  wrote:
>
> > Interesting to note that Cassandra and ZK are now considered Hadoop
> > projects.
> >
> > There were independent of Hadoop before the recent update.
> >
> >
> > On Thu, May 19, 2011 at 4:18 AM, Steve Loughran 
> wrote:
> >
> > > On 18/05/11 18:05, javam...@cox.net wrote:
> > >
> > >> Yes!
> > >>
> > >> -Pete
> > >>
> > >>  Edward Capriolo  wrote:
> > >>
> > >> =
> > >> http://hadoop.apache.org/#What+Is+Apache%E2%84%A2+Hadoop%E2%84%A2%3F
> > >>
> > >> March 2011 - Apache Hadoop takes top prize at Media Guardian
> Innovation
> > >> Awards
> > >>
> > >> The Hadoop project won the "innovator of the year"award from the UK's
> > >> Guardian newspaper, where it was described as "had the potential as a
> > >> greater catalyst for innovation than other nominees including
> WikiLeaks
> > >> and
> > >> the iPad."
> > >>
> > >> Does this copy text bother anyone else? Sure winning any award is
> great
> > >> but
> > >> does hadoop want to be associated with "innovation" like WikiLeaks?
> > >>
> > >
> > >
> > > Ian updated the page yesterday with changes I'd put in for trademarks,
> > and
> > > I added this news quote directly from the paper. We could strip out the
> > > quote easily enough.
> > >
> > >
> >
>

Cassandra is not considered to be a Hadoop project or sub-project. The site
mentions "Other Hadoop-related projects at Apache include". The relation is
that Cassandra has input and output formats and other Hadoop support.


Hadoop and WikiLeaks

2011-05-18 Thread Edward Capriolo
http://hadoop.apache.org/#What+Is+Apache%E2%84%A2+Hadoop%E2%84%A2%3F

March 2011 - Apache Hadoop takes top prize at Media Guardian Innovation
Awards

The Hadoop project won the "innovator of the year" award from the UK's
Guardian newspaper, where it was described as "had the potential as a
greater catalyst for innovation than other nominees including WikiLeaks and
the iPad."

Does this copy text bother anyone else? Sure winning any award is great but
does hadoop want to be associated with "innovation" like WikiLeaks?

Edward


Re: Memory mapped resources

2011-04-11 Thread Edward Capriolo
On Mon, Apr 11, 2011 at 7:05 PM, Jason Rutherglen
 wrote:
> Yes you can however it will require customization of HDFS.  Take a
> look at HDFS-347 specifically the HDFS-347-branch-20-append.txt patch.
>  I have been altering it for use with HBASE-3529.  Note that the patch
> noted is for the -append branch which is mainly for HBase.
>
> On Mon, Apr 11, 2011 at 3:57 PM, Benson Margulies  
> wrote:
>> We have some very large files that we access via memory mapping in
>> Java. Someone's asked us about how to make this conveniently
>> deployable in Hadoop. If we tell them to put the files into hdfs, can
>> we obtain a File for the underlying file on any given node?
>>
>

This feature is not yet part of Hadoop, so doing this is not "convenient".


Re: How is hadoop going to handle the next generation disks?

2011-04-08 Thread Edward Capriolo
On Fri, Apr 8, 2011 at 12:24 PM, sridhar basam  wrote:
>
> BTW this is on systems which have a lot of RAM and aren't under high load.
> If you find that your system is evicting dentries/inodes from its cache, you
> might want to experiment with drop vm.vfs_cache_pressure from its default so
> that the they are preferred over the pagecache. At the extreme, setting it
> to 0 means they are never evicted.
>  Sridhar
>
> On Fri, Apr 8, 2011 at 11:37 AM, sridhar basam  wrote:
>>
>> How many files do you have per node? What i find is that most of my
>> inodes/dentries are almost always cached so calculating the 'du -sk' on a
>> host even with hundreds of thousands of files the du -sk generally uses high
>> i/o for a couple of seconds. I am using 2TB disks too.
>>  Sridhar
>>
>>
>> On Fri, Apr 8, 2011 at 12:15 AM, Edward Capriolo 
>> wrote:
>>>
>>> I have a 0.20.2 cluster. I notice that our nodes with 2 TB disks waste
>>> tons of disk io doing a 'du -sk' of each data directory. Instead of
>>> 'du -sk' why not just do this with java.io.file? How is this going to
>>> work with 4TB 8TB disks and up ? It seems like calculating used and
>>> free disk space could be done a better way.
>>>
>>> Edward
>>
>
>

Right. Most inodes are always cached when:

1) small disks
2) light load.

But that is not the case with hadoop.

Making the problem worse: Hadoop seems to issue 'du -sk' for all disks at
the same time, which pulverises the cache.

All this to calculate a size that is typically within 0.01% of what a df
estimate would tell us.
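
If you want to experiment with the vfs_cache_pressure idea from the quoted
post, a minimal sketch (the value 50 is illustrative; 0 means dentries and
inodes are never reclaimed, which can starve other memory users):

# Prefer keeping dentry/inode cache over page cache.
sysctl -w vm.vfs_cache_pressure=50
echo 'vm.vfs_cache_pressure = 50' >> /etc/sysctl.conf   # persist across reboots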


How is hadoop going to handle the next generation disks?

2011-04-07 Thread Edward Capriolo
I have a 0.20.2 cluster. I notice that our nodes with 2TB disks waste
tons of disk I/O doing a 'du -sk' of each data directory. Instead of
'du -sk', why not just do this with java.io.File? How is this going to
work with 4TB and 8TB disks and up? It seems like calculating used and
free disk space could be done a better way.

Edward


Re: Is anyone running Hadoop 0.21.0 on Solaris 10 X64?

2011-03-31 Thread Edward Capriolo
On Thu, Mar 31, 2011 at 10:43 AM, XiaoboGu  wrote:
> I have trouble browsing the file system vi namenode web interface, namenode 
> saying in log file that th –G option is invalid to get the groups for the 
> user.
>
>

I thought this was not the case any more but hadoop forks to the 'id'
command to figure out the groups for a user. You need to make sure the
output is what hadoop is expecting.


Re: how to get rid of attempt_201101170925_****_m_**** directories safely?

2011-03-17 Thread Edward Capriolo
On Thu, Mar 17, 2011 at 1:20 PM, jigar shah  wrote:
> Hi,
>
>    we are running a 50 node hadoop cluster and have a problem with these
> attempt directories piling up(for eg attempt_201101170925_126956_m_000232_0)
> and taking a lot of space. when i restart the tasktracker daemon these
> directories get cleaned out and the space usage goes down.
>
>    i understand why these directories are created, but it becomes a pain
> when they just hang around indefinitely. its very inconvenient to restart
> the tasktracker to get rid of them  and reclaim space.
>
>    anyone knows if there is a setting in the conf somewhere i can set that
> will periodically prune these directories or any other way to deal with
> this.
>
>    i appreciate any sort of help
>
>
>
> thanks
>
>
>

Something you can run from cron (system crontab format, with 'hadoop' as the
user field):
6 3 * * * hadoop find /disk6/hadoop_root/hdfs_data/hadoop/mapred/local -maxdepth 1 -name "attempt_*" -ctime +7 -delete


Re: check if a sequenceFile is corrupted

2011-03-17 Thread Edward Capriolo
On Thursday, March 17, 2011, Marc Sturlese  wrote:
> Is there any way to check if a seqfile is corrupted without iterate over all
> its keys/values till it crashes?
> I've seen that I can get an IOException when opening it or an IOException
> reading the X key/value (depending on when it was corrupted).
> Thanks in advance
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/check-if-a-sequenceFile-is-corrupted-tp2693230p2693230.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>

No, it does not seem possible to know whether a file is damaged without
reading it (logically).

hadoop dfs -text xx ; echo $?

should give you a non-zero exit status, but that still involves reading the
file.


Re: Anyone knows how to attach a figure on Hadoop Wiki page?

2011-03-14 Thread Edward Capriolo
On Mon, Mar 14, 2011 at 1:23 PM, He Chen  wrote:
> Hi all
>
> Any suggestions?
>
> Bests
>
> Chen
>

Images have been banned.


Re: Reason of Formatting Namenode

2011-03-10 Thread Edward Capriolo
On Thu, Mar 10, 2011 at 12:48 AM, Adarsh Sharma
 wrote:
> Thanks Harsh, i.e why if we again format namenode after loading some data
> INCOMATIBLE NAMESPACE ID's error occurs.
>
>
> Best Regards,
>
> Adarsh Sharma
>
>
>
>
> Harsh J wrote:
>>
>> Formatting the NameNode initializes the FSNameSystem in the
>> dfs.name.dir directories, to prepare for use.
>>
>> The format command typically writes a VERSION file that specifies what
>> the NamespaceID for this FS instance is, what was its ctime, and what
>> is the version (of the file's layout) in use.
>>
>> This is helpful in making every NameNode instance unique, among other
>> things. DataNode blocks carry the namespace-id information that lets
>> them relate blocks to a NameNode (and thereby validate, etc.).
>>
>>
>
>

If you do not tell your NN where to store data, it stores it in /tmp,
and your operating system cleans up /tmp.

The reason for the error you see is that datanodes don't like to suddenly
connect to new namenodes. So, as a safety measure, they do not start up until
they are cleared.


Re: Hadoop and image processing?

2011-03-03 Thread Edward Capriolo
On Thu, Mar 3, 2011 at 10:00 AM, Tom Deutsch  wrote:
> Along with Brian I'd also suggest it depends on what you are doing with
> the images, but we used Hadoop specifically for this purpose in several
> solutions we build to do advanced imaging processing. Both scale out
> ability to large data volumes and (in our case) compute to do the image
> classification was well suited to Hadoop.
>
>
> 
> Tom Deutsch
> Program Director
> CTO Office: Information Management
> Hadoop Product Manager / Customer Exec
> IBM
> 3565 Harbor Blvd
> Costa Mesa, CA 92626-1420
> tdeut...@us.ibm.com
>
>
>
>
> Brian Bockelman 
> 03/03/2011 06:42 AM
> Please respond to
> common-user@hadoop.apache.org
>
>
> To
> common-user@hadoop.apache.org
> cc
>
> Subject
> Re: Hadoop and image processing?
>
>
>
>
>
>
>
> On Mar 3, 2011, at 1:23 AM, nigelsande...@btconnect.com wrote:
>
>> How applicable would Hadoop be to the processing of thousands of large
> (60-100MB) 3D image files accessible via NFS, using a 100+ machine
> cluster?
>>
>> Does the idea have any merit at all?
>>
>
> It may be a good idea.  If you think the above is a viable architecture
> for data processing, then you likely don't "need" Hadoop because your
> problem is small enough, or you spent way too much money on your NFS
> server.
>
> Whether or not you "need" Hadoop for data scalability - petabytes of data
> moved at gigabytes a second - is a small aspect of the question.
>
> Hadoop is a good data processing platform in its own right.  Traditional
> batch systems tend to have very Unix-friendly APIs for data processing
> (you'll find yourself writing perl script that create text submit files,
> shell scripts, and C code), but appear clumsy to "modern developers" (this
> is speaking as someone who lives and breathes batch systems).  Hadoop has
> "nice" Java APIs and is Java developer friendly, has a lot of data
> processing concepts built in compared to batch systems, and extends OK to
> other langauges.
>
> If you write your image processing in Java, it would be silly to not
> consider Hadoop.  If you currently run a bag full of shell scripts and C++
> code, it's a tougher decision to make.
>
> Brian
>

It can't be done.

http://open.blogs.nytimes.com/2008/05/21/the-new-york-times-archives-amazon-web-services-timesmachine/

Just kidding :)


Re: recommendation on HDDs

2011-02-12 Thread Edward Capriolo
On Fri, Feb 11, 2011 at 7:14 PM, Ted Dunning  wrote:
> Bandwidth is definitely better with more active spindles.  I would recommend
> several larger disks.  The cost is very nearly the same.
>
> On Fri, Feb 11, 2011 at 3:52 PM, Shrinivas Joshi wrote:
>
>> Thanks for your inputs, Michael.  We have 6 open SATA ports on the
>> motherboards. That is the reason why we are thinking of 4 to 5 data disks
>> and 1 OS disk.
>> Are you suggesting use of one 2TB disk instead of four 500GB disks lets
>> say?
>> I thought that the HDFS utilization/throughput increases with the # of
>> disks
>> per node (assuming that the total usable IO bandwidth increases
>> proportionally).
>>
>> -Shrinivas
>>
>> On Thu, Feb 10, 2011 at 4:25 PM, Michael Segel > >wrote:
>>
>> >
>> > Shrinivas,
>> >
>> > Assuming you're in the US, I'd recommend the following:
>> >
>> > Go with 2TB 7200 SATA hard drives.
>> > (Not sure what type of hardware you have)
>> >
>> > What  we've found is that in the data nodes, there's an optimal
>> > configuration that balances price versus performance.
>> >
>> > While your chasis may hold 8 drives, how many open SATA ports are on the
>> > motherboard? Since you're using JBOD, you don't want the additional
>> expense
>> > of having to purchase a separate controller card for the additional
>> drives.
>> >
>> > I'm running Seagate drives at home and I haven't had any problems for
>> > years.
>> > When you look at your drive, you need to know total storage, speed
>> (rpms),
>> > and cache size.
>> > Looking at Microcenter's pricing... 2TB 3.0GB SATA Hitachi was $110.00 A
>> > 1TB Seagate was 70.00
>> > A 250GB SATA drive was $45.00
>> >
>> > So 2TB = 110, 140, 180 (respectively)
>> >
>> > So you get a better deal on 2TB.
>> >
>> > So if you go out and get more drives but of lower density, you'll end up
>> > spending more money and use more energy, but I doubt you'll see a real
>> > performance difference.
>> >
>> > The other thing is that if you want to add more disk, you have room to
>> > grow. (Just add more disk and restart the node, right?)
>> > If all of your disk slots are filled, you're SOL. You have to take out
>> the
>> > box, replace all of the drives, then add to cluster as 'new' node.
>> >
>> > Just my $0.02 cents.
>> >
>> > HTH
>> >
>> > -Mike
>> >
>> > > Date: Thu, 10 Feb 2011 15:47:16 -0600
>> > > Subject: Re: recommendation on HDDs
>> > > From: jshrini...@gmail.com
>> > > To: common-user@hadoop.apache.org
>> > >
>> > > Hi Ted, Chris,
>> > >
>> > > Much appreciate your quick reply. The reason why we are looking for
>> > smaller
>> > > capacity drives is because we are not anticipating a huge growth in
>> data
>> > > footprint and also read somewhere that larger the capacity of the
>> drive,
>> > > bigger the number of platters in them and that could affect drive
>> > > performance. But looks like you can get 1TB drives with only 2
>> platters.
>> > > Large capacity drives should be OK for us as long as they perform
>> equally
>> > > well.
>> > >
>> > > Also, the systems that we have can host up to 8 SATA drives in them. In
>> > that
>> > > case, would  backplanes offer additional advantages?
>> > >
>> > > Any suggestions on 5400 vs. 7200 vs. 1 RPM disks?  I guess 10K rpm
>> > disks
>> > > would be overkill comparing their perf/cost advantage?
>> > >
>> > > Thanks for your inputs.
>> > >
>> > > -Shrinivas
>> > >
>> > > On Thu, Feb 10, 2011 at 2:48 PM, Chris Collins <
>> > chris_j_coll...@yahoo.com>wrote:
>> > >
>> > > > Of late we have had serious issues with seagate drives in our hadoop
>> > > > cluster.  These were purchased over several purchasing cycles and
>> > pretty
>> > > > sure it wasnt just a single "bad batch".   Because of this we
>> switched
>> > to
>> > > > buying 2TB hitachi drives which seem to of been considerably more
>> > reliable.
>> > > >
>> > > > Best
>> > > >
>> > > > C
>> > > > On Feb 10, 2011, at 12:43 PM, Ted Dunning wrote:
>> > > >
>> > > > > Get bigger disks.  Data only grows and having extra is always good.
>> > > > >
>> > > > > You can get 2TB drives for <$100 and 1TB for < $75.
>> > > > >
>> > > > > As far as transfer rates are concerned, any 3GB/s SATA drive is
>> going
>> > to
>> > > > be
>> > > > > about the same (ish).  Seek times will vary a bit with rotation
>> > speed,
>> > > > but
>> > > > > with Hadoop, you will be doing long reads and writes.
>> > > > >
>> > > > > Your controller and backplane will have a MUCH bigger vote in
>> getting
>> > > > > acceptable performance.  With only 4 or 5 drives, you don't have to
>> > worry
>> > > > > about super-duper backplane, but you can still kill performance
>> with
>> > a
>> > > > lousy
>> > > > > controller.
>> > > > >
>> > > > > On Thu, Feb 10, 2011 at 12:26 PM, Shrinivas Joshi <
>> > jshrini...@gmail.com
>> > > > >wrote:
>> > > > >
>> > > > >> What would be a good hard drive for a 7 node cluster which is
>> > targeted
>> > > > to
>> > > > >> run a mix of IO and CPU intensive Hadoop workloads? We are looking
>> >

Re: Hadoop is for whom? Data architect or Java Architect or All

2011-01-27 Thread Edward Capriolo
On Thu, Jan 27, 2011 at 5:42 AM, Steve Loughran  wrote:
> On 27/01/11 07:28, Manuel Meßner wrote:
>>
>> Hi,
>>
>> you may want to take a look into the streaming api, which allows users
>> to write there map-reduce jobs with any language, which is capable of
>> writing to stdout and reading from stdin.
>>
>> http://hadoop.apache.org/mapreduce/docs/current/streaming.html
>>
>> furthermore pig and hive are hadoop related projects and may be of
>> interest for non java people:
>>
>> http://pig.apache.org/
>> http://hive.apache.org/
>>
>> So finally my answer: no it isn't ;)
>
> Helps if your ops team have some experience in running java app servers or
> similar, as well as large linux clusters
>

IMHO Hadoop is not a technology you want to use unless you have people
with Java experience on your staff, or you are willing to learn those
skills. Hadoop does not have a standard interface such as SQL. Working
with it involves reading the API, reading through source code, reading
blogs, etc.

I would say the average Hadoop user is also somewhat of a Hadoop
developer/administrator, whereas the average MySQL user, for example, has
never delved into source code.

In other words, if you work with Hadoop you are bound to see Java exceptions
and stack traces in common everyday usage.

This does not mean you have to know Java to use Hadoop, but to use it
very effectively I would suggest it.


Re: How to get metrics information?

2011-01-23 Thread Edward Capriolo
On Sat, Jan 22, 2011 at 9:59 PM, Ted Yu  wrote:
> In the test code, JobTracker is returned from:
>
>        mr = new MiniMRCluster(0, 0, 0, "file:///", 1, null, null, null,
> conf);
>        jobTracker = mr.getJobTrackerRunner().getJobTracker();
>
> I guess it is not exposed in non-test code.
>
> On Sat, Jan 22, 2011 at 6:38 PM, Zhenhua Guo  wrote:
>
>> Thanks!
>> How to get JobTracker object?
>>
>> Gerald
>>
>> On Sun, Jan 23, 2011 at 5:46 AM, Ted Yu  wrote:
>> > You can use the following code:
>> >        JobClient jc = new JobClient(jobConf);
>> >        int numReduces = jc.getClusterStatus().getMaxReduceTasks();
>> >
>> > For 0.20.3, you can use:
>> >    ClusterMetrics metrics = jobTracker.getClusterMetrics();
>> >
>> > On Sat, Jan 22, 2011 at 9:57 AM, Zhenhua Guo  wrote:
>> >
>> >> I want to get metrics information (e.g. number of Maps, number of
>> >> Reduces, memory use, load) by APIs. I found two useful classes -
>> >> ClusterStatus and ClusterMetrics. My question is how I can get
>> >> instances of that two classes? From JobClient or JobTracker? Any
>> >> suggested alternative way to get the information?
>> >>
>> >> Thanks
>> >>
>> >> Gerald
>> >>
>> >
>>
>

Correct, the JobTracker object is the JobTracker itself and it is not
exposed outside of test code. However, using JobClient you can mine most
of the information out of the JobTracker.

My cacti graphing package takes that exact approach to pull information
that is not available as a JMX counter, for example maps vs. reduces:
http://www.jointhegrid.com/hadoop-cacti-jtg-walk/maps_v_reduces.jsp
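
For example, a rough sketch of pulling a few numbers through JobClient
(0.20-era org.apache.hadoop.mapred API; this is an illustration, not the
cacti package itself):

import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Connect to the JobTracker named in the configuration on the classpath
// and print a few of the numbers ClusterStatus exposes.
public class ClusterStats {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();
    JobClient jc = new JobClient(conf);
    ClusterStatus status = jc.getClusterStatus();
    System.out.println("task trackers:   " + status.getTaskTrackers());
    System.out.println("running maps:    " + status.getMapTasks());
    System.out.println("running reduces: " + status.getReduceTasks());
    System.out.println("map capacity:    " + status.getMaxMapTasks());
    System.out.println("reduce capacity: " + status.getMaxReduceTasks());
  }
}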


Re: Hive rc location

2011-01-21 Thread Edward Capriolo
On Fri, Jan 21, 2011 at 9:56 AM, abhatna...@vantage.com
 wrote:
>
> Where is this file located?
>
> Also does anyone has a sample
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Hive-rc-tp2296028p2302262.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>
Please register for the Hive user or dev lists for this type of
question. The Hadoop user and common-user lists are not the ideal place
for it.

Hive will look for the file in two locations:
env[HIVE_HOME]/bin/.hiverc and property(user.home)/.hiverc

https://issues.apache.org/jira/browse/HIVE-1414

However, this feature is currently only in trunk, so if you are running
hive-0.6.0 you do not have .hiverc support yet.

Edward


Re: Why Hadoop is slow in Cloud

2011-01-19 Thread Edward Capriolo
On Wed, Jan 19, 2011 at 1:32 PM, Marc Farnum Rendino  wrote:
> On Tue, Jan 18, 2011 at 8:59 AM, Adarsh Sharma  
> wrote:
>> I want to know *AT WHAT COSTS  *it comes.
>> 10-15% is tolerable but at this rate, it needs some work.
>>
>> As Steve rightly suggest , I am in some CPU bound testing work to  know the
>>  exact stats.
>
> Yep; you've got to test your own workflow to see how it's affected by
> your conditions - lots of variables.
>
> BTW: For AWS (Amazon) there are significant differences in I/O, for
> different instance types; if I recall correctly, for best I/O, start
> no lower than m1.large. And the three storage types (instance, EBS,
> and S3) have different characteristics as well; I'd start with EBS,
> though I haven't worked much with S3 yet, and that does offer some
> benefits.
>
As for virtualization, paravirtualization, emulation (whatever
-ulization): there are always a lot of variables, but the net result is
always less. It may be 2%, 10%, or 15%, but it is always less. Take a
$50,000 server: such a solution shaves 10% of its performance right off
the top, and there goes $5,000.00 of performance right out the window. I
never thought throwing away performance was acceptable (I was born
without a silver SSD in my crib). Plus, some people even pay for
virtualization software (vendors will remain nameless), truly paying
for less.


Re: Why Hadoop is slow in Cloud

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 6:08 AM, Steve Loughran  wrote:
> On 17/01/11 04:11, Adarsh Sharma wrote:
>>
>> Dear all,
>>
>> Yesterday I performed a kind of testing between *Hadoop in Standalone
>> Servers* & *Hadoop in Cloud.
>>
>> *I establish a Hadoop cluster of 4 nodes ( Standalone Machines ) in
>> which one node act as Master ( Namenode , Jobtracker ) and the remaining
>> nodes act as slaves ( Datanodes, Tasktracker ).
>> On the other hand, for testing Hadoop in *Cloud* ( Euclayptus ), I made
>> one Standalone Machine as *Hadoop Master* and the slaves are configured
>> on the VM's in Cloud.
>>
>> I am confused about the stats obtained after the testing. What I
>> concluded that the VM are giving half peformance as compared with
>> Standalone Servers.
>
> Interesting stats, nothing that massively surprises me, especially as your
> benchmarks are very much streaming through datasets. If you were doing
> something more CPU intensive (graph work, for example), things wouldn't look
> so bad
>
> I've done stuff in this area.
> http://www.slideshare.net/steve_l/farming-hadoop-inthecloud
>
>
>
>>
>> I am expected some slow down but at this level I never expect. Would
>> this is genuine or there may be some configuration problem.
>>
>> I am using 1 GB (10-1000mb/s) LAN in VM machines and 100mb/s in
>> Standalone Servers.
>>
>> Please have a look on the results and if interested comment on it.
>>
>
>
> The big killer here is File IO, with today's HDD controllers and virtual
> filesystems, disk IO is way underpowered compared to physical disk IO.
> Networking is reduced (but improving), and CPU can be pretty good, but disk
> is bad.
>
>
> Why?
>
> 1.  Every access to a block in the VM is turned into virtual disk controller
> operations which are then interpreted by the VDC and turned into
> reads/writes in the virtual disk drive
>
> 2. which is turned into seeks, reads and writes in the physical hardware.
>
> Some workarounds
>
> -allocate physical disks for the HDFS filesystem, for the duration of the
> VMs.
>
> -have the local hosts serve up a bit of their filesystem on a fast protocol
> (like NFS), and have every VM mount the local physical NFS filestore as
> their hadoop data dirs.
>
>

Q: "Why is my Nintendo emulator slow on a 800 MHZ computer made 10
years after Nintendo?"
A: Emulation

Everything you emulate you cut X% performance right off the top.

Emulation is great when you want to run mac on windows or freebsd on
linux or nintendo on linux. However most people would do better with
technologies that use kernel level isolation such as Linux containers,
Solaris Zones, Linux VServer (my favorite) http://linux-vserver.org/,
User Mode Linux or similar technologies that ISOLATE rather then
EMULATE.

Sorry list I feel I rant about this bi-annually. I have just always
been so shocked about how many people get lured into cloud and
virtualized solutions for "better management" and "near native
performance"


Re: No locks available

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma  wrote:
> Harsh J wrote:
>>
>> Could you re-check your permissions on the $(dfs.data.dir)s for your
>> failing DataNode versus the user that runs it?
>>
>> On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma 
>> wrote:
>>
>>>
>>> Can i know why it occurs.
>>>
>>
>>
>
> Thanx Harsh , I know this issue and I cross-check several times permissions
> of of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ).
>
> It is 755 and is owned by hadoop user and group.
>
> I found that in failed datanode dir , it is unable to create 5 files in
> dfs.data.dir whereas on the other hand, it creates following files in
> successsful datanode :
>
> curent
> tmp
> storage
> in_use.lock
>
> Does it helps.
>
> Thanx
>

"No locks available" can mean that you are trying to use Hadoop on a
filesystem that does not support file-level locking. Are you trying to
run your namenode or datanode storage on NFS?
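
If you want to confirm whether the filesystem under a suspect
dfs.data.dir supports locking, a quick diagnostic is to try taking an
exclusive lock yourself, the same kind the DataNode takes on
in_use.lock. A minimal sketch (the path argument is hypothetical):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

// Try to take an exclusive lock on a scratch file; on an NFS mount
// without a working lock daemon this typically fails with
// "No locks available".
public class LockCheck {
  public static void main(String[] args) throws Exception {
    File f = new File(args.length > 0 ? args[0] : "/data/dfs/data/lock.test");
    RandomAccessFile raf = new RandomAccessFile(f, "rws");
    FileLock lock = raf.getChannel().tryLock();
    if (lock == null) {
      System.out.println("lock is already held by another process");
    } else {
      System.out.println("lock acquired; this filesystem supports locking");
      lock.release();
    }
    raf.close();
    f.delete();
  }
}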


Re: new mapreduce API and NLineInputFormat

2011-01-14 Thread Edward Capriolo
On Fri, Jan 14, 2011 at 5:05 PM, Attila Csordas  wrote:
> Hi,
>
> what other jars should be added to the build path from 0.21.0
> besides hadoop-common-0.21.0.jar in order to make 0.21.0 NLineInputFormat
> work in 0.20.2 as suggested below?
>
> Generally can somebody provide me a working example code?
>
> Thanks,
> Attila
>
>
> On Wed, Nov 10, 2010 at 5:06 AM, Harsh J  wrote:
>
>> Hi,
>>
>> On Tue, Nov 9, 2010 at 8:42 PM, Henning Blohm 
>> wrote:
>> >
>> > However in 0.20.2 you cannot call
>> >
>> > job.setInputFormatClass(NLineInputFormat.class);
>> >
>> > as NLineInputFormat does not extend the "right" InputFormat interface
>> > (in
>> > contrast to the 0.21 version).
>> >
>>
>> 0.20.2 does not have it. You can pull the implementation from 0.21.0
>> and use it from within your packages if you require it, though. There
>> should be no problems in doing it.
>>
>> Here's the file from the 0.21 branch:
>>
>> http://svn.apache.org/viewvc/hadoop/mapreduce/tags/release-0.21.0/src/java/org/apache/hadoop/mapreduce/lib/input/NLineInputFormat.java?view=co
>>
>> --
>> Harsh J
>> www.harshj.com
>>
>

You should not add 0.21 jars to the 0.20 classpath; it will likely
cause a conflict. Just put NLineInputFormat.java in your own project
and try to get it to compile.
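
Since a working example was asked for, here is a rough driver sketch. It
assumes you have copied the 0.21 NLineInputFormat source into your own
package (com.example.input and the paths are hypothetical, and the
lines-per-map property name must match whatever the copied source
actually reads):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import com.example.input.NLineInputFormat; // your local copy of the 0.21 source

public class NLineDriver {

  // Trivial mapper: echo each line keyed by its byte offset.
  public static class LineMapper
      extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws java.io.IOException, InterruptedException {
      ctx.write(value, key);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Property name must match what the copied class reads; in the 0.21
    // source it is mapreduce.input.lineinputformat.linespermap.
    conf.setInt("mapreduce.input.lineinputformat.linespermap", 10);

    Job job = new Job(conf, "nline-example");
    job.setJarByClass(NLineDriver.class);
    job.setMapperClass(LineMapper.class);
    job.setNumReduceTasks(0);
    job.setInputFormatClass(NLineInputFormat.class); // the local copy, not a 0.21 jar
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}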

Edward


Re: Topology : Script Based Mapping

2010-12-29 Thread Edward Capriolo
On Tue, Dec 28, 2010 at 11:36 PM, Hemanth Yamijala  wrote:
> Hi,
>
> On Tue, Dec 28, 2010 at 6:03 PM, Rajgopal Vaithiyanathan
>  wrote:
>> I wrote a script to map the IP's to a rack. The script is as follows. :
>>
>> for i in $* ; do
>>        topo=`echo $i | cut -d"." -f1,2,3 | sed 's/\./-/g'`
>>        topo=/rack-$topo" "
>>        final=$final$topo
>> done
>> echo $final
>>
>> I also did ` chmod +x topology_script.sh`
>>
>> I tried a sample data :
>>
>> [...@localhost bin]$ ./topology_script.sh 172.21.1.2 172.21.3.4
>> /rack-172-21-1 /rack-172-21-3
>>
>> I also made the change in core-site.xml as follows.
>>
>> <property>
>>  <name>topology.script.file.name</name>
>>  <value>$HOME/sw/hadoop-0.20.2/bin/topology_script.sh</value>
>> </property>
>>
>
> I am not sure if $HOME gets expanded automatically. Can you try it as
> ${HOME}, or in the worst case specify the expanded path.
>
> Thanks
> Hemanth
>> But while starting the cluster, The namenode logs shows the error (listed
>> below). and every IP gets mapped to the /default-rack
>>
>> Kindly help.:)
>> Thanks in advance.
>>
>> 2010-12-28 17:30:50,549 WARN org.apache.hadoop.net.ScriptBasedMapping:
>> java.io.IOException: Cannot run program
>> "$HOME/sw/hadoop-0.20.2/bin/topology_script.sh" (in directory
>> "/home/joa/sw/hadoop-0.20.2"): java.io.IOException: error=2, No such file or
>> directory
>>        at java.lang.ProcessBuilder.start(ProcessBuilder.java:474)
>>        at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
>>        at org.apache.hadoop.util.Shell.run(Shell.java:134)
>>        at
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:286)
>>        at
>> org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:148)
>>        at
>> org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:94)
>>        at
>> org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:59)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.resolveNetworkLocation(FSNamesystem.java:2158)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:2129)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.register(NameNode.java:687)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:616)
>>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:416)
>>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>> Caused by: java.io.IOException: java.io.IOException: error=2, No such file
>> or
>> directory
>>        at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
>>        at java.lang.ProcessImpl.start(ProcessImpl.java:81)
>>        at java.lang.ProcessBuilder.start(ProcessBuilder.java:467)
>>        ... 19 more
>>
>> --
>> Thanks and Regards,
>> Rajgopal Vaithiyanathan.
>>
>

$HOME is not expanded against shell or environment variables; values in
Hadoop configuration files are only expanded against other Hadoop
configuration properties. Use a full path.
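
For example, based on the path shown in the stack trace above, the entry
would look something like this (adjust it for your actual install
directory):

<property>
  <name>topology.script.file.name</name>
  <value>/home/joa/sw/hadoop-0.20.2/bin/topology_script.sh</value>
</property>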


Re: HDFS and libhfds

2010-12-07 Thread Edward Capriolo
2010/12/7 Petrucci Andreas :
>
> hello there, I'm trying to compile libhdfs but there are some
> problems. According to http://wiki.apache.org/hadoop/MountableHDFS I have
> already installed fuse. With ant compile-c++-libhdfs -Dlibhdfs=1 the build is
> successful.
>
> However when i try ant package -Djava5.home=... -Dforrest.home=... the build 
> fails and the output is the below :
>
>  [exec]
>     [exec] Exception in thread "main" java.lang.UnsupportedClassVersionError: 
> Bad version number in .class file
>     [exec]     at java.lang.ClassLoader.defineClass1(Native Method)
>     [exec]     at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
>     [exec]     at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
>     [exec]     at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
>     [exec]     at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
>     [exec]     at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
>     [exec]     at java.security.AccessController.doPrivileged(Native Method)
>     [exec]     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>     [exec]     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     [exec]     at 
> sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
>     [exec]     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>     [exec]     at 
> org.apache.avalon.excalibur.logger.DefaultLogTargetFactoryManager.configure(DefaultLogTargetFactoryManager.java:113)
>     [exec]     at 
> org.apache.avalon.framework.container.ContainerUtil.configure(ContainerUtil.java:201)
>     [exec]     at 
> org.apache.avalon.excalibur.logger.LogKitLoggerManager.setupTargetFactoryManager(LogKitLoggerManager.java:436)
>     [exec]     at 
> org.apache.avalon.excalibur.logger.LogKitLoggerManager.configure(LogKitLoggerManager.java:400)
>     [exec]     at 
> org.apache.avalon.framework.container.ContainerUtil.configure(ContainerUtil.java:201)
>     [exec]     at 
> org.apache.cocoon.core.CoreUtil.initLogger(CoreUtil.java:607)
>     [exec]     at org.apache.cocoon.core.CoreUtil.init(CoreUtil.java:169)
>     [exec]     at org.apache.cocoon.core.CoreUtil.<init>(CoreUtil.java:115)
>     [exec]     at 
> org.apache.cocoon.bean.CocoonWrapper.initialize(CocoonWrapper.java:128)
>     [exec]     at 
> org.apache.cocoon.bean.CocoonBean.initialize(CocoonBean.java:97)
>     [exec]     at org.apache.cocoon.Main.main(Main.java:310)
>     [exec] Java Result: 1
>     [exec]
>     [exec]   Copying broken links file to site root.
>     [exec]
>     [exec]
>     [exec] BUILD FAILED
>     [exec] /apache-forrest-0.8/main/targets/site.xml:175: Warning: Could not 
> find file /hadoop-0.20.2/src/docs/build/tmp/brokenlinks.xml to copy.
>     [exec]
>     [exec] Total time: 4 seconds
>
> BUILD FAILED
> /hadoop-0.20.2/build.xml:867: exec returned: 1
>
>
> any ideas what's wrong???
>

I never saw this usage:
-Djava5.home
Try
export JAVA_HOME=/usr/java

" Bad version number in .class file " means you are mixing and
matching java versions somehow.


Re: small files and number of mappers

2010-11-30 Thread Edward Capriolo
On Tue, Nov 30, 2010 at 3:21 AM, Harsh J  wrote:
> Hey,
>
> On Tue, Nov 30, 2010 at 4:56 AM, Marc Sturlese  
> wrote:
>>
>> Hey there,
>> I am doing some tests and wondering which are the best practices to deal
>> with very small files which are continuously being generated(1Mb or even
>> less).
>
> Have a read: http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>
>>
>> I see that if I have hundreds of small files in hdfs, hadoop automatically
>> will create A LOT of map tasks to consume them. Each map task will take 10
>> seconds or less... I don't know if it's possible to change the number of map
>> tasks from java code using the new API (I know it can be done with the old
>> one). I would like to do something like NumMapTasksCalculatedByHadoop * 0.3.
>> This way, less maps tasks would be instanciated and each would be working
>> more time.
>
> Perhaps you need to use MultiFileInputFormat:
> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>
> --
> Harsh J
> www.harshj.com
>

MultiFileInputFormat and CombineFileInputFormat help.
JVM re-use helps.
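
As a sketch of the JVM re-use knob in the 0.20-era old API (the class
name here is just illustrative):

import org.apache.hadoop.mapred.JobConf;

// Let one task JVM run many tasks of the same job, so lots of small
// input files do not each pay a JVM startup cost.
public class JvmReuse {
  public static void configure(JobConf conf) {
    conf.setNumTasksToExecutePerJvm(-1); // -1 = unlimited re-use within a job
    // equivalent to: conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);
  }
}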

The larger problem is that an average NameNode with 4GB of RAM will
start JVM pausing with a relatively low number of files/blocks, say
10,000,000. 10 million is not a large number when you are generating
thousands of files a day.

We open sourced a tool to deal with this problem.
http://www.jointhegrid.com/hadoop_filecrush/index.jsp

Essentially it takes a pass over a directory and combines multiple
files into one. On 'hourly' directories we run it after the hour is
closed out.

V2 (which we should throw over the fence in a week or so) uses the
same techniques but will be optimized for dealing with very large
directories and/or subdirectories of varying sizes by doing more
intelligent planning and grouping of which files an individual mapper
or reducer is going to combine.


Re: Caution using Hadoop 0.21

2010-11-13 Thread Edward Capriolo
On Sat, Nov 13, 2010 at 4:33 PM, Shi Yu  wrote:
> I agree with Steve. That's why I am still using 0.19.2 in my production.
>
> Shi
>
> On 2010-11-13 12:36, Steve Lewis wrote:
>>
>> Our group made a very poorly considered decision to build out cluster
>> using
>> Hadoop 0.21
>> We discovered that a number of programs written and running properly under
>> 0.20.2 did not work
>> under 0.21
>>
>> The first issue is that Mapper.Context and Reducer.Context and many of
>> their
>> superclasses were
>> converted from concrete classes to interfaces. This change, and I have
>> never
>> in 15 years of programming Java seen so major
>> a change to well known public classes is guaranteed to break any code
>> which
>> subclasses these objects.
>>
>> While it is a far better decision to make these classes interface, the
>> manner of the change and the fact that it is poorly
>> documented shows extraordinary poor judgement on the part of the Hadoop
>> developers
>>
>> http://lordjoesoftware.blogspot.com/
>>
>>
>
>
>

At times we have been frustrated by rapidly changing APIs.

# 23 August, 2010: release 0.21.0 available
# 26 February, 2010: release 0.20.2 available
# 14 September, 2009: release 0.20.1 available
# 23 July, 2009: release 0.19.2 available
# 22 April, 2009: release 0.20.0 available

By the standard major/minor/revision scheme, 0.20.X -> 0.21.X is a minor
release. However, since Hadoop has never had a major release, you might
consider 0.20 -> 0.21 to be a "major" release.

In any case, are you saying that in 15 years of coding you have never
seen an API change between minor releases? I think that is quite
common. It was also more than a year between 0.20.X and 0.21.X. Again,
it is common to expect a change in that time frame.


Re: 0.21 found interface but class was expected

2010-11-13 Thread Edward Capriolo
On Sat, Nov 13, 2010 at 9:50 PM, Todd Lipcon  wrote:
> We do have policies against breaking APIs between consecutive major versions
> except for very rare exceptions (eg UnixUserGroupInformation went away when
> security was added).
>
> We do *not* have any current policies that existing code can work against
> different major versions without a recompile in between. Switching an
> implementation class to an interface is a case where a simple recompile of
> the dependent app should be sufficient to avoid issues. For whatever reason,
> the JVM bytecode for invoking an interface method (invokeinterface) is
> different than invoking a virtual method in a class (invokevirtual).
>
> -Todd
>
> On Sat, Nov 13, 2010 at 5:28 PM, Lance Norskog  wrote:
>
>> It is considered good manners :)
>>
>> Seriously, if you want to attract a community you have an obligation
>> to tell them when you're going to jerk the rug out from under their
>> feet.
>>
>> On Sat, Nov 13, 2010 at 3:27 PM, Konstantin Boudnik 
>> wrote:
>> > It doesn't answer my question. I guess I will have to look for the answer
>> somewhere else
>> >
>> > On Sat, Nov 13, 2010 at 03:22PM, Steve Lewis wrote:
>> >> Java libraries are VERY reluctant to change major classes in a way that
>> >> breaks backward compatability -
>> >> NOTE that while the 0.18 packages are  deprecated, they are separate
>> from
>> >> the 0.20 packages allowing
>> >> 0.18 code to run on 0.20 systems - this is true of virtually all Java
>> >> libraries
>> >>
>> >> On Sat, Nov 13, 2010 at 3:08 PM, Konstantin Boudnik 
>> wrote:
>> >>
>> >> > As much as I love ranting I can't help but wonder if there were any
>> >> > promises
>> >> > to make 0.21+ be backward compatible with <0.20 ?
>> >> >
>> >> > Just curious?
>> >> >
>> >> > On Sat, Nov 13, 2010 at 02:50PM, Steve Lewis wrote:
>> >> > > I have a long rant at http://lordjoesoftware.blogspot.com/ on this
>> but
>> >> > > the moral is that there seems to have been a deliberate decision
>> that
>> >> >  0,20
>> >> > > code will may not be comparable with -
>> >> > > I have NEVER seen a major library so directly abandon backward
>> >> > compatability
>> >> > >
>> >> > >
>> >> > > On Fri, Nov 12, 2010 at 8:04 AM, Sebastian Schoenherr <
>> >> > > sebastian.schoenh...@student.uibk.ac.at> wrote:
>> >> > >
>> >> > > > Hi Steve,
>> >> > > > we had a similar problem. We've compiled our code with version
>> 0.21 but
>> >> > > > included the wrong jars into the classpath. (version 0.20.2;
>> >> > > > NInputFormat.java). It seems that Hadoop changed this class to an
>> >> > interface,
>> >> > > > maybe you've a simliar problem.
>> >> > > > Hope this helps.
>> >> > > > Sebastian
>> >> > > >
>> >> > > >
>> >> > > > Zitat von Steve Lewis :
>> >> > > >
>> >> > > >
>> >> > > >  Cassandra sees this error with 0.21 of hadoop
>> >> > > >>
>> >> > > >> Exception in thread "main"
>> java.lang.IncompatibleClassChangeError:
>> >> > Found
>> >> > > >> interface org.apache.hadoop.mapreduce.JobContext, but class was
>> >> > expected
>> >> > > >>
>> >> > > >> I see something similar
>> >> > > >> Error: Found interface
>> >> > org.apache.hadoop.mapreduce.TaskInputOutputContext,
>> >> > > >> but class was expected
>> >> > > >>
>> >> > > >> I find this especially puzzling
>> >> > > >> since org.apache.hadoop.mapreduce.TaskInputOutputContext IS a
>> class
>> >> > not an
>> >> > > >> interface
>> >> > > >>
>> >> > > >> Does anyone have bright ideas???
>> >> > > >>
>> >> > > >> --
>> >> > > >> Steven M. Lewis PhD
>> >> > > >> 4221 105th Ave Ne
>> >> > > >> Kirkland, WA 98033
>> >> > > >> 206-384-1340 (cell)
>> >> > > >> Institute for Systems Biology
>> >> > > >> Seattle WA
>> >> > > >>
>> >> > > >>
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Steven M. Lewis PhD
>> >> > > 4221 105th Ave Ne
>> >> > > Kirkland, WA 98033
>> >> > > 206-384-1340 (cell)
>> >> > > Institute for Systems Biology
>> >> > > Seattle WA
>> >> >
>> >> > -BEGIN PGP SIGNATURE-
>> >> > Version: GnuPG v1.4.10 (GNU/Linux)
>> >> >
>> >> > iF4EAREIAAYFAkzfGnwACgkQenyFlstYjhK6RwD+IdUVZuqXACV9+9By7fMiy/MO
>> >> > Uxyt4o4Z4naBzvjMu0cBAMkHLuHFHxuM5Yzb7doeC8eAzq+brsBzVHDKGeUD5FG4
>> >> > =dr5x
>> >> > -END PGP SIGNATURE-
>> >> >
>> >> >
>> >>
>> >>
>> >> --
>> >> Steven M. Lewis PhD
>> >> 4221 105th Ave Ne
>> >> Kirkland, WA 98033
>> >> 206-384-1340 (cell)
>> >> Institute for Systems Biology
>> >> Seattle WA
>> >
>> > -BEGIN PGP SIGNATURE-
>> > Version: GnuPG v1.4.10 (GNU/Linux)
>> >
>> > iF4EAREIAAYFAkzfHswACgkQenyFlstYjhKtUAD+Nu/gL5DQ+v9iC89dIaHltvCK
>> > Oa6HOwVWNXWksUxhZhgBAMueLiItX6y4jhCKA5xCOqAmfxA0KZpTkyZr4+ozrazg
>> > =wScC
>> > -END PGP SIGNATURE-
>> >
>> >
>>
>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Cloudera and Yahoo have backported the interesting security
components and some enhancements into their 0.20-based distributions.
Out of the box, I know hive does not (yet

Re: Why hadoop is written in java?

2010-10-12 Thread Edward Capriolo
On Tue, Oct 12, 2010 at 12:20 AM, Chris Dyer  wrote:
> The Java memory overhead is a quite serious problem, and a legitimate
> and serious criticism of Hadoop. For MapReduce applications, it is
> often (although not always) possible to improve performance by doing
> more work in memory (e.g., using combiners and the like) before
> emitting data. Thus, the more memory available to your application,
> the more efficient it runs. Therefore, if you have a framework that
> locks up 500mb rather than 50mb, you systematically get less
> performance out of your cluster.
>
> The second issue is that C/C++ bindings are common and widely used
> from many languages, but it is not generally possible to interface
> directly with Java (or Java libraries) from another language, unless
> that language is also built on top of the JVM. This is a very
> unfortunate because many problems that would be quite naturally
> expressed in MapReduce are better solved in non-JVM languages.
>
> But, Java is what we have, and it works well enough for many things.
>
> On Mon, Oct 11, 2010 at 11:18 PM, Dhruba Borthakur  wrote:
>> I agree with others in this list that Java provides faster software
>> development, the IO cost in Java is practically the same as in C/C++, etc.
>> In short, most pieces of distributed software can be written in Java without
>> any performance hiccups, as long as it is only system metadata that is
>> handled by Java.
>>
>> One problem is when data-flow has to occur in Java. Each record that is read
>> from the storage has to be de-serialized, uncompressed and then processed.
>> This processing could be very slow in Java compared to when written in other
>> languages, especially because of the creation/destruction of too many
>> objects.  It would have been nice if the map/reduce task could have been
>> written in C/C++, or better still, if the sorting inside the MR framework
>> could occur in C/C++.
>>
>> thanks,
>> dhruba
>>
>> On Mon, Oct 11, 2010 at 4:50 PM, helwr  wrote:
>>
>>>
>>> Check out this thread:
>>> https://www.quora.com/Why-was-Hadoop-written-in-Java
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Why-hadoop-is-written-in-java-tp1673148p1684291.html
>>> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>>>
>>
>>
>>
>> --
>> Connect to me at http://www.facebook.com/dhruba
>>
>

Hate to say it this way... but this is yet another "Java is slow
compared to the equivalent, non-existent C/C++ alternative" argument.
Until http://code.google.com/p/qizmt/ wins the TeraSort benchmark or
Google open sources Google MapReduce, the comparison is theoretical. I
am sure that if someone coded Hadoop in assembler it would trump the
theoretical Hadoop written in C as well.


Re: hd fs -head?

2010-09-27 Thread Edward Capriolo
On Mon, Sep 27, 2010 at 11:13 AM, Keith Wiley  wrote:
> On 2010, Sep 27, at 7:02 AM, Edward Capriolo wrote:
>
>> On Mon, Sep 27, 2010 at 3:23 AM, Keith Wiley 
>> wrote:
>>>
>>> Is there a particularly good reason for why the "hadoop fs" command
>>> supports
>>> -cat and -tail, but not -head?
>>>
>>
>> Tail is needed to be done efficiently but head you can just do
>> yourself. Most people probably use
>>
>> hadoop dfs -cat file | head -5.
>
>
> I disagree with your use of the word "efficiently".  :-)  To my
> understanding (and perhaps that's the source of my error), the approach you
> suggested reads the entire file over the net from the cluster to your client
> machine.  That file could conceivably be of HDFS scales (100s of GBs, even
> TBs wouldn't be uncommon).
>
> What do you think?  Am I wrong in my interpretation of how
> hadoopCat-pipe-head would work?
>
> Cheers!
>
> 
> Keith Wiley     kwi...@keithwiley.com     keithwiley.com
>  music.keithwiley.com
>
> "And what if we picked the wrong religion?  Every week, we're just making
> God
> madder and madder!"
>                                           --  Homer Simpson
> 
>
>

'hadoop dfs -cat' will output the file as it is read. head -5 will
kill the first half of the pipe after 5 lines. With buffering, more
than 5 lines might be physically read, but this invocation does not
read the entire HDFS file before piping it to head.
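
If you do want a head that stops reading at the source, a small client
does the trick. A rough sketch (argument handling kept minimal):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Read only the first N lines of an HDFS file, then close the stream,
// so only the data actually consumed (plus read-ahead) crosses the wire.
public class HdfsHead {
  public static void main(String[] args) throws Exception {
    int n = args.length > 1 ? Integer.parseInt(args[1]) : 5;
    FileSystem fs = FileSystem.get(new Configuration());
    BufferedReader r = new BufferedReader(
        new InputStreamReader(fs.open(new Path(args[0]))));
    try {
      String line;
      for (int i = 0; i < n && (line = r.readLine()) != null; i++) {
        System.out.println(line);
      }
    } finally {
      r.close();
    }
  }
}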


Re: A new way to merge up those small files!

2010-09-27 Thread Edward Capriolo
Ted,

Good point. Patches are welcome :) I will add it onto my to-do list.
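
For reference, a rough sketch of the detection you describe: a
SequenceFile starts with the magic bytes 'S', 'E', 'Q', so peeking at
the first three bytes is enough to pick a default (this is an
illustration, not the filecrush implementation):

import java.io.DataInputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Peek at the first three bytes of a file and guess its type.
public class TypeSniff {
  public static boolean isSequenceFile(FileSystem fs, Path p) throws Exception {
    DataInputStream in = fs.open(p);
    try {
      byte[] magic = new byte[3];
      in.readFully(magic); // throws EOFException for files under 3 bytes
      return magic[0] == 'S' && magic[1] == 'E' && magic[2] == 'Q';
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    System.out.println(isSequenceFile(fs, new Path(args[0])) ? "SEQUENCE" : "TEXT");
  }
}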

Edward

On Sat, Sep 25, 2010 at 12:05 PM, Ted Yu  wrote:
> Edward:
> Thanks for the tool.
>
> I think the last parameter can be omitted if you follow what hadoop fs -text
> does.
> It looks at a file's magic number so that it can attempt to *detect* the
> type of the file.
>
> Cheers
>
> On Fri, Sep 24, 2010 at 11:41 PM, Edward Capriolo 
> wrote:
>
>> Many times a hadoop job produces a file per reducer and the job has
>> many reducers. Or a map only job one output file per input file and
>> you have many input files. Or you just have many small files from some
>> external process. Hadoop has sub optimal handling of small files.
>> There are some ways to handle this inside a map reduce program,
>> IdentityMapper + IdentityReducer for example, or multi outputs However
>> we wanted a tool that could be used by people using hive, or pig, or
>> map reduce. We wanted to allow people to combine a directory with
>> multiple files or a hierarchy of directories like the root of a hive
>> partitioned table. We also wanted to be able to combine text or
>> sequence files.
>>
>> What we came up with is the filecrusher.
>>
>> Usage:
>> /usr/bin/hadoop jar filecrush.jar crush.Crush /directory/to/compact
>> /user/edward/backup 50 SEQUENCE
>> (50 is the number of mappers here)
>>
>> Code is Apache V2 and you can get it here:
>> http://www.jointhegrid.com/hadoop_filecrush/index.jsp
>>
>> Enjoy,
>> Edward
>>
>


Re: hd fs -head?

2010-09-27 Thread Edward Capriolo
On Mon, Sep 27, 2010 at 3:23 AM, Keith Wiley  wrote:
> Is there a particularly good reason for why the "hadoop fs" command supports
> -cat and -tail, but not -head?
>
> 
> Keith Wiley     kwi...@keithwiley.com     keithwiley.com
>  music.keithwiley.com
>
> "I do not feel obliged to believe that the same God who has endowed us with
> sense, reason, and intellect has intended us to forgo their use."
>                                           --  Galileo Galilei
> 
>
>

Tail needs to be implemented efficiently, but head you can just do
yourself. Most people probably use

hadoop dfs -cat file | head -5.


A new way to merge up those small files!

2010-09-24 Thread Edward Capriolo
Many times a Hadoop job produces a file per reducer and the job has
many reducers. Or a map-only job produces one output file per input
file and you have many input files. Or you just have many small files
from some external process. Hadoop has sub-optimal handling of small
files. There are some ways to handle this inside a map-reduce program,
IdentityMapper + IdentityReducer for example, or multiple outputs.
However, we wanted a tool that could be used by people using Hive, Pig,
or plain map-reduce. We wanted to allow people to combine a directory
with multiple files, or a hierarchy of directories like the root of a
Hive partitioned table. We also wanted to be able to combine text or
sequence files.
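
As a point of comparison, a bare-bones sketch of the IdentityMapper +
IdentityReducer approach mentioned above (old 0.20 API; the paths and
reducer count are illustrative only):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Pass every record through unchanged; the number of reducers decides
// how many (larger) output files you end up with. Note that with the
// default TextInputFormat/TextOutputFormat the byte-offset keys end up
// in the output, which is one reason a dedicated tool is nicer.
public class MergeSmallFiles {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MergeSmallFiles.class);
    conf.setJobName("merge-small-files");
    conf.setMapperClass(IdentityMapper.class);
    conf.setReducerClass(IdentityReducer.class);
    conf.setNumReduceTasks(5); // 5 output files instead of hundreds
    conf.setOutputKeyClass(LongWritable.class);
    conf.setOutputValueClass(Text.class);
    FileInputFormat.setInputPaths(conf, new Path("/data/small-files"));
    FileOutputFormat.setOutputPath(conf, new Path("/data/merged"));
    JobClient.runJob(conf);
  }
}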

What we came up with is the filecrusher.

Usage:
/usr/bin/hadoop jar filecrush.jar crush.Crush /directory/to/compact
/user/edward/backup 50 SEQUENCE
(50 is the number of mappers here)

Code is Apache V2 and you can get it here:
http://www.jointhegrid.com/hadoop_filecrush/index.jsp

Enjoy,
Edward


Re: Cannot run program "bash": java.io.IOException

2010-09-18 Thread Edward Capriolo
This happens because a forked child process initially needs to reserve
the same amount of memory as its parent JVM. One way to solve this is to
enable memory overcommit on your Linux system (the vm.overcommit_memory
sysctl).

On Sat, Sep 18, 2010 at 4:47 AM, Bradford Stephens
 wrote:
> Hey guys,
>
> I'm running into issues when doing a moderate-size EMR job on 12 m1.large
> nodes. Mappers and Reducers will randomly fail.
>
> The EMR defaults are 2 mappers / 2 reducers per node. I've tried running
> with mapred.child.opts set in the jobconf to -Xmx256m and -Xmx1024m. No
> difference.
>
> There are about 1,000 map tasks. Not very much data, maybe 50G at most?
>
> My job fails to complete. Looking in syslog shows this:
>
> java.io.IOException: Cannot run program "bash": java.io.IOException:
> error=12, Cannot allocate memory
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
>  at org.apache.hadoop.util.Shell.run(Shell.java:134)
> at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
>  at
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
> at
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>  at
> org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
> at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1238)
>  at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1146)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:365)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:312)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.IOException: java.io.IOException: error=12, Cannot
> allocate memory
> at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:65)
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>  ... 11 more
>
>
>
> I would ask the EMR forums, but I think I may get faster feedback here :)
>
>
> --
> Bradford Stephens,
> Founder, Drawn to Scale
> drawntoscalehq.com
> 727.697.7528
>
> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data solution.
> Process, store, query, search, and serve all your data.
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social Media,
> and Computer Science
>

