Re: EOFException

2012-03-15 Thread Gopal

On 03/15/2012 03:06 PM, Mohit Anchlia wrote:

When I start a job to read data from HDFS I start getting these errors.
Does anyone know what this means and how to resolve it?

2012-03-15 10:41:31,402 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010 java.io.EOFException
2012-03-15 10:41:31,402 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-6402969611996946639_11837
2012-03-15 10:41:31,403 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.204:50010
2012-03-15 10:41:31,406 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010 java.io.EOFException
2012-03-15 10:41:31,406 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-5442664108986165368_11838
2012-03-15 10:41:31,407 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010 java.io.EOFException
2012-03-15 10:41:31,407 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-3373089616877234160_11838
2012-03-15 10:41:31,407 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.198:50010
2012-03-15 10:41:31,409 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.197:50010
2012-03-15 10:41:31,410 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.204:50010 java.io.EOFException
2012-03-15 10:41:31,410 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_4481292025401332278_11838
2012-03-15 10:41:31,411 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.204:50010
2012-03-15 10:41:31,412 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.200:50010 java.io.EOFException
2012-03-15 10:41:31,412 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-5326771177080888701_11838
2012-03-15 10:41:31,413 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.200:50010
2012-03-15 10:41:31,414 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.197:50010 java.io.EOFException
2012-03-15 10:41:31,414 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-8073750683705518772_11839
2012-03-15 10:41:31,415 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.197:50010
2012-03-15 10:41:31,416 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.199:50010 java.io.EOFException
2012-03-15 10:41:31,416 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 164.28.62.198:50010 java.io.EOFException
2012-03-15 10:41:31,416 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_441003866688859169_11838
2012-03-15 10:41:31,416 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-466858474055876377_11839
2012-03-15 10:41:31,417 [Thread-5] INFO  org.apache.hadoop.hdfs.DFSClient - Excluding datanode 164.28.62.198:50010
2012-03-15 10:41:31,417 [Thread-5] WARN  org.apache.hadoop.hdfs.DFSClient -

Try shutting down and restarting HBase.


Re: EOFException

2012-03-15 Thread Mohit Anchlia
This is actually just a Hadoop job over HDFS. I am assuming you also know why
this is erroring out?

On Thu, Mar 15, 2012 at 1:02 PM, Gopal  wrote:

>  On 03/15/2012 03:06 PM, Mohit Anchlia wrote:
>
>> When I start a job to read data from HDFS I start getting these errors.
>> Does anyone know what this means and how to resolve it?
>>
>> [DFSClient log trimmed; see the original message above]
>
> Try shutting down and restarting hbase.
>


Re: EOFException

2012-03-19 Thread madhu phatak
Hi,
 Seems like HDFS is in safemode.
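A quick way to confirm is the stock dfsadmin safe-mode check (standard HDFS command-line usage for the 0.20/1.x line, shown here for reference; leaving safe mode requires superuser privileges on the cluster):

    hadoop dfsadmin -safemode get
    hadoop dfsadmin -safemode leave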

On Fri, Mar 16, 2012 at 1:37 AM, Mohit Anchlia wrote:

> This is actually just hadoop job over HDFS. I am assuming you also know why
> this is erroring out?
>
> [earlier quoted messages and DFSClient log trimmed]



-- 
https://github.com/zinnia-phatak-dev/Nectar


Re: EOFException

2012-03-19 Thread Mohit Anchlia
I guess I am trying to figure out how to debug such problems. I don't see
enough info in the logs.


On Mon, Mar 19, 2012 at 12:48 AM, madhu phatak  wrote:

> Hi,
>  Seems like HDFS is in safemode.
>
> [earlier quoted messages and DFSClient log trimmed]


Re: EOFException

2012-05-01 Thread madhu phatak
Hi,
 In the write method, use writeInt() rather than write(). It should solve
your problem.
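For reference, a corrected write() for the UserTime class quoted below would look like the sketch here. readFields() reads each field back with readInt(), which consumes four bytes per field, while DataOutput.write(int) emits only the low-order byte, so the map-side sort's deserializer runs off the end of the record and throws EOFException:

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeInt(month);
        out.writeInt(day);
        out.writeInt(year);
        out.writeInt(hour);
        out.writeInt(min);
        out.writeInt(sec);
    }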

On Mon, Apr 30, 2012 at 10:40 PM, Keith Thompson wrote:

> I have been running several MapReduce jobs on some input text files. They
> were working fine earlier and then I suddenly started getting EOFException
> every time. Even the jobs that ran fine before (on the exact same input
> files) aren't running now. I am a bit perplexed as to what is causing this
> error. Here is the error:
>
> 12/04/30 12:55:55 INFO mapred.JobClient: Task Id : attempt_201202240659_6328_m_01_1, Status : FAILED
> java.lang.RuntimeException: java.io.EOFException
>     at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:128)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:967)
>     at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:30)
>     at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:83)
>     at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1253)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>     at com.xerox.twitter.bin.UserTime.readFields(UserTime.java:31)
>     at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:122)
>
> Since the compare function seems to be involved, here is my custom key
> class. Note: I did not include year in the key because all keys have the
> same year.
>
> public class UserTime implements WritableComparable<UserTime> {
>
>     int id, month, day, year, hour, min, sec;
>
>     public UserTime() {
>     }
>
>     public UserTime(int u, int mon, int d, int y, int h, int m, int s) {
>         id = u;
>         month = mon;
>         day = d;
>         year = y;
>         hour = h;
>         min = m;
>         sec = s;
>     }
>
>     @Override
>     public void readFields(DataInput in) throws IOException {
>         // TODO Auto-generated method stub
>         id = in.readInt();
>         month = in.readInt();
>         day = in.readInt();
>         year = in.readInt();
>         hour = in.readInt();
>         min = in.readInt();
>         sec = in.readInt();
>     }
>
>     @Override
>     public void write(DataOutput out) throws IOException {
>         // TODO Auto-generated method stub
>         out.write(id);
>         out.write(month);
>         out.write(day);
>         out.write(year);
>         out.write(hour);
>         out.write(min);
>         out.write(sec);
>     }
>
>     @Override
>     public int compareTo(UserTime that) {
>         // TODO Auto-generated method stub
>         if (compareUser(that) == 0)
>             return (compareTime(that));
>         else if (compareUser(that) == 1)
>             return 1;
>         else return -1;
>     }
>
>     private int compareUser(UserTime that) {
>         if (id > that.id)
>             return 1;
>         else if (id == that.id)
>             return 0;
>         else return -1;
>     }
>
>     // assumes all are from the same year
>     private int compareTime(UserTime that) {
>         if (month > that.month ||
>             (month == that.month && day > that.day) ||
>             (month == that.month && day == that.day && hour > that.hour) ||
>             (month == that.month && day == that.day && hour == that.hour && min > that.min) ||
>             (month == that.month && day == that.day && hour == that.hour && min == that.min && sec > that.sec))
>             return 1;
>         else if (month == that.month && day == that.day && hour == that.hour && min == that.min && sec == that.sec)
>             return 0;
>         else return -1;
>     }
>
>     public String toString() {
>         String h, m, s;
>         if (hour < 10)
>             h = "0" + hour;
>         else
>             h = Integer.toString(hour);
>         if (min < 10)
>             m = "0" + min;
>         else
>             m = Integer.toString(hour);
>         if (sec < 10)
>             s = "0" + min;
>         else
>             s = Integer.toString(hour);
>         return (id + "\t" + month + "/" + day + "/" + year + "\t" + h + ":" + m + ":" + s);
>     }
> }
>
> Thanks for any help.
>
> Regards,
> Keith
>



-- 
https://github.com/zinnia-phatak-dev/Nectar


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-25 Thread Marcos Ortiz


Regards, waqas. I think that you have to ask the MapR experts.


On 05/25/2012 05:42 AM, waqas latif wrote:

> Hi Experts,
>
> I am fairly new to hadoop MapR and I was trying to run the matrix
> multiplication example presented by Mr. Norstadt at the following link:
> http://www.norstad.org/matrix-multiply/index.html. I can run it
> successfully with hadoop 0.20.2, but when I try to run it with hadoop 1.0.3
> I get the following error. Is it a problem with my hadoop configuration, or
> is it a compatibility problem in the code, which was written by the author
> for hadoop 0.20? Also, please guide me on how I can fix this error in
> either case. Here is the error I am getting.

The same code that you write for 0.20.2 should work in 1.0.3 too.



in thread "main" java.io.EOFException
 at java.io.DataInputStream.readFully(DataInputStream.java:180)
 at java.io.DataInputStream.readFully(DataInputStream.java:152)
 at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
 at
org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1486)
 at
org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1475)
 at
org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1470)
 at TestMatrixMultiply.fillMatrix(TestMatrixMultiply.java:60)
 at TestMatrixMultiply.readMatrix(TestMatrixMultiply.java:87)
 at TestMatrixMultiply.checkAnswer(TestMatrixMultiply.java:112)
 at TestMatrixMultiply.runOneTest(TestMatrixMultiply.java:150)
 at TestMatrixMultiply.testRandom(TestMatrixMultiply.java:278)
 at TestMatrixMultiply.main(TestMatrixMultiply.java:308)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Thanks in advance

Regards,
waqas

Can you post the complete log for this?
Best wishes

--
Marcos Luis Ortíz Valmaseda
 Data Engineer&&  Sr. System Administrator at UCI
 http://marcosluis2186.posterous.com
 http://www.linkedin.com/in/marcosluis2186
 Twitter: @marcosluis2186




Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-25 Thread Harsh J
Waqas,

Can you ensure this file isn't empty (0 in size)?
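A quick way to check is to list the file on HDFS and look at the reported length (the path below is just a placeholder for the sequence file being read):

    hadoop fs -ls /path/to/the/sequence/file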

On Fri, May 25, 2012 at 3:12 PM, waqas latif  wrote:
> [original question and stack trace trimmed; see the first message in this thread]



-- 
Harsh J


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-25 Thread Kasi Subrahmanyam
Hi,
If you are using a custom Writable object while passing data from the
mapper to the reducer, make sure that readFields and write handle the
same number of fields. It is possible that you wrote data to a file using
a custom Writable but later modified that Writable (for example, by
adding a new attribute) which the old data doesn't have.

That might be the cause, so please check.
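As an illustration of that failure mode, consider the hypothetical Writable sketched below (the class and field names are made up for the example). If some records were written before the extra field existed, reading them back with the current readFields() hits end-of-file on the missing int:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Hypothetical record type: write() and readFields() must serialize the
    // same fields, in the same order, with the same types.
    public class MyRecord implements Writable {
        private int id;
        private int extra; // field added after some data was already written

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(id);
            out.writeInt(extra);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            id = in.readInt();
            extra = in.readInt(); // EOFException here when reading old records
        }
    }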

On Friday, May 25, 2012, waqas latif wrote:

> [original question and stack trace trimmed]


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-27 Thread Prashant Kommireddi
I have seen this issue with large file writes using the SequenceFile writer.
I have not seen the same issue when testing with fairly small files
(< 1 GB).

On Fri, May 25, 2012 at 10:33 PM, Kasi Subrahmanyam
wrote:

> [earlier quoted messages trimmed]


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-27 Thread waqas latif
But the thing is, it works with hadoop 0.20, even with 100x100 (and even
bigger) matrices, but when it comes to hadoop 1.0.3 there is a problem even
with a 3x3 matrix.

On Sun, May 27, 2012 at 12:00 PM, Prashant Kommireddi
wrote:

> [earlier quoted messages trimmed]


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-29 Thread waqas latif
So my question is: do hadoop 0.20 and 1.0.3 differ in their support for
writing or reading SequenceFiles? The same code works fine with hadoop 0.20,
but the problem occurs when I run it under hadoop 1.0.3.

On Sun, May 27, 2012 at 6:15 PM, waqas latif  wrote:

> [earlier quoted messages trimmed]


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-30 Thread waqas latif
I found the problem, but I am unable to solve it. I need to apply a filter
for the _SUCCESS file while using the FileSystem.listStatus method. Can
someone please guide me on how to filter out _SUCCESS files? Thanks

On Tue, May 29, 2012 at 1:42 PM, waqas latif  wrote:

> [earlier quoted messages trimmed]


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-30 Thread Harsh J
When your code does a listStatus, you can pass a PathFilter object
along that can do this filtering for you. See
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html#listStatus(org.apache.hadoop.fs.Path,%20org.apache.hadoop.fs.PathFilter)
for the API javadocs on that.
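A minimal sketch of that approach (the helper and variable names are illustrative; it skips _SUCCESS, _logs, and anything else starting with an underscore):

    import java.io.IOException;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.PathFilter;

    public class OutputListing {
        // List the data files in a job output directory, ignoring entries
        // such as _SUCCESS and _logs that begin with an underscore.
        public static FileStatus[] listDataFiles(FileSystem fs, Path outputDir)
                throws IOException {
            return fs.listStatus(outputDir, new PathFilter() {
                public boolean accept(Path path) {
                    return !path.getName().startsWith("_");
                }
            });
        }
    }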

On Wed, May 30, 2012 at 7:46 PM, waqas latif  wrote:
> [earlier quoted messages trimmed]



-- 
Harsh J


Re: EOFException at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)......

2012-05-30 Thread waqas latif
Thanks Harsh. I got it running.

On Wed, May 30, 2012 at 5:58 PM, Harsh J  wrote:

> [earlier quoted messages trimmed]


Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-03 Thread Meng Mao
Also, which ulimit is the one that matters: the limit for the user who is
running the job, or for the hadoop user that owns the Hadoop processes?

On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao  wrote:

> I've been trying to run a fairly small input file (300MB) on Cloudera
> Hadoop 0.20.1. The job I'm using probably writes to on the order of over
> 1000 part-files at once, across the whole grid. The grid has 33 nodes in it.
> I get the following exception in the run logs:
>
> 10/01/30 17:24:25 INFO mapred.JobClient:  map 100% reduce 12%
> 10/01/30 17:24:25 INFO mapred.JobClient: Task Id : attempt_201001261532_1137_r_13_0, Status : FAILED
> java.io.EOFException
>     at java.io.DataInputStream.readByte(DataInputStream.java:250)
>     at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>     at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>     at org.apache.hadoop.io.Text.readString(Text.java:400)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2869)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)
>
> lots of EOFExceptions
>
> 10/01/30 17:24:25 INFO mapred.JobClient: Task Id : attempt_201001261532_1137_r_19_0, Status : FAILED
> java.io.IOException: Bad connect ack with firstBadLink 10.2.19.1:50010
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2871)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)
>
> 10/01/30 17:24:36 INFO mapred.JobClient:  map 100% reduce 11%
> 10/01/30 17:24:42 INFO mapred.JobClient:  map 100% reduce 12%
> 10/01/30 17:24:49 INFO mapred.JobClient:  map 100% reduce 13%
> 10/01/30 17:24:55 INFO mapred.JobClient:  map 100% reduce 14%
> 10/01/30 17:25:00 INFO mapred.JobClient:  map 100% reduce 15%
>
> From searching around, it seems like the most common cause of BadLink and
> EOFExceptions is when the nodes don't have enough file descriptors set. But
> across all the grid machines, the file-max has been set to 1573039.
> Furthermore, we set ulimit -n to 65536 using hadoop-env.sh.
>
> Where else should I be looking for what's causing this?
>


Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-04 Thread Meng Mao
I wrote a hadoop job that checks for ulimits across the nodes, and every
node is reporting:
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 139264
max locked memory   (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files  (-n) 65536
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 139264
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited


Is anything in there telling about file number limits? From what I
understand, a high open files limit like 65536 should be enough. I estimate
only a couple thousand part-files on HDFS being written to at once, and
around 200 on the filesystem per node.

On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao  wrote:

> also, which is the ulimit that's important, the one for the user who is
> running the job, or the hadoop user that owns the Hadoop processes?
>
> [earlier quoted messages trimmed]


Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-04 Thread Meng Mao
Not sure what else I could be checking to see where the problem lies. Should
I be looking in the datanode logs? I looked briefly in there and didn't see
anything from around the time exceptions started getting reported.
lsof during the job execution? Number of open threads?

I'm at a loss here.

On Thu, Feb 4, 2010 at 2:52 PM, Meng Mao  wrote:

> [earlier quoted messages trimmed]
>


Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-05 Thread Todd Lipcon
Yes, you're likely to see an error in the DN log. Do you see anything
about max number of xceivers?

-Todd
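
If the xceiver ceiling is the problem, the datanode usually logs it explicitly, typically as an IOException complaining that the xceiverCount exceeds the configured limit. A quick way to check, assuming the default log layout (the path is a placeholder and varies by install; both spellings are grepped because the setting itself is historically misspelled):

grep -iE 'xceiver|xciever' /path/to/hadoop/logs/*datanode*.log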

On Thu, Feb 4, 2010 at 11:42 PM, Meng Mao  wrote:
> not sure what else I could be checking to see where the problem lies. Should
> I be looking in the datanode logs? I looked briefly in there and didn't see
> anything from around the time exceptions started getting reported.
> lsof during the job execution? Number of open threads?
>
> I'm at a loss here.
>
> On Thu, Feb 4, 2010 at 2:52 PM, Meng Mao  wrote:
>
>> I wrote a hadoop job that checks for ulimits across the nodes, and every
>> node is reporting:
>> core file size          (blocks, -c) 0
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 139264
>> max locked memory       (kbytes, -l) 32
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 65536
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 10240
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 139264
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>>
>>
>> Is anything in there telling about file number limits? From what I
>> understand, a high open files limit like 65536 should be enough. I estimate
>> only a couple thousand part-files on HDFS being written to at once, and
>> around 200 on the filesystem per node.
>>
>> On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao  wrote:
>>
>>> also, which is the ulimit that's important, the one for the user who is
>>> running the job, or the hadoop user that owns the Hadoop processes?
>>>
>>>
>>> On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao  wrote:
>>>
 I've been trying to run a fairly small input file (300MB) on Cloudera
 Hadoop 0.20.1. The job I'm using probably writes to on the order of over
 1000 part-files at once, across the whole grid. The grid has 33 nodes in it.
 I get the following exception in the run logs:

 10/01/30 17:24:25 INFO mapred.JobClient:  map 100% reduce 12%
 10/01/30 17:24:25 INFO mapred.JobClient: Task Id :
 attempt_201001261532_1137_r_13_0, Status : FAILED
 java.io.EOFException
     at java.io.DataInputStream.readByte(DataInputStream.java:250)
     at
 org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
     at
 org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
     at org.apache.hadoop.io.Text.readString(Text.java:400)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2869)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)

 lots of EOFExceptions

 10/01/30 17:24:25 INFO mapred.JobClient: Task Id :
 attempt_201001261532_1137_r_19_0, Status : FAILED
 java.io.IOException: Bad connect ack with firstBadLink 10.2.19.1:50010
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2871)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
      at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)

 10/01/30 17:24:36 INFO mapred.JobClient:  map 100% reduce 11%
 10/01/30 17:24:42 INFO mapred.JobClient:  map 100% reduce 12%
 10/01/30 17:24:49 INFO mapred.JobClient:  map 100% reduce 13%
 10/01/30 17:24:55 INFO mapred.JobClient:  map 100% reduce 14%
 10/01/30 17:25:00 INFO mapred.JobClient:  map 100% reduce 15%

 From searching around, it seems like the most common cause of BadLink and
 EOFExceptions is when the nodes don't have enough file descriptors set. But
 across all the grid machines, the file-max has been set to 1573039.
 Furthermore, we set ulimit -n to 65536 using hadoop-env.sh.

 Where else should I be looking for what's causing this?

>>>
>>>
>>
>


Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-05 Thread Meng Mao
Ack, after looking at the logs again, there are definitely xceiver errors.
It's set to 256!
I had thought I had cleared this as a possible cause, but I guess I was wrong.
Gonna retest right away.
Thanks!
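
For reference, the limit in question is the datanode-side dfs.datanode.max.xcievers setting (note the historical misspelling in the property name), which defaults to 256 in this generation of Hadoop. A sketch of the usual fix, using 4096 as a commonly cited starting point rather than a tuned number:

<!-- hdfs-site.xml on every datanode; restart the datanodes afterwards -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

Each xceiver is a thread serving one block read or write, so the right value depends on how many blocks a single datanode is asked to serve concurrently, and it interacts with the open-files ulimit discussed earlier in the thread.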

On Fri, Feb 5, 2010 at 11:05 AM, Todd Lipcon  wrote:

> Yes, you're likely to see an error in the DN log. Do you see anything
> about max number of xceivers?
>
> -Todd
>
> On Thu, Feb 4, 2010 at 11:42 PM, Meng Mao  wrote:
> > not sure what else I could be checking to see where the problem lies.
> Should
> > I be looking in the datanode logs? I looked briefly in there and didn't
> see
> > anything from around the time exceptions started getting reported.
> > lsof during the job execution? Number of open threads?
> >
> > I'm at a loss here.
> >
> > On Thu, Feb 4, 2010 at 2:52 PM, Meng Mao  wrote:
> >
> >> I wrote a hadoop job that checks for ulimits across the nodes, and every
> >> node is reporting:
> >> core file size          (blocks, -c) 0
> >> data seg size           (kbytes, -d) unlimited
> >> scheduling priority             (-e) 0
> >> file size               (blocks, -f) unlimited
> >> pending signals                 (-i) 139264
> >> max locked memory       (kbytes, -l) 32
> >> max memory size         (kbytes, -m) unlimited
> >> open files                      (-n) 65536
> >> pipe size            (512 bytes, -p) 8
> >> POSIX message queues     (bytes, -q) 819200
> >> real-time priority              (-r) 0
> >> stack size              (kbytes, -s) 10240
> >> cpu time               (seconds, -t) unlimited
> >> max user processes              (-u) 139264
> >> virtual memory          (kbytes, -v) unlimited
> >> file locks                      (-x) unlimited
> >>
> >>
> >> Is anything in there telling about file number limits? From what I
> >> understand, a high open files limit like 65536 should be enough. I
> estimate
> >> only a couple thousand part-files on HDFS being written to at once, and
> >> around 200 on the filesystem per node.
> >>
> >> On Wed, Feb 3, 2010 at 4:04 PM, Meng Mao  wrote:
> >>
> >>> also, which is the ulimit that's important, the one for the user who is
> >>> running the job, or the hadoop user that owns the Hadoop processes?
> >>>
> >>>
> >>> On Tue, Feb 2, 2010 at 7:29 PM, Meng Mao  wrote:
> >>>
>  I've been trying to run a fairly small input file (300MB) on Cloudera
>  Hadoop 0.20.1. The job I'm using probably writes to on the order of
> over
>  1000 part-files at once, across the whole grid. The grid has 33 nodes
> in it.
>  I get the following exception in the run logs:
> 
>  10/01/30 17:24:25 INFO mapred.JobClient:  map 100% reduce 12%
>  10/01/30 17:24:25 INFO mapred.JobClient: Task Id :
>  attempt_201001261532_1137_r_13_0, Status : FAILED
>  java.io.EOFException
>  at java.io.DataInputStream.readByte(DataInputStream.java:250)
>  at
>  org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>  at
>  org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>  at org.apache.hadoop.io.Text.readString(Text.java:400)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2869)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)
> 
>  lots of EOFExceptions
> 
>  10/01/30 17:24:25 INFO mapred.JobClient: Task Id :
>  attempt_201001261532_1137_r_19_0, Status : FAILED
>  java.io.IOException: Bad connect ack with firstBadLink
> 10.2.19.1:50010
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2871)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
>   at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
>  at
> 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)
> 
>  10/01/30 17:24:36 INFO mapred.JobClient:  map 100% reduce 11%
>  10/01/30 17:24:42 INFO mapred.JobClient:  map 100% reduce 12%
>  10/01/30 17:24:49 INFO mapred.JobClient:  map 100% reduce 13%
>  10/01/30 17:24:55 INFO mapred.JobClient:  map 100% reduce 14%
>  10/01/30 17:25:00 INFO mapred.JobClient:  map 100% reduce 15%
> 
>  From searching around, it seems like the most common cause of BadLink
> and
>  EOFExceptions is when the nodes don't have enough file descriptors
> set. But
>  across all the grid machines, the file-max has been set to 1573039.
>  Furthermore, we set ulimit -n to 65536 using hadoop-env.sh.
> 
>  Where else should I be looking for what's causing this?
> 
> >>>
>