Re: hadoop file system error

2008-06-26 Thread brainstorm
Lohit, thanks for your quick reply. No need for those logs; I found
the error:

I had run hadoop namenode -format ***after*** start-all.sh. I did the
following to fix it:

1) rm -rf'd hdfs directory
2) hadoop namenode -format
3) start-all.sh
4) put, etc... everything ok :)

That's why the web DFS browser at:

http://localhost:50070/nn_browsedfscontent.jsp

failed to browse the filesystem. I hope my experience is useful for
others (Hadoop could handle this situation more gracefully, with an
explicit exception, for instance).

Thanks again!
Roman



Re: hadoop file system error

2008-06-26 Thread lohit
Hi Roman,

Which version of Hadoop are you running? And do you see any errors or
stack trace dumps in the log files?
Can you check $HADOOP_LOG_DIR/*-datanode-*.log and
$HADOOP_LOG_DIR/*-namenode-*.log?

Can you also make sure you have the NameNode and DataNode running?

Thanks,
Lohit




Re: hadoop file system error

2008-06-26 Thread brainstorm
I'm having a similar problem, but with the hadoop CLI tool (not
programmatically), and it's driving me nuts:

[EMAIL PROTECTED]:~/nutch/trunk$ cat urls/urls.txt
http://escert.upc.edu/

[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -ls
Found 0 items
[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -put urls urls

[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -ls
Found 1 items
/user/hadoop/urls                    2008-06-26 17:20  rwxr-xr-x  hadoop  supergroup
[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -ls urls
Found 1 items
/user/hadoop/urls/urls.txt  0        2008-06-26 17:20  rw-r--r--  hadoop  supergroup

[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -cat urls/urls.txt
[EMAIL PROTECTED]:~/nutch/trunk$ bin/hadoop dfs -get urls/urls.txt .
[EMAIL PROTECTED]:~/nutch/trunk$ cat urls.txt
[EMAIL PROTECTED]:~/nutch/trunk$

As you can see, I put a txt file containing one line onto HDFS from
local, but afterwards the file is empty... am I missing a "close",
"flush" or "commit" command?

Thanks in advance,
Roman



Re: hadoop file system error

2008-06-19 Thread Mori Bellamy
Might it be a synchronization problem? I don't know if Hadoop's DFS
magically takes care of that, but if it doesn't, then you might have a
problem because of multiple processes trying to write to the same file.


Perhaps as a control experiment you could run your process on some
small input, making sure that each reduce task outputs to a different
filename (I just use Math.random()*Integer.MAX_VALUE and cross my
fingers).
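
A tiny sketch of what I mean (the output directory name here is just a
placeholder; deriving the suffix from the task attempt id instead of
Math.random() would avoid collisions more reliably):

  import org.apache.hadoop.fs.Path;

  public class OutputNames {
    // Build a per-reduce-task output path with a random suffix, as
    // described above; random values can (rarely) collide, so a suffix
    // taken from the task attempt id is the safer variant.
    public static Path uniqueOutputPath(String baseDir) {
      long suffix = (long) (Math.random() * Integer.MAX_VALUE);
      return new Path(baseDir, "part-" + suffix);
    }
  }

Each reduce task would call this once (e.g. in configure()) and write
everything for that task under the returned path.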



Re: hadoop file system error

2008-06-18 Thread 晋光峰
I'm sure I close all the files in the reduce step. Could any other
reason cause this problem?



-- 
Guangfeng Jin

Software Engineer

iZENEsoft (Shanghai) Co., Ltd
Room 601 Marine Tower, No. 1 Pudong Ave.
Tel:86-21-68860698
Fax:86-21-68860699
Mobile: 86-13621906422
Company Website:www.izenesoft.com


Re: hadoop file system error

2008-06-18 Thread Konstantin Shvachko

Did you close those files?
If not, they may be empty.
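
Something along these lines, for example (a rough sketch against the
old mapred API; the rule and path names are made up), where every
stream opened during the reduce gets closed in close():

  import java.io.IOException;
  import java.util.HashMap;
  import java.util.Iterator;
  import java.util.Map;

  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  public class SplitFileReducer extends MapReduceBase
      implements Reducer<Text, Text, Text, Text> {

    private FileSystem fs;
    private final Map<String, FSDataOutputStream> streams =
        new HashMap<String, FSDataOutputStream>();

    public void configure(JobConf job) {
      try {
        fs = FileSystem.get(job);
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }

    // One output file per "rule"; with more than one reduce task the
    // path should also contain something task-unique.
    private FSDataOutputStream streamFor(String rule) throws IOException {
      FSDataOutputStream out = streams.get(rule);
      if (out == null) {
        out = fs.create(new Path("split-output/" + rule));
        streams.put(rule, out);
      }
      return out;
    }

    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
      while (values.hasNext()) {
        // Made-up splitting rule: route keys to one of two files.
        String rule = key.toString().startsWith("a") ? "a-keys" : "other-keys";
        streamFor(rule).writeBytes(key + "\t" + values.next() + "\n");
      }
    }

    // If the streams are not closed here, buffered data never reaches
    // HDFS and the files can end up empty after the job finishes.
    public void close() throws IOException {
      for (FSDataOutputStream out : streams.values()) {
        out.close();
      }
    }
  }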





hadoop file system error

2008-06-18 Thread 晋光峰
Dears,

I use hadoop-0.16.4 to do some work and found an error whose cause I
can't figure out.

The scenario is like this: in the reduce step, instead of using
OutputCollector to write the results, I use FSDataOutputStream to write
them to files on HDFS (because I want to split the results by some
rules). After the job finished, I found that *some* of the files (but
not all) are empty on HDFS. But I'm sure the files are not empty during
the reduce step, since I added some logging to read the generated
files. It seems that some files' contents are lost after the reduce
step. Has anyone happened to face such errors, or is it a hadoop bug?

Please help me find the reason if any of you know it.

Thanks & Regards
Guangfeng

-- 
Guangfeng Jin

Software Engineer

iZENEsoft (Shanghai) Co., Ltd