Re: HDFS undo Overwriting

2014-06-02 Thread Amjad ALSHABANI
Thanks, Zesheng,

I should admit that I'm not an expert in Hadoop infrastructure, but I have
heard my colleagues talking about HDFS replicas.
Couldn't those help in retrieving the lost data?

Amjad


On Fri, May 30, 2014 at 1:44 PM, Zesheng Wu  wrote:

> I am afraid this cannot be undone; in HDFS, only data that was deleted through
> the DFS client and went into the trash can be recovered.
>
>
> 2014-05-30 18:18 GMT+08:00 Amjad ALSHABANI :
>
> Hello Everybody,
>>
>> I've made a mistake when writing to HDFS. I created a new database in Hive,
>> giving it a location on HDFS, but I found that it removed all the other data
>> that already existed there.
>>
>> =
>> before creation, the directory on HDFS contains :
>> pns@app11:~$ hadoop fs -ls /user/hive/warehouse
>> Found 25 items
>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
>> */user/hive/warehouse/*dfy_ans_autres
>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
>> /user/hive/warehouse/dfy_ans_maillog
>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 14:28
>> /user/hive/warehouse/dfy_cnx
>> drwxr-xr-x   - user2   supergroup  0 2014-05-30 06:05
>> /user/hive/warehouse/pns.db
>> drwxr-xr-x   - user2  supergroup  0 2014-02-24 17:00
>> /user/hive/warehouse/pns_fr_integ
>> drwxr-xr-x   - user2  supergroup  0 2014-05-06 15:33
>> /user/hive/warehouse/pns_logstat.db
>> ...
>> ...
>> ...
>>
>>
>> hive -e "CREATE DATABASE my_stats LOCATION 'hdfs://:9000
>> */user/hive/warehouse/*mystats.db'"
>>
>> but now I couldn't see the other directories on HDFS:
>>
>> pns@app11:~/aalshabani$ hls /user/hive/warehouse
>> Found 1 items
>> drwxr-xr-x   - user2 supergroup  0 2014-05-30 11:37
>> */user/hive/warehouse*/mystats.db
>>
>>
>> Is there any way I could restore the other directories?
>>
>>
>> Best regards.
>>
>
>
>
> --
> Best Wishes!
>
> Yours, Zesheng
>
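
(As Zesheng says, the trash only helps for deletions that go through the DFS
client with trash already enabled. A minimal sketch of what that normally looks
like, assuming fs.trash.interval had been set in core-site.xml before the
delete; the pns user and the paths below are only illustrative, and none of
this applies here because the warehouse directories were replaced without going
through the shell's trash:)

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>   <!-- keep deleted files for 24 hours -->
</property>

pns@app11:~$ hadoop fs -rm -r /user/hive/warehouse/dfy_cnx
# with trash enabled, the data is moved aside instead of being destroyed:
pns@app11:~$ hadoop fs -ls /user/pns/.Trash/Current/user/hive/warehouse
# and it can be restored with a plain move:
pns@app11:~$ hadoop fs -mv /user/pns/.Trash/Current/user/hive/warehouse/dfy_cnx /user/hive/warehouse/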


Re: HDFS undo Overwriting

2014-06-02 Thread varun kumar
Nope.

Sorry :(


On Mon, Jun 2, 2014 at 1:31 PM, Amjad ALSHABANI 
wrote:

> Thanx Zesheng,
>
> I should admit that I m not an expert in Hadoop infrastructure, but I have
> heard my colleagues talking about HDFS replicas?
> Couldn't that help in retrieving the lost data??
>
> Amjad
>
>
> On Fri, May 30, 2014 at 1:44 PM, Zesheng Wu  wrote:
>
>> I am afraid this cannot undo, in HDFS only the data which is deleted by
>> the dfs client and goes into the trash can be undone.
>>
>>
>> 2014-05-30 18:18 GMT+08:00 Amjad ALSHABANI :
>>
>> Hello Everybody,
>>>
>>> I ve made a mistake when writing to HDFS. I created new database in Hive
>>> giving the location on HDFS but I found that it removed all other data that
>>> exist already.
>>>
>>> =
>>> before creation, the directory on HDFS contains :
>>> pns@app11:~$ hadoop fs -ls /user/hive/warehouse
>>> Found 25 items
>>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
>>> */user/hive/warehouse/*dfy_ans_autres
>>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
>>> /user/hive/warehouse/dfy_ans_maillog
>>> drwxr-xr-x   - user1 supergroup  0 2013-11-20 14:28
>>> /user/hive/warehouse/dfy_cnx
>>> drwxr-xr-x   - user2   supergroup  0 2014-05-30 06:05
>>> /user/hive/warehouse/pns.db
>>> drwxr-xr-x   - user2  supergroup  0 2014-02-24 17:00
>>> /user/hive/warehouse/pns_fr_integ
>>> drwxr-xr-x   - user2  supergroup  0 2014-05-06 15:33
>>> /user/hive/warehouse/pns_logstat.db
>>> ...
>>> ...
>>> ...
>>>
>>>
>>> hive -e "CREATE DATABASE my_stats LOCATION 'hdfs://:9000
>>> */user/hive/warehouse/*mystats.db'"
>>>
>>> but now I couldn't see the other directories on HDFS:
>>>
>>> pns@app11:~/aalshabani$ hls /user/hive/warehouse
>>> Found 1 items
>>> drwxr-xr-x   - user2 supergroup  0 2014-05-30 11:37
>>> */user/hive/warehouse*/mystats.db
>>>
>>>
>>> Is there anyway I could restore the other directories??
>>>
>>>
>>> Best regards.
>>>
>>
>>
>>
>> --
>> Best Wishes!
>>
>> Yours, Zesheng
>>
>
>


-- 
Regards,
Varun Kumar.P


Hadoop usage in uploading & downloading big data

2014-06-02 Thread rahul.soa
Hello All,
I'm a newbie to Hadoop and interested to know whether Hadoop can be useful
for solving the problem I am seeing.

We have large data sets (sometimes between 200 and 600 GB) to store in our
release data management system (a repository server; currently we are using
Synchronicity DesignSync), and it takes roughly *3-7 hours* to upload/check in
this data (and to download it from the repository server).

I would like to know whether applying Hadoop could help reduce this long
turnaround time. The time it takes to upload and download is frustrating the
design engineers and leads to delays in deliveries.

Please note that I'm new to Hadoop and am just checking whether it could be
used in this scenario.

Best Regards,
Rahul


Re: HDFS undo Overwriting

2014-06-02 Thread Amr Shahin
There is no built-in functionality in HDFS to do this. However, there is
software that can restore a hard drive to an earlier state (it works at the OS
level, transparently to HDFS); you might want to try one of those tools. I'm
not sure how the NameNode will react to that, though; at the least you could
get the old raw data back and figure something out from there.


On Mon, Jun 2, 2014 at 1:40 PM, varun kumar  wrote:

> Nope.
>
> Sorry :(
>
>
> On Mon, Jun 2, 2014 at 1:31 PM, Amjad ALSHABANI 
> wrote:
>
>> Thanx Zesheng,
>>
>> I should admit that I m not an expert in Hadoop infrastructure, but I
>> have heard my colleagues talking about HDFS replicas?
>> Couldn't that help in retrieving the lost data??
>>
>> Amjad
>>
>>
>> On Fri, May 30, 2014 at 1:44 PM, Zesheng Wu 
>> wrote:
>>
>>> I am afraid this cannot undo, in HDFS only the data which is deleted by
>>> the dfs client and goes into the trash can be undone.
>>>
>>>
>>> 2014-05-30 18:18 GMT+08:00 Amjad ALSHABANI :
>>>
>>> Hello Everybody,

 I ve made a mistake when writing to HDFS. I created new database in
 Hive giving the location on HDFS but I found that it removed all other data
 that exist already.

 =
 before creation, the directory on HDFS contains :
 pns@app11:~$ hadoop fs -ls /user/hive/warehouse
 Found 25 items
 drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
 */user/hive/warehouse/*dfy_ans_autres
 drwxr-xr-x   - user1 supergroup  0 2013-11-20 13:40
 /user/hive/warehouse/dfy_ans_maillog
 drwxr-xr-x   - user1 supergroup  0 2013-11-20 14:28
 /user/hive/warehouse/dfy_cnx
 drwxr-xr-x   - user2   supergroup  0 2014-05-30 06:05
 /user/hive/warehouse/pns.db
 drwxr-xr-x   - user2  supergroup  0 2014-02-24 17:00
 /user/hive/warehouse/pns_fr_integ
 drwxr-xr-x   - user2  supergroup  0 2014-05-06 15:33
 /user/hive/warehouse/pns_logstat.db
 ...
 ...
 ...


 hive -e "CREATE DATABASE my_stats LOCATION 'hdfs://:9000
 */user/hive/warehouse/*mystats.db'"

 but now I couldn't see the other directories on HDFS:

 pns@app11:~/aalshabani$ hls /user/hive/warehouse
 Found 1 items
 drwxr-xr-x   - user2 supergroup  0 2014-05-30 11:37
 */user/hive/warehouse*/mystats.db


 Is there anyway I could restore the other directories??


 Best regards.

>>>
>>>
>>>
>>> --
>>> Best Wishes!
>>>
>>> Yours, Zesheng
>>>
>>
>>
>
>
> --
> Regards,
> Varun Kumar.P
>


Problems Starting NameNode on hadoop-2.2.0

2014-06-02 Thread ishan patwa
Hi,
I recently installed hadoop-2.2.0 on my machine running Linux
livingstream 3.2.0-29.

However, I am unable to start the NameNode using

*hadoop-daemon.sh start namenode*
In the log files I can see the following errors:
+++
2014-05-31 14:03:12,844 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode:
java.lang.IllegalArgumentException: Does not contain a valid host:port
authority: file:///
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:280)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)

2014-05-31 14:03:12,845 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down NameNode at livingstream/127.0.1.1
/



I googled around a little bit, and people mentioned that it might be
because I haven't set fs.default.name in my core-site.xml.
I checked my core-site.xml, and it looks fine:

+++
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
+=

Do you guys have any suggestions as to what else might cause this?

Regards,
Ishan


-- 
Ishan Patwa
Software Developer
Zynga Game Network Pvt Ltd.
Bangalore


Rebuild Hadoop 1.2.1 Ubuntu

2014-06-02 Thread giorgiozullino
I need to modify Mapper.java and rebuild the Hadoop source code. How can I
do that?
Thanks

Hadoop Job Tracker User Interface not accessible

2014-06-02 Thread Ashish Dobhal
Hey,
Could anyone please help me? I am not able to access my JobTracker
interface on the defined port:
http://localhost:50030
Please help.
Thank you


Re: Hadoop Job Tracker User Interface not accessible

2014-06-02 Thread Mohammad Tariq
Hi Ashish,

Could you please make sure that the JT daemon is running fine? Use jps to
verify that. Also, make sure that you haven't changed the web UI port. If the
problem still persists, please show us your logs.

*Warm regards,*
*Mohammad Tariq*
*cloudfront.blogspot.com *
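
(A short sketch of those checks, assuming an MRv1 cluster with the default
ports; the host is only illustrative:)

$ jps
# JobTracker should appear in the list alongside NameNode/TaskTracker etc.
# The web UI address is controlled by mapred.job.tracker.http.address in
# mapred-site.xml (default 0.0.0.0:50030); if the daemon is up, the page
# should answer:
$ curl -I http://localhost:50030/jobtracker.jsp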


On Fri, May 30, 2014 at 3:48 PM, Ashish Dobhal 
wrote:

> Hey,
> Could anyone please help me I am not able to access by Job Tracker
> interface on the defined port.
> http://localhost:50030
> Please help.
>  Thank you
>
>
>


Re: Rebuild Hadoop 1.2.1 Ubuntu

2014-06-02 Thread karthikeyan S
The following commands, executed from the root of the source directory,
compile and build the package.

$ mvn compile
$ mvn package -Pdist -DskipTests -Dtar
On Jun 2, 2014 11:41 AM,  wrote:

> I must modify Mapper.java and rebuild the hadoop source code. How is it
> possible?
> Thanks
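
(Note: the mvn commands above apply to the Maven-based 2.x/trunk source tree.
The 1.2.1 release tarball still builds with Ant; a rough sketch, assuming the
stock build.xml that ships in hadoop-1.2.1:)

$ cd hadoop-1.2.1            # root of the unpacked source
$ ant jar                    # rebuild hadoop-core with the modified Mapper.java
$ ant binary                 # optionally rebuild a full binary distribution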


Building Mahout Issue

2014-06-02 Thread Botelho, Andrew
I am trying to build Mahout version 0.9 and make it compatible with Hadoop 
2.4.0.
I unpacked mahout-distribution-0.9-src.tar.gz and then ran the following 
command:

mvn -Phadoop-0.23 clean install -Dhadoop.version=2.4.0 -DskipTests

Then I get the following error:

[ERROR] Failed to execute goal on project mahout-integration: Could not resolve 
dependencies for project org.apache.mahout:mahout-integration:jar:0.9: Could 
not find artifact org.apache.hadoop:hadoop-core:jar:2.4.0 in central 
(http://repo.maven.apache.org/maven2) -> [Help 1]

Any ideas what is causing this problem and how to fix it?  Any advice would be 
much appreciated.

Thanks,

Andrew Botelho
Intern
EMC Corporation
Education Services
Email: andrew.bote...@emc.com
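
(For reference: org.apache.hadoop:hadoop-core is a Hadoop 1.x artifact; the 2.x
releases publish hadoop-common, hadoop-hdfs and the hadoop-mapreduce-client-*
jars instead, so no hadoop-core 2.4.0 exists in Maven Central. A quick way to
see which Mahout module still pulls in hadoop-core, as a sketch assuming Maven 3
on the unpacked 0.9 source:)

$ mvn dependency:tree -Dincludes=org.apache.hadoop:hadoop-core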


Re: Problems Starting NameNode on hadoop-2.2.0

2014-06-02 Thread Rajat Jain
Have you tried setting fs.defaultFS with the same value?
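
(A minimal core-site.xml along those lines; fs.default.name is the deprecated
1.x key, and on 2.2.0 the preferred key is fs.defaultFS, though the old name
should still be honoured:)

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>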


On Sat, May 31, 2014 at 11:22 AM, ishan patwa  wrote:

> Hi,
> I recenetly isntalled hadoop-2.2.0 on my machine running :Linux
> livingstream 3.2.0-29
>
> However I am unable to start the namenode using
>
> *hadoop-daemon.sh start namenode *
> In the log files I can see the following errors :
> +++
> 2014-05-31 14:03:12,844 ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode:
> java.lang.IllegalArgumentException: Does not contain a valid host:port
> authority: file:///
> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:280)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
>
> 2014-05-31 14:03:12,845 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down NameNode at livingstream/127.0.1.1
> /
>
> 
>
> I googled around a little bit , and people mentioned that it might be
> beause I havent gives fs,default.name in my core-site-xml.
> I checked my core-site.xml , it looks fine
>
> +++
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://localhost:9000</value>
>   </property>
> </configuration>
> +=
>
> Do you guys have any suggesntions as to what else might cause this???
>
> Regards,
> Ishan
>
>
> --
> Ishan Patwa
> Software Developer
> Zynga Game Network Pvt Ltd.
> Bangalore
>


RE: change yarn application priority

2014-06-02 Thread Henry Hung
@Rohith Sharma

Thank you for the confirmation.
I already googled for information about schedulers in Hadoop 2.2.0; there are
two ways to do it: the FairScheduler or the CapacityScheduler.
From what I found, more articles seem to say that the FairScheduler is better
than the CapacityScheduler.
So I intend to try the FairScheduler first. If you have any more suggestions,
please let me know; thank you again.

Best regards,
Henry
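
(A minimal FairScheduler setup along those lines, as a sketch; the property
names are the ones documented for 2.2.0, while the queue names, weights and
file path are only illustrative:)

yarn-site.xml:
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>

fair-scheduler.xml:
<allocations>
  <queue name="long">
    <weight>1.0</weight>
  </queue>
  <queue name="fast">
    <weight>3.0</weight>
    <minResources>2048 mb, 2 vcores</minResources>
  </queue>
</allocations>

A small job submitted with -Dmapreduce.job.queuename=fast should then get
containers without waiting for the 4-hour job in the other queue to finish.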

From: Rohith Sharma K S [mailto:rohithsharm...@huawei.com]
Sent: Friday, May 30, 2014 5:49 PM
To: user@hadoop.apache.org
Subject: RE: change yarn application priority

Hi

   Currently there is no provision for changing application priority within the 
same queue.  Follow the Jira https://issues.apache.org/jira/i#browse/YARN-1963 
for this new feature.

One way you can achieve this is by enabling the scheduler monitor for the
CapacityScheduler.
The steps to follow are:

1.   Configure 2 queues; follow
http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

2.   Enable the scheduler monitor:

yarn.resourcemanager.scheduler.monitor.enable = true

Submit the job that runs for 2 hours to queue 1, and submit the other job to queue 2.

Hope this will help you.

Thanks & Regards
Rohith Sharma K S
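
(A minimal two-queue capacity-scheduler.xml along the lines of step 1, as a
sketch; the queue names and percentages are only illustrative:)

<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>long,fast</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.long.capacity</name>
  <value>70</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.fast.capacity</name>
  <value>30</value>
</property>

Together with yarn.resourcemanager.scheduler.monitor.enable = true in
yarn-site.xml (step 2), the short job submitted to the "fast" queue can obtain
resources while the long job keeps running in "long".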


This e-mail and its attachments contain confidential information from HUAWEI, 
which
is intended only for the person or entity whose address is listed above. Any 
use of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender by
phone or email immediately and delete it!

From: Henry Hung [mailto:ythu...@winbond.com]
Sent: 30 May 2014 11:53
To: user@hadoop.apache.org
Subject: change yarn application priority

Hi All,

I have an application that consumes all of the NodeManager capacity (30 mappers
and 1 reducer) and will need 4 hours to finish.
Let's say I need to run another application that finishes much faster (30
minutes) and only needs 1 mapper and 1 reducer.
If I just execute the new application, it will sit in the queue waiting for the
1st application to finish.
Is there a way to give the 2nd application a higher priority than the 1st so
that the ResourceManager executes it immediately?

I'm using Hadoop-2.2.0.

Best regards,
Henry


The privileged confidential information contained in this email is intended for 
use only by the addressees as indicated by the original sender of this email. 
If you are not the addressee indicated in this email or are not responsible for 
delivery of the email to such a person, please kindly reply to the sender 
indicating this fact and delete all copies of it from your computer and network 
server immediately. Your cooperation is highly appreciated. It is advised that 
any unauthorized use of confidential information of Winbond is strictly 
prohibited; and any information in this email irrelevant to the official 
business of Winbond shall be deemed as neither given nor endorsed by Winbond.




Re: Problems Starting NameNode on hadoop-2.2.0

2014-06-02 Thread Stanley Shi
Another possible reason is that you are not using the correct configuration file.

Regards,
*Stanley Shi,*
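
(A quick way to check which configuration the daemon actually picks up, as a
sketch; the paths are only illustrative:)

$ echo $HADOOP_CONF_DIR
$ hdfs getconf -confKey fs.defaultFS      # what the client-side config resolves to
$ hadoop-daemon.sh --config /opt/hadoop-2.2.0/etc/hadoop start namenode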



On Tue, Jun 3, 2014 at 6:53 AM, Rajat Jain  wrote:

> Have you tried setting fs.defaultFS with the same value?
>
>
> On Sat, May 31, 2014 at 11:22 AM, ishan patwa 
> wrote:
>
>> Hi,
>> I recenetly isntalled hadoop-2.2.0 on my machine running :Linux
>> livingstream 3.2.0-29
>>
>> However I am unable to start the namenode using
>>
>> *hadoop-daemon.sh start namenode *
>> In the log files I can see the following errors :
>> +++
>> 2014-05-31 14:03:12,844 ERROR
>> org.apache.hadoop.hdfs.server.namenode.NameNode:
>> java.lang.IllegalArgumentException: Does not contain a valid host:port
>> authority: file:///
>> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:212)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:244)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:280)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
>> at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
>>
>> 2014-05-31 14:03:12,845 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>> /
>> SHUTDOWN_MSG: Shutting down NameNode at livingstream/127.0.1.1
>> /
>>
>> 
>>
>> I googled around a little bit , and people mentioned that it might be
>> beause I havent gives fs,default.name in my core-site-xml.
>> I checked my core-site.xml , it looks fine
>>
>> +++
>> <configuration>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>hdfs://localhost:9000</value>
>>   </property>
>> </configuration>
>> +=
>>
>> Do you guys have any suggesntions as to what else might cause this???
>>
>> Regards,
>> Ishan
>>
>>
>> --
>> Ishan Patwa
>> Software Developer
>> Zynga Game Network Pvt Ltd.
>> Bangalore
>>
>
>