Max. Possible No. of Files

2009-06-05 Thread Wasim Bari
Hi,
 Does anyone have data on the maximum possible number of files in HDFS?

My second question: I created up to one lakh (100,000) small files with a 
small block size and read them back from HDFS. Read performance remained 
almost unaffected as the number of files increased.

The possible reasons I can think of are:

1. One lakh isn't a big enough number to disturb HDFS performance (I used 
1 namenode and 4 datanodes).

2. Reads go directly to the datanodes after an initial interaction with the 
namenode, so reading from different nodes doesn't affect performance.


If someone could confirm or correct this, it would be highly appreciated.

Cheers,
Wasim
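
A rough back-of-the-envelope sketch for reason 1 above. It assumes the 
commonly quoted rule of thumb that each file and each block costs the 
namenode roughly 150 bytes of heap; that figure is an assumption, not a 
measurement from this cluster, and directories are ignored. The class and 
method names are hypothetical.

```java
final class NameNodeHeapEstimate {
    // Assumption: rule-of-thumb heap cost per namespace object (file or block).
    static final long BYTES_PER_OBJECT = 150L;

    // Estimated namenode heap for `files` files of `blocksPerFile` blocks each.
    static long estimateBytes(long files, long blocksPerFile) {
        long objects = files + files * blocksPerFile; // one object per file, one per block
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long oneLakh = 100_000L;
        long bytes = estimateBytes(oneLakh, 1L); // small files: one block each
        System.out.println(bytes / 1_000_000.0 + " MB");
    }
}
```

Under that assumption, one lakh single-block files come to about 30 MB of 
namenode heap, which would be consistent with the count being too small to 
affect read performance.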

HDFS data for HBase and other projects

2009-05-26 Thread Wasim Bari
Hi,
 If we already have data stored in HDFS, which of the following sub-projects 
can use this data for further processing/operations?

  1. Pig 
  2. HBase 
  3. ZooKeeper 
  4. Hive 
  5. Any other Hadoop-related project

Thanks,

Wasim 


Append in Hadoop

2009-05-14 Thread Wasim Bari
Hi,
 Can someone tell me about the append functionality in Hadoop? Is it 
available now in 0.20?

Regards,

Wasim

Some Storage communication related questions

2009-02-15 Thread Wasim Bari
Hi,
 I have multiple questions:

Does Hadoop use some parallel technique for CopyFromLocal and CopyToLocal 
(like DistCp), or is it a simple single-stream write?

For communication between Amazon S3 and the local system, does Hadoop use 
the REST service interface or SOAP?

Are there any new storage systems currently in the pipeline to be interfaced 
with Hadoop?


Thanks,

Wasim

File Transfer Rates

2009-02-10 Thread Wasim Bari
Hi,
Could someone help me find some real figures (transfer rates) for Hadoop 
file transfers from the local filesystem to HDFS, S3, etc., and between 
storage systems (HDFS to S3, etc.)?

Thanks,

Wasim 

Hadoop-KFS-FileSystem API

2009-02-03 Thread Wasim Bari
Hi,
I would like to use KFS as storage through the Hadoop FileSystem API.

http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/fs/kfs/package-summary.html

This page describes KFS usage with Hadoop and lists starting the map/reduce 
trackers as the last step.

Is it necessary to start them? 

How can I use only the storage through the FileSystem API?

Thanks

Wasim

Hadoop copy on same cluster

2008-12-01 Thread Wasim Bari
Hi, 
Is there any API that copies files from one folder to another on the same 
Hadoop cluster? (DistCp can be used, but its performance is not good enough.) 
Something like CopyFromLocal, but with source and destination both on the 
same Hadoop cluster.


Cheers,
Wasim



Conf Object without hadoop-default.xml and hadoop-site.xml

2008-11-20 Thread Wasim Bari
Hi,
 Is it possible to create a Configuration object without the 
hadoop-default.xml and hadoop-site.xml files, and set the values on the 
Configuration object after creation?

If yes, which values do I need to set on the Configuration object to get a 
FileSystem object?

Thanks,

Wasim

FileSystem.append and FSDataOutputStream.seek

2008-11-18 Thread Wasim Bari
Hello,
  Does anyone know when the Hadoop team plans to implement 
FileSystem.append(Path) and seek capability for FSDataOutputStream?

On which forum can we ask for features to be included?

Thanks,

Wasim

Anything like RandomAccessFile in Hadoop FS ?

2008-11-13 Thread Wasim Bari
Hi,
 Is there any utility for Hadoop files that works the same way as 
RandomAccessFile in Java?
Thanks,

Wasim
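
For reference, a minimal sketch of the java.io.RandomAccessFile behaviour the 
question refers to: jump to a byte offset, then read from there. It uses only 
the Java standard library against a local temporary file and does not touch 
Hadoop; the class and method names are hypothetical. On the read side, 
Hadoop's FSDataInputStream also offers seek(long), which this sketch does not 
exercise.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class RandomAccessDemo {
    // Read `length` bytes starting at byte `offset` of a local file.
    static String readAt(Path file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);          // move the file pointer
            byte[] buf = new byte[length];
            raf.readFully(buf);        // throws EOFException if the file is too short
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("raf-demo", ".txt");
        try {
            Files.write(tmp, "hello world".getBytes(StandardCharsets.US_ASCII));
            System.out.println(readAt(tmp, 6, 5)); // bytes 6..10 of the file
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```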


DistCp 0.18 Vs DistCp 0.17

2008-11-11 Thread Wasim Bari
Hi,
The package for DistCp in 0.18 is org.apache.hadoop.tools. Is it the same 
in 0.17, or a different one? 
Is there any difference between these two versions of DistCp?
Thanks,

Wasim


DistCp and CopyFiles

2008-11-11 Thread Wasim Bari
Hi,
 In 0.18, CopyFiles.java (from 0.17) has been replaced by DistCp.java. Is 
there any difference between them?

Thanks,

Wasim

HDFS from non-hadoop Program

2008-11-07 Thread Wasim Bari
Hello, 
 I am trying to access HDFS from a non-Hadoop program using Java. 
When I try to load the configuration, it shows the following exception in 
both DEBUG mode and normal mode:

org.apache.hadoop.conf.Configuration: java.io.IOException: config()
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:156)

With the same configuration files, a single standalone program runs 
perfectly fine. 
Other people have posted the same issue before, but no solution was posted. 
Has anyone found a solution?

Thanks

wasim


HDFS Login Security

2008-11-04 Thread Wasim Bari
Hi,
 Is there any Java class for logging in to HDFS programmatically, like a 
traditional username/password mechanism? Or can we only use the system user 
or the user who started the NameNode?

Thanks,

Wasim

Data Transfer mechanism between different clusters

2008-09-20 Thread Wasim Bari


Hello All,
  What kind of support does Hadoop provide for data transfer between 
clusters residing in different geographical locations (possibly over a WAN)? 
Is there any fast and efficient method available (like GridFTP in Globus)?


Thanks,

Wasim 



HDFS Vs KFS

2008-08-21 Thread Wasim Bari
Hi,
 Can some expert compare HDFS with KFS? They appear to have a similar 
architecture with small differences and the same objective.

Thanks,

Wasim

Re: HDFS Vs KFS

2008-08-21 Thread Wasim Bari


KFS is another distributed file system, implemented in C++. You can find 
details here:


http://kosmosfs.sourceforge.net/


--
From: rae l [EMAIL PROTECTED]
Sent: Thursday, August 21, 2008 4:52 PM
To: core-user@hadoop.apache.org
Subject: Re: HDFS Vs KFS


On Thu, Aug 21, 2008 at 9:44 PM, Wasim Bari [EMAIL PROTECTED] wrote:

Hi,
Can some expert differentiate or compare HDFS with KFS ? Apparently 
it looks like similar architecture with little difference and same 
objective.

What's KFS? Which KFS?

Everyone here knows HDFS, but some, like me, don't know KFS. Please
specify which KFS you mean, in detail.



Hadoop on Suse

2008-08-21 Thread Wasim Bari
Hi,
Does anyone have experience installing Hadoop or HDFS on SUSE Linux?

Thanks

Hadoop DFS

2008-07-24 Thread Wasim Bari
Hi,
I am new to Hadoop. Right now, I am only interested in working with Hadoop 
DFS. Can someone guide me on where to start? Does anyone have information 
about applications that have already integrated Hadoop DFS?

Any pointers to material about Hadoop DFS (case studies, articles, books, 
etc.) would be very welcome.

Thanks,

Wasim