HBase Table Pool

2011-09-13 Thread jagaran das
Hi, Has anybody used HBase Table Pool to connect to and load data into an HBase table? Regards, JD

Re: Hadoop integration with SAS

2011-08-23 Thread jagaran das
R has a connector for Hadoop if it helps.. From: "jonathan.hw...@accenture.com" To: common-user@hadoop.apache.org Sent: Tuesday, 23 August 2011 2:21 PM Subject: Hadoop integration with SAS Anyone had worked on Hadoop data integration with SAS? Does SAS have a c

Python map reduce problem

2011-08-23 Thread jagaran das
Hi, I am a newbie in Python. I was looking into the Python example of running a MapReduce job from Michael Noll's article. I was trying to run this example in CDH3. The map tasks are running in a loop and the reducer is not running. It is showing Map 50%, Map 100%, Map 50%, Map 100%. Map tasks
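For readers unfamiliar with the pattern being discussed, a Hadoop Streaming word count in Python usually has the shape below. This is a minimal sketch, not the code from the article: the function names are illustrative, and in a real streaming job the mapper and reducer would be separate scripts reading `sys.stdin` and printing to stdout.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts per word.

    The input must already be sorted by key, which is what the
    shuffle phase provides between map and reduce.
    """
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Simulate map -> sort -> reduce on one machine (hypothetical helper)."""
    return list(reduce_lines(sorted(map_lines(lines))))
```

Running the two stages locally with a sort in between, as `run_locally` does, is a cheap way to rule out script errors before looking at cluster-side causes for repeated map attempts.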

MR job to copy to hadoop

2011-08-13 Thread jagaran das
Hi, What is the best and fastest way to achieve a parallel copy to Hadoop from an NFS mount? We have a mount with a huge number of files and we need to copy it into HDFS. Some options: 1. Run copyFromLocal in a multithreaded way 2. Use distcp in an isolated way. 3. Can I write a map-only job to do cop
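Option 1 above could look roughly like the following. It is a sketch under assumptions: `parallel_copy`, `build_copy_command`, the `workers` default, and the injectable `runner` are all hypothetical names chosen for illustration; only the `hadoop fs -copyFromLocal <src> <dst>` command itself comes from the Hadoop CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_copy_command(local_path, hdfs_dir):
    """Build one `hadoop fs -copyFromLocal` invocation as an argv list."""
    return ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_dir]


def parallel_copy(local_paths, hdfs_dir, workers=8, runner=subprocess.call):
    """Run one copy command per file on a thread pool.

    `runner` takes an argv list and returns an exit code; it is a
    parameter so the function can be exercised without a cluster.
    Returns the local paths whose copy exited with a non-zero code.
    """
    local_paths = list(local_paths)
    commands = [build_copy_command(path, hdfs_dir) for path in local_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        exit_codes = list(pool.map(runner, commands))
    return [path for path, code in zip(local_paths, exit_codes) if code != 0]
```

Each command starts its own client JVM, so with a very large number of small files the per-file startup cost may dominate; batching several files per command is one possible variation.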

Re: Namenode Scalability

2011-08-10 Thread jagaran das
you have a fast car, you can race and win against a slow train, it all depends from what reference frame you are in :) Regards, Jagaran  From: Michel Segel To: "common-user@hadoop.apache.org" Cc: "common-user@hadoop.apache.org" ; jagaran das

Re: Namenode Scalability

2011-08-10 Thread jagaran das
To be precise, the projected data is around 1 PB. But the publishing rate is also around 1GBPS. Please suggest. From: jagaran das To: "common-user@hadoop.apache.org" Sent: Wednesday, 10 August 2011 12:58 AM Subject: Namenode Scalability In my curre

Namenode Scalability

2011-08-10 Thread jagaran das
cycle kicks in. 2. Can we have multiple federated Name nodes  sharing the same slaves and then we can distribute the writes accordingly. 3. Can multiple region servers of HBase help us ?? Please suggest how we can design the streaming part to handle such scale of data.  Regards, Jagaran Das 

NameNode Profiling Tools

2011-08-06 Thread jagaran das
Hi, Please suggest what would be the best way to profile the NameNode. Are there any specific tools? We would be streaming transaction data using around 2000 threads concurrently to the NameNode continuously. Size is around 300 KB/transaction. I am using DataInputStream and writing continuously for through each 2000

Help on DFSClient

2011-08-06 Thread jagaran das
I am keeping a Stream Open and writing through it using a multithreaded application. The application is in a different box and I am connecting to NN remotely. I was using FileSystem and getting same error and now I am trying DFSClient and getting the same error. When I am running it via simple

Re: java.io.IOException: config()

2011-08-06 Thread jagaran das
I am accessing through threads in parallel. What is the concept of Lease in HDFS?? Regards, JD From: Harsh J To: jagaran das Sent: Friday, 5 August 2011 11:37 PM Subject: Re: java.io.IOException: config() How long are you keeping it open for? On 06-Aug

Re: java.io.IOException: config() IMP

2011-08-05 Thread jagaran das
ner-1] (RPC.java:230) - Call: complete 3 Please help as it is a production enhancement for us. Regards Jagaran  From: Harsh J To: u...@pig.apache.org; jagaran das Sent: Friday, 5 August 2011 8:54 PM Subject: Re: java.io.IOException: config() Could you explain ho

java.io.IOException: config()

2011-08-05 Thread jagaran das
Hi, I have been stuck with this exception: java.io.IOException: config() at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211) at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198) at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:99) at test.Test

Max Number of Open Connections

2011-08-01 Thread jagaran das
Hi, What is the max number of open connections to a namenode? I am using  FSDataOutputStream out = dfs.create(src); Cheers, JD 

DFSClient Protocol and FileSystem class

2011-07-31 Thread jagaran das
What is the difference between the DFSClient protocol and the FileSystem class in Hadoop DFS (HDFS)? Both of these classes are used for connecting a remote client to the namenode in HDFS. So, I wanted to know the advantages of one over the other and which one is suitable for remote-client connection

Hadoop Production Issue

2011-07-15 Thread jagaran das
Hi, Due to requirements in our current production CDH3 cluster, we need to copy around 11520 small files (total size 12 GB) to the cluster for one application. We have 20 such applications that would run in parallel. So one set would have 11520 files of total size 12 GB. Like this we wou
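The figures quoted in this post can be put in perspective with a little arithmetic. The sketch below assumes every one of the 20 applications has the same file count and volume as the set described, and uses 1 GB = 1024 MB; the variable names are illustrative.

```python
FILES_PER_APP = 11520   # from the post
GB_PER_APP = 12         # from the post
APPS = 20               # from the post

# Average size of a single file, in MB.
avg_file_mb = GB_PER_APP * 1024 / FILES_PER_APP

# Totals if all applications load an identical set (assumption).
total_files = FILES_PER_APP * APPS
total_gb = GB_PER_APP * APPS

print(f"average file size: {avg_file_mb:.2f} MB")
print(f"total files: {total_files}, total volume: {total_gb} GB")
```

An average of roughly 1 MB per file is far below the 64 MB block size that was the default in Hadoop releases of this period, which is why the workload falls under the small-files problem raised elsewhere in these threads.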

Re: Any reason Hadoop logs cant be directed to a separate filesystem?

2011-06-25 Thread jagaran das
Yeah, that's what we do. But it's again an extra process; if Hadoop had this ability, it would be great. It uses log4j. I tried to tweak it, but it is throwing an error. Regards, Jagaran From: Michael Segel To: common-user@hadoop.apache.org Sent: Sat, 25 June, 2

Re: Any reason Hadoop logs cant be directed to a separate filesystem?

2011-06-22 Thread jagaran das
Hi, Can I limit the log file retention period? I want to keep files for the last 15 days only. Regards, Jagaran From: Jack Craig To: "common-user@hadoop.apache.org" Sent: Wed, 22 June, 2011 2:00:23 PM Subject: Re: Any reason Hadoop logs cant be directed to a separate f

Re: Automatic Configuration of Hadoop Clusters

2011-06-22 Thread jagaran das
Puppetize From: gokul To: common-user@hadoop.apache.org Sent: Wed, 22 June, 2011 8:38:13 AM Subject: Automatic Configuration of Hadoop Clusters Dear all, for benchmarking purposes we would like to adjust configurations as well as flexibly adding/removing machine

Re: Append to Existing File

2011-06-21 Thread jagaran das
orking but need to know how stable it is to deploy and use in >> production >> clusters ? >> >> Regards, >> Jagaran >> >> >> >> >> From: jagaran das >> To: common-user@hadoop.apache.org >> Sent: Mon

Re: HDFS File Appending URGENT

2011-06-17 Thread jagaran das
t be fixed. - 0.22, 0.23 Not yet released. Regards, Tsz-Wo ____ From: jagaran das To: common-user@hadoop.apache.org Sent: Fri, June 17, 2011 11:15:04 AM Subject: Fw: HDFS File Appending URGENT Please help me on this. I need it very urgently Regard

Fw: HDFS File Appending URGENT

2011-06-17 Thread jagaran das
Please help me on this. I need it very urgently Regards, Jagaran - Forwarded Message From: jagaran das To: common-user@hadoop.apache.org Sent: Thu, 16 June, 2011 9:51:51 PM Subject: Re: HDFS File Appending URGENT Thanks a lot Xiaobo. I have tried with the below code in HDFS version

Re: HDFS File Appending URGENT

2011-06-16 Thread jagaran das
Gu To: common-user@hadoop.apache.org Sent: Thu, 16 June, 2011 8:01:14 PM Subject: Re: HDFS File Appending URGENT You can merge multiple files into a new one; there is no means to append to an existing file. On Fri, Jun 17, 2011 at 10:29 AM, jagaran das wrote: > Is the hadoop version Hadoop

Re: HDFS File Appending URGENT

2011-06-16 Thread jagaran das
From: Xiaobo Gu To: common-user@hadoop.apache.org Sent: Thu, 16 June, 2011 6:26:45 PM Subject: Re: HDFS File Appending please refer to FileUtil.copyMerge On Fri, Jun 17, 2011 at 8:33 AM, jagaran das wrote: > Hi, > > We have a requirement where > >

HDFS File Appending

2011-06-16 Thread jagaran das
Hi, We have a requirement where there would be a huge number of small files to be pushed to HDFS, and we then use Pig to do analysis. To get around the classic "Small File Issue" we merge the files and push a bigger file into HDFS. But we are losing time in this merging process of our pipeline

Re: Append to Existing File

2011-06-13 Thread jagaran das
I am using hadoop-0.20.203.0. I have set dfs.support.append to true and am then using the append method. It is working, but I need to know how stable it is to deploy and use in production clusters. Regards, Jagaran From: jagaran das To: common-user

Append to Existing File

2011-06-13 Thread jagaran das
Hi All, Is appending to an existing file now supported in Hadoop for production clusters? If yes, please let me know which version and how. Thanks, Jagaran

Re: NameNode is starting with exceptions whenever its trying to start datanodes

2011-06-07 Thread jagaran das
start datanodes how shall I clean my data dir ??? Cleaning data dir .. u mean to say is deleting all files from hdfs ???.. is there any special command to clean all the datanodes in one step ??? On Tue, Jun 7, 2011 at 11:46 PM, jagaran das wrote: > Cleaning data from data dir of datanode

Re: NameNode is starting with exceptions whenever its trying to start datanodes

2011-06-07 Thread jagaran das
datanodes >>Sorry I mean Some of your data nodes are not getting connected.. So are you sticking with your solution that you are saying to me.. to go for passwordless ssh for all datanodes.. because for my hadoop.. all datanodes are running fine On Tue, Jun 7, 2011 at 11:32 PM, jagar

Re: NameNode is starting with exceptions whenever its trying to start datanodes

2011-06-07 Thread jagaran das
e have to do passwordless ssh among datanodes also ??? On Tue, Jun 7, 2011 at 11:15 PM, jagaran das wrote: > Check two things: > > 1. Some of your data node is getting connected, that means password less > SSH is > not working within nodes. > 2. Then Clear the Dir where you data

Re: NameNode is starting with exceptions whenever its trying to start datanodes

2011-06-07 Thread jagaran das
Sorry I mean Some of your data nodes are not getting connected From: jagaran das To: common-user@hadoop.apache.org Sent: Tue, 7 June, 2011 10:45:59 AM Subject: Re: NameNode is starting with exceptions whenever its trying to start datanodes Check two things

Re: NameNode is starting with exceptions whenever its trying to start datanodes

2011-06-07 Thread jagaran das
Check two things: 1. Some of your data node is getting connected, that means password less SSH is not working within nodes. 2. Then clear the directory where your data is persisted on the data nodes and format the namenode. It should definitely work then. Cheers, Jagaran __

Re: Reducing Mapper InputSplit size

2011-06-06 Thread jagaran das
Correct, reduce dfs.block.size to increase the number of mappers. - Jagaran From: Mark question To: common-user Sent: Mon, 6 June, 2011 7:31:17 PM Subject: Reducing Mapper InputSplit size Hi, Does anyone have a way to reduce InputSplit size in general ?
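To illustrate why the block size drives the split size, the sketch below mirrors the clamping rule used by Hadoop's `FileInputFormat` in its newer API, as far as I understand it: the split size is the block size, capped by a maximum and floored by a minimum. The function and constant names here are illustrative, not Hadoop identifiers.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    """Clamp the block size between the configured min and max split size."""
    return max(min_size, min(max_size, block_size))


# With the min and max left wide open, the split size equals the block size,
# so a smaller block size yields smaller splits and therefore more mappers.
default_split = compute_split_size(64 * MB, 1, 2**63 - 1)

# Capping the maximum split size also shrinks the splits.
capped_split = compute_split_size(64 * MB, 1, 16 * MB)
```

Note that the block size is a property of each file fixed when it is written, so changing `dfs.block.size` only affects files written afterwards; where a maximum-split-size setting is available, it avoids rewriting existing data.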

Re: Adding first datanode isn't working

2011-06-01 Thread jagaran das
leBii > Thx, already did that > so I can ssh phraseless master to master and master to slave1. > Same as before datanode & tasktracker are starting up/shuting down well on > slave1 > > > > > > 2011/6/1 jagaran das > >> Check the password less

Re: Adding first datanode isn't working

2011-06-01 Thread jagaran das
Check whether passwordless SSH is working. Regards, Jagaran From: MilleBii To: common-user@hadoop.apache.org Sent: Wed, 1 June, 2011 12:28:54 PM Subject: Adding first datanode isn't working Newbie on hadoop clusters. I have setup my two nodes conf as descr

Real Time BI On Hadoop

2011-05-31 Thread jagaran das
Hi All, Please let me know if there is anything with which we can do some basic BI on Hadoop. The idea is that once the raw data is fed into the system, I run some Pig scripts to aggregate the data. Now I need some BI ability to work on these files. Thanks, Jagaran

Re: trying to select technology

2011-05-31 Thread jagaran das
Think of Lucene and Apache SOLR Cheers, Jagaran From: cs230 To: core-u...@hadoop.apache.org Sent: Tue, 31 May, 2011 10:50:49 AM Subject: trying to select technology Hello All, I am planning to start project where I have to do extensive storage of xml and te

Re: Hadoop project - help needed

2011-05-31 Thread jagaran das
Hi, To be very precise, input to the mapper should be something you want to filter on basis of which you want to do the aggregation. The Reducer is where you aggregate the output from mapper. Check the WordCount Example in Hadoop, it can help you to understand the basic concepts. Cheers, Jaga

Re: Poor IO performance on a 10 node cluster.

2011-05-30 Thread jagaran das
Your Font block size got increased dynamically , check in core-site :) :) - Jagaran From: He Chen To: common-user@hadoop.apache.org Sent: Mon, 30 May, 2011 11:39:35 AM Subject: Re: Poor IO performance on a 10 node cluster. Hi Gyuribácsi I would suggest you d

Re: No. of Map and reduce tasks

2011-05-26 Thread jagaran das
the contents by name. But it only created one mapper. How >>> can I change this to distribute accross multiple machines? >>> >>> On Thu, May 26, 2011 at 3:08 PM, jagaran das wrote: >>>> Hi Mohit, >>>> >>>> No of Maps - It depends on what i

Re: No. of Map and reduce tasks

2011-05-26 Thread jagaran das
Hi Mohit, No. of Maps: it depends on Total File Size / Block Size. No. of Reducers: you can specify it. Regards, Jagaran From: Mohit Anchlia To: common-user@hadoop.apache.org Sent: Thu, 26 May, 2011 2:48:20 PM Subject: No. of Map and reduce tasks Ho
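The rule of thumb in this reply can be worked through as follows. This is a rough estimate only: it assumes non-empty, splittable files with one split per block, applies the division per file rather than to the combined size, and ignores details such as the small slack Hadoop allows on a file's last split. The function name is illustrative.

```python
MB = 1024 * 1024


def estimate_map_tasks(file_sizes, block_size):
    """Estimate map tasks as the sum of ceil(file size / block size) per file."""
    # -(-a // b) is integer ceiling division, which avoids float rounding.
    return sum(-(-size // block_size) for size in file_sizes)


# One 200 MB file with 64 MB blocks: 3 full blocks plus a partial one.
single_file = estimate_map_tasks([200 * MB], 64 * MB)

# Splits do not span files, so a 10 MB file still needs its own map task.
two_files = estimate_map_tasks([10 * MB, 130 * MB], 64 * MB)
```

This also suggests why a single small input file produces only one mapper: it fits in one block, so there is only one split to hand out.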