Hi,
Has anybody used the HBase table pool (HTablePool) to connect and load data into an HBase table?
Regards,
JD
R has a connector for Hadoop, if that helps.
From: "jonathan.hw...@accenture.com"
To: common-user@hadoop.apache.org
Sent: Tuesday, 23 August 2011 2:21 PM
Subject: Hadoop integration with SAS
Has anyone worked on Hadoop data integration with SAS?
Does SAS have a c
Hi,
I am newbie in Python
I was looking in to the Python example of running map reduce job of Michael
Noll's article.
I was trying to run this example in CDH3.
The map tasks are running in a loop and the reducer is not running. It is showing:
Map 50%
Map 100%
Map 50%
Map 100%
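For reference, the mapper and reducer in that article boil down to logic like the following (a minimal sketch in plain Python; when the reducer never runs, a crashing or non-executable reducer script is a common culprit, and the failed-task logs usually say why):

```python
from itertools import groupby

def map_words(lines):
    # Mapper: emit (word, 1) per token, like mapper.py in the article.
    for line in lines:
        for word in line.strip().split():
            yield word, 1

def reduce_counts(pairs):
    # Reducer: sum counts per word; assumes input is sorted by key,
    # which Hadoop Streaming's shuffle/sort phase guarantees.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(c for _, c in group)

# Simulate the full streaming pipeline on a tiny input:
lines = ["hello world", "hello hadoop"]
pairs = sorted(map_words(lines))  # stands in for the sort phase
print(dict(reduce_counts(pairs)))  # {'hadoop': 1, 'hello': 2, 'world': 1}
```

When Streaming shows the map percentage bouncing like that, the attempt logs in the JobTracker web UI are usually the quickest way to see whether the reducer script failed to launch.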
Hi,
What is the best and fastest way to achieve a parallel copy into Hadoop from an NFS
mount?
We have a mount with a huge number of files and we need to copy it into HDFS.
Some options:
1. Run copyFromLocal in a multithreaded way
2. Use distcp in an isolated way.
3. Can I write a map-only job to do cop
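A sketch of option 1, hedged: `hadoop fs -copyFromLocal` is the real CLI, but the pool size and the injectable `hadoop_cmd` parameter are illustrative choices to make the sketch testable without a cluster:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def copy_one(path, dest, hadoop_cmd=("hadoop", "fs", "-copyFromLocal")):
    """Copy a single local file into HDFS; returns the exit code."""
    return subprocess.call([*hadoop_cmd, path, dest])

def parallel_copy(paths, dest, workers=8,
                  hadoop_cmd=("hadoop", "fs", "-copyFromLocal")):
    """Run copyFromLocal for many files with a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: copy_one(p, dest, hadoop_cmd), paths))

# Demo with a stand-in command so the sketch runs without a cluster:
print(parallel_copy(["f1", "f2"], "/data", workers=2,
                    hadoop_cmd=("echo", "would copy")))  # [0, 0]
```

Note that each `hadoop fs` invocation pays JVM startup cost, so batching several files per invocation tends to matter more than the thread count for "huge number of files" workloads.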
you have a fast car, you can race and win against a slow
train; it all depends on what reference frame you are in :)
Regards,
Jagaran
From: Michel Segel
To: "common-user@hadoop.apache.org"
Cc: "common-user@hadoop.apache.org" ; jagaran
das
To be precise, the projected data is around 1 PB.
But the publishing rate is also around 1 GBps.
Please suggest.
From: jagaran das
To: "common-user@hadoop.apache.org"
Sent: Wednesday, 10 August 2011 12:58 AM
Subject: Namenode Scalability
In my curre
cycle kicks in.
2. Can we have multiple federated NameNodes sharing the same slaves, so that
we can distribute the writes accordingly?
3. Can multiple HBase region servers help us?
Please suggest how we can design the streaming part to handle such scale of
data.
Regards,
Jagaran Das
Hi,
Please suggest what would be the best way to profile the NameNode.
Any specific tools?
We would be streaming transaction data using around 2000 threads concurrently to
the NameNode, continuously. The size is around 300 KB per transaction.
I am using DataInputStream and writing continuously through each of the 2000
I am keeping a stream open and writing through it using a multithreaded
application.
The application is in a different box and I am connecting to NN remotely.
I was using FileSystem and getting same error and now I am trying DFSClient and
getting the same error.
When I am running it via simple
I am accessing through threads in parallel.
What is the concept of a lease in HDFS?
Regards,
JD
From: Harsh J
To: jagaran das
Sent: Friday, 5 August 2011 11:37 PM
Subject: Re: java.io.IOException: config()
How long are you keeping it open for?
On 06-Aug
ner-1] (RPC.java:230) - Call:
complete 3
Please help, as it is a production enhancement for us.
Regards
Jagaran
From: Harsh J
To: u...@pig.apache.org; jagaran das
Sent: Friday, 5 August 2011 8:54 PM
Subject: Re: java.io.IOException: config()
Could you explain ho
Hi,
I have been stuck with this exception:
java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:99)
at test.Test
Hi,
What is the max number of open connections to a namenode?
I am using
FSDataOutputStream out = dfs.create(src);
Cheers,
JD
What is the difference between the DFSClient protocol and the FileSystem class in
Hadoop DFS (HDFS)? Both of these classes are used for connecting a remote
client to the namenode in HDFS.
I wanted to know the advantages of one over the other, and which one is more
suitable for remote-client connection.
Hi,
Due to requirements in our current production CDH3 cluster we need to copy
around 11520 small size files (Total Size 12 GB) to the cluster for one
application.
Like this, we have 20 applications that would run in parallel.
So one set would have 11520 files with a total size of 12 GB.
Like this we wou
Yeah, that's what we do.
But it's again an extra process; if Hadoop had this ability built in, it would be
great.
It uses log4j; I tried to tweak it, but it is throwing an error.
Regards,
Jagaran
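For what it's worth, log4j 1.x's DailyRollingFileAppender has no built-in retention limit (likely why tweaking it threw errors). A size-based RollingFileAppender at least bounds disk usage; a sketch for hadoop's conf/log4j.properties (the `RFA` appender exists in the stock file, but double-check the names against yours):

```properties
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=15
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

For a strict "last 15 days" policy, the usual workaround is a cron job along the lines of `find $HADOOP_LOG_DIR -mtime +15 -delete` rather than log4j itself.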
From: Michael Segel
To: common-user@hadoop.apache.org
Sent: Sat, 25 June, 2
Hi,
Can I limit the log file retention?
I want to keep files for the last 15 days only.
Regards,
Jagaran
From: Jack Craig
To: "common-user@hadoop.apache.org"
Sent: Wed, 22 June, 2011 2:00:23 PM
Subject: Re: Any reason Hadoop logs cant be directed to a separate f
Puppetize
From: gokul
To: common-user@hadoop.apache.org
Sent: Wed, 22 June, 2011 8:38:13 AM
Subject: Automatic Configuration of Hadoop Clusters
Dear all,
for benchmarking purposes we would like to adjust configurations as well as
flexibly add/remove machine
orking but need to know how stable it is to deploy and use in
>> production
>> clusters ?
>>
>> Regards,
>> Jagaran
>>
>>
>>
>>
>> From: jagaran das
>> To: common-user@hadoop.apache.org
>> Sent: Mon
t be fixed.
- 0.22, 0.23
Not yet released.
Regards,
Tsz-Wo
____
From: jagaran das
To: common-user@hadoop.apache.org
Sent: Fri, June 17, 2011 11:15:04 AM
Subject: Fw: HDFS File Appending URGENT
Please help me on this.
I need it very urgently
Regard
Please help me on this.
I need it very urgently
Regards,
Jagaran
- Forwarded Message
From: jagaran das
To: common-user@hadoop.apache.org
Sent: Thu, 16 June, 2011 9:51:51 PM
Subject: Re: HDFS File Appending URGENT
Thanks a lot, Xiaobo.
I have tried with the below code in HDFS version
Gu
To: common-user@hadoop.apache.org
Sent: Thu, 16 June, 2011 8:01:14 PM
Subject: Re: HDFS File Appending URGENT
You can merge multiple files into a new one; there is no means to
append to an existing file.
On Fri, Jun 17, 2011 at 10:29 AM, jagaran das wrote:
> Is the hadoop version Hadoop
From: Xiaobo Gu
To: common-user@hadoop.apache.org
Sent: Thu, 16 June, 2011 6:26:45 PM
Subject: Re: HDFS File Appending
please refer to FileUtil.CopyMerge
On Fri, Jun 17, 2011 at 8:33 AM, jagaran das wrote:
> Hi,
>
> We have a requirement where
>
>
Hi,
We have a requirement where
there would be a huge number of small files to be pushed to HDFS, and then we use
Pig to do analysis.
To get around the classic "Small File Issue" we merge the files and push a
bigger file into HDFS.
But we are losing time in this merging step of our pipeline
I am using hadoop-0.20.203.0 version.
I have set
dfs.support.append to true and am using the append method.
It is working, but I need to know how stable it is to deploy and use in production
clusters.
Regards,
Jagaran
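For anyone following along, enabling append on the 0.20 line takes an hdfs-site.xml entry like the one below. Note the hedge in the thread is well founded: on 0.20.x the append path was widely considered unstable, which is why the separate branch-0.20-append effort existed.

```xml
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```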
From: jagaran das
To: common-user
Hi All,
Is append to an existing file now supported in Hadoop for production
clusters?
If yes, please let me know which version and how.
Thanks
Jagaran
start
datanodes
How shall I clean my data dir?
By cleaning the data dir, do you mean deleting all files from HDFS?
Is there any special command to clean all the datanodes in one step?
On Tue, Jun 7, 2011 at 11:46 PM, jagaran das wrote:
> Cleaning data from data dir of datanode
datanodes
>> Sorry, I meant some of your data nodes are not getting connected.
So are you sticking with your suggestion to go for
passwordless SSH for all datanodes?
Because on my Hadoop setup, all datanodes are running fine
On Tue, Jun 7, 2011 at 11:32 PM, jagar
e have to do passwordless ssh among datanodes also ???
On Tue, Jun 7, 2011 at 11:15 PM, jagaran das wrote:
> Check two things:
>
> 1. Some of your data nodes are getting connected; that means passwordless
> SSH is
> not working within the nodes.
> 2. Then clear the dir where your data
Sorry, I meant some of your data nodes are not getting connected.
From: jagaran das
To: common-user@hadoop.apache.org
Sent: Tue, 7 June, 2011 10:45:59 AM
Subject: Re: NameNode is starting with exceptions whenever its trying to start
datanodes
Check two things
Check two things:
1. Some of your data nodes are getting connected; that means passwordless SSH is
not working within the nodes.
2. Then clear the dir where your data is persisted on the data nodes and format the
namenode.
It should definitely work then.
Cheers,
Jagaran
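The usual recipe for the passwordless-SSH setup in step 1 looks like this (a sketch: the hostnames and user are placeholders, and keys go to a scratch directory here so re-running never clobbers real ones; on a real cluster you would use ~/.ssh):

```shell
KEYDIR=$(mktemp -d)
# Generate a key with no passphrase
ssh-keygen -t rsa -N "" -q -f "$KEYDIR/id_rsa"
ls "$KEYDIR"   # id_rsa  id_rsa.pub
# Then, per datanode (hostnames are placeholders):
#   ssh-copy-id -i "$KEYDIR/id_rsa.pub" hadoop@datanode1
# Verify: this must log in without prompting for a password:
#   ssh -i "$KEYDIR/id_rsa" -o BatchMode=yes hadoop@datanode1 true
```

The `BatchMode=yes` check is the quick litmus test: it fails immediately instead of falling back to a password prompt.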
__
Correct, reduce dfs.block.size to increase the number of mappers.
- Jagaran
From: Mark question
To: common-user
Sent: Mon, 6 June, 2011 7:31:17 PM
Subject: Reducing Mapper InputSplit size
Hi,
Does anyone have a way to reduce InputSplit size in general ?
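Two knobs worth noting, hedged (availability depends on the Hadoop version and on which InputFormat the job uses): the per-file block size applied when data is written, and the split cap honored by the newer mapreduce FileInputFormat. Both set to 32 MB here as an illustrative value:

```xml
<!-- Smaller HDFS block size for newly written files (32 MB) -->
<property>
  <name>dfs.block.size</name>
  <value>33554432</value>
</property>
<!-- Cap split size for InputFormats that honor it (newer mapreduce API) -->
<property>
  <name>mapred.max.split.size</name>
  <value>33554432</value>
</property>
```

Shrinking dfs.block.size only affects files written after the change; existing files keep their block size, which is why the split-size cap is often the more practical lever.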
leBii
> Thx, already did that,
> so I can ssh passphraselessly from master to master and from master to slave1.
> As before, the datanode & tasktracker are starting up/shutting down well on
> slave1
>
>
>
>
>
> 2011/6/1 jagaran das
>
>> Check the password less
Check whether passwordless SSH is working or not.
Regards,
Jagaran
From: MilleBii
To: common-user@hadoop.apache.org
Sent: Wed, 1 June, 2011 12:28:54 PM
Subject: Adding first datanode isn't working
Newbie on hadoop clusters.
I have setup my two nodes conf as descr
Hi All,
Please let me know if there is anything by which we can do some basic BI
on Hadoop.
The idea is that once the raw data is fed into the system, I run some Pig scripts to
aggregate the data.
Now I need some BI ability to work on these files.
Thanks
Jagaran
Think of Lucene and Apache Solr.
Cheers,
Jagaran
From: cs230
To: core-u...@hadoop.apache.org
Sent: Tue, 31 May, 2011 10:50:49 AM
Subject: trying to select technology
Hello All,
I am planning to start project where I have to do extensive storage of xml
and te
Hi,
To be very precise:
the input to the mapper should be the data you want to filter, on the basis of which
you want to do the aggregation.
The reducer is where you aggregate the output from the mapper.
Check the WordCount example in Hadoop; it can help you understand the basic
concepts.
Cheers,
Jaga
Your font block size got increased dynamically, check in core-site :) :)
- Jagaran
From: He Chen
To: common-user@hadoop.apache.org
Sent: Mon, 30 May, 2011 11:39:35 AM
Subject: Re: Poor IO performance on a 10 node cluster.
Hi Gyuribácsi
I would suggest you d
the contents by name. But it only created one mapper. How
>>> can I change this to distribute across multiple machines?
>>>
>>> On Thu, May 26, 2011 at 3:08 PM, jagaran das
wrote:
>>>> Hi Mohit,
>>>>
>>>> No of Maps - It depends on what i
Hi Mohit,
Number of maps: it depends on the total file size / block size.
Number of reducers: you can specify that.
Regards,
Jagaran
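That rule of thumb can be made concrete. A tiny sketch of how the default split math falls out (ignoring the min-split setting and partial-block edge cases):

```python
import math

def estimated_map_tasks(total_bytes, block_bytes=64 * 1024 * 1024):
    """Roughly one map task per block of input (64 MB was the 0.20-era default)."""
    return math.ceil(total_bytes / block_bytes)

# A 1 GB input with the default 64 MB block size:
print(estimated_map_tasks(1024 * 1024 * 1024))  # 16
```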
From: Mohit Anchlia
To: common-user@hadoop.apache.org
Sent: Thu, 26 May, 2011 2:48:20 PM
Subject: No. of Map and reduce tasks
Ho