I do ...
$ ls -l /cs/student/mark/tmp/hodhod
total 4
drwxr-xr-x 3 mark grad 4096 May 24 21:10 dfs
and ..
$ ls -l /tmp/hadoop-mark
total 4
drwxr-xr-x 3 mark grad 4096 May 24 21:10 dfs
$ ls -l /tmp/hadoop-maha/dfs/name/
Only "name" is created here, no "data".
Thanks again,
Mark
On Tue, Ma
Hi All,
I am running a process to extract feature vectors from images and write as
SequenceFiles on HDFS. My dataset of images is very large (~46K images).
The writing process worked fine for half of the dataset, but all of a
sudden the following problem occurred:
org.apache.hadoop.hdfs.protocol.Alre
Do you have the right permissions on the new dirs?
Try stopping and starting the cluster...
-JJ
On May 24, 2011, at 9:13 PM, Mark question wrote:
> Well, you're right ... moving it to hdfs-site.xml had an effect at least.
> But now I'm getting the NameSpace incompatible error:
>
> WARN org.apache.hadoop.hdfs.s
Well, you're right ... moving it to hdfs-site.xml had an effect at least.
But now I'm getting the NameSpace incompatible error:
WARN org.apache.hadoop.hdfs.server.common.Util: Path
/tmp/hadoop-mark/dfs/data should be specified as a URI in configuration
files. Please update hdfs configuration.
java.io.
Try moving the configuration to hdfs-site.xml.
One word of warning, if you use /tmp to store your HDFS data, you risk
data loss. On many operating systems, files and directories in /tmp
are automatically deleted.
-Joey
On Tue, May 24, 2011 at 10:22 PM, Mark question wrote:
> Hi guys,
>
> I'
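For reference, a minimal hdfs-site.xml sketch along the lines Joey suggests
(the /data paths here are hypothetical; the point is file:// URIs on a disk
outside /tmp):

<property>
  <name>dfs.name.dir</name>
  <value>file:///data/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>file:///data/hadoop/dfs/data</value>
</property>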
Hi guys,
I'm using an NFS cluster consisting of 30 machines, but specified only 3 of
the nodes to be my Hadoop cluster. So my problem is this: the datanode won't
start on one of the nodes because of the following error:
org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage
/cs/student/mar
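One common cause of "Cannot lock storage" on a setup like this: when the
storage directory sits on a shared NFS mount, every datanode tries to lock
the same path. A hedged fix is to point dfs.data.dir at a node-local,
non-shared disk in hdfs-site.xml; /local/scratch below is a hypothetical
node-local path:

<property>
  <name>dfs.data.dir</name>
  <value>file:///local/scratch/hadoop/dfs/data</value>
</property>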
Hello,
We currently have a complicated process which has more than 20 jobs piped to
each other.
We are using a shell script to control the flow; I saw another company that
was using Spring Batch. We use Pig, streaming, and Hive.
Note one thing: if you are using EC2 for your jobs, all local files n
Try the Cloudera-specific lists with your questions.
--
Take care,
Konstantin (Cos) Boudnik
2CAC 8312 4870 D885 8616 6115 220F 6980 1F27 E622
Disclaimer: Opinions expressed in this email are those of the author,
and do not necessarily represent the views of any company the author
might be affiliate
Hi Sulabh,
Neither of these nodes has been "productionized" -- so I don't think
anyone would have a good answer for you about what works in
production. They are only available in 0.21 and haven't had any
substantial QA.
One of the potential issues with the BN is that it can delay the
logging of
Thanks, some more questions :)
On Tue, May 24, 2011 at 4:54 PM, Aleksandr Elbakyan wrote:
> Can you please give more info?
>>> We currently have an off-Hadoop process which uses a Java XML parser to convert
>>> it to a flat file. We have files from a couple of KB to tens of GB.
Do you convert it into a flat fi
As far as my understanding goes, I feel that the Backup node is much more
efficient than the Checkpoint node, as it also has the current (up-to-date)
copy of the file system.
I do not understand what the use case would be (in a production environment)
in which someone would prefer the Checkpoint node over the Backu
I looked into different clusters and configurations from Cloudera and came up
with these numbers; let me know what you think...
Machine
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very
Can you please give more info?
>> We currently have an off-Hadoop process which uses a Java XML parser to convert
>> it to a flat file. We have files from a couple of KB to tens of GB.
Do you append multiple XML files' data as lines into one file, or some
other way? If so, how big do you let the files get?
W
Thanks Chris, these are quite helpful.
Thanks,
Tom
On Tue, May 24, 2011 at 11:13 AM, Chris Smith wrote:
> Worth a look at OpenTSDB ( http://opentsdb.net/ ) as it doesn't lose
> precision on the historical data.
> It also has some neat tricks around the collection and display of data.
>
> Anothe
Thanks Luca, but what other way is there to sort a directory of sequence files?
I don't plan to write a sorting algorithm in mappers/reducers; I'm hoping to
use SequenceFile.Sorter instead.
Any ideas?
Mark
On Mon, May 23, 2011 at 12:33 AM, Luca Pireddu wrote:
>
> On May 22, 2011 03:21:53 Mark ques
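A minimal sketch of the SequenceFile.Sorter route (the key/value classes and
paths here are assumptions; adjust to the actual types in the files):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SortSeqFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Assuming Text keys and IntWritable values.
        SequenceFile.Sorter sorter =
            new SequenceFile.Sorter(fs, Text.class, IntWritable.class, conf);
        Path[] inputs = { new Path("in/part-00000"), new Path("in/part-00001") };
        // sort() merges the inputs into a single sorted output file.
        sorter.sort(inputs, new Path("in-sorted/all"), false); // false = keep inputs
    }
}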
On Tue, May 24, 2011 at 4:25 PM, Aleksandr Elbakyan wrote:
> Hello,
>
> We have the same type of data; we currently convert it to a tab-delimited file
> and use it as input for streaming
>
Can you please give more info?
Do you append multiple XML files' data as lines into one file? Or some
other w
Hello,
We have the same type of data; we currently convert it to a tab-delimited file
and use it as input for streaming
Regards,
Aleksandr
--- On Tue, 5/24/11, Mohit Anchlia wrote:
From: Mohit Anchlia
Subject: Processing xml files
To: common-user@hadoop.apache.org
Date: Tuesday, May 24, 2011,
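A minimal sketch of the XML-to-tab-delimited flattening Aleksandr describes,
using StAX (the "records"/"record" element names are assumptions, and leaf
elements are assumed to be text-only):

import java.io.FileInputStream;
import java.io.PrintWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class XmlToTsv {
    public static void main(String[] args) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream(args[0]));
        PrintWriter out = new PrintWriter(System.out);
        StringBuilder row = new StringBuilder();
        while (r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) {
                String name = r.getLocalName();
                if (name.equals("record")) {
                    row.setLength(0); // start a new output line
                } else if (!name.equals("records")) {
                    if (row.length() > 0) row.append('\t');
                    // getElementText() assumes text-only leaf elements
                    row.append(r.getElementText().trim());
                }
            } else if (ev == XMLStreamConstants.END_ELEMENT
                       && r.getLocalName().equals("record")) {
                out.println(row); // one tab-delimited line per record
            }
        }
        out.flush();
    }
}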
Hello,
I want to use a cc1.4xlarge cluster for some data processing; to spin up
clusters I am using the Cloudera scripts. hadoop-ec2-init-remote.sh has default
configurations up to c1.xlarge but no configuration for cc1.4xlarge. Can
someone give the formula for how these values are calculated based on hardwa
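For what it's worth, a back-of-the-envelope sizing rule of thumb (an
assumption on my part, not an official Cloudera formula): concurrent task
slots roughly track the core count, and per-task heap is what is left of RAM
after the OS and Hadoop daemons take their share. For a cc1.4xlarge-like box
(2 quad-core CPUs, 23 GB):

  slots ~= 8 map + 4 reduce = 12 concurrent tasks (about 1.5x the 8 cores)
  heap  ~= (23 GB - ~3 GB for OS/DataNode/TaskTracker) / 12 ~= 1.6 GB

which would translate, hypothetically, into:

  mapred.tasktracker.map.tasks.maximum    = 8
  mapred.tasktracker.reduce.tasks.maximum = 4
  mapred.child.java.opts                  = -Xmx1600m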
I just started learning Hadoop and got done with the WordCount MapReduce
example. I also briefly looked at Hadoop streaming.
Some questions
1) What should be my first step now? Are there more examples
somewhere that I can try out?
2) My second question is around practical usability with XML files. Our
Thanks both for the comments. Even though I finally managed to get the
output file of the current mapper, I couldn't use it, because apparently
mappers use a "_temporary" directory while in progress. So in Mapper.close,
the file it wrote to, e.g. "part-0", does not exist yet.
The
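One hedged option for the above, on the old mapred API: read the part file
from the task's work directory (which sits inside the job's _temporary
attempt directory) rather than from the final output directory. A minimal
sketch; the part file name is illustrative:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class SideFileHelper {
    // Resolve the task's in-flight output file (old mapred API).
    public static Path currentPartFile(JobConf job) throws IOException {
        // In-flight output lives under _temporary/_attempt_...;
        // getWorkOutputPath() resolves that directory for the current task.
        Path work = FileOutputFormat.getWorkOutputPath(job);
        return new Path(work, "part-00000"); // name is illustrative
    }
}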
Worth a look at OpenTSDB ( http://opentsdb.net/ ) as it doesn't lose
precision on the historical data.
It also has some neat tricks around the collection and display of data.
Another useful tool is 'collectl' ( http://collectl.sourceforge.net/ )
which is a lightweight Perl script that
both captur
Ahh, well that's embarrassing and explains the situation where it runs
for many hours.
I am still baffled as to the split on delimiter version timing out,
though.
String line = value.toString();
String[] splitLine = line.split(",");
if (splitLine.length >= 5)
{
    word.set(splitLine[4]); // 5th field; completing the truncated line per the original post
    output.collect(word, one);
}
Hi all,
I had a question regarding the setHosts method of the BlockLocation
class in Hadoop HDFS. Does this cause the block in question to be moved
to the specified host?
Furthermore, where does the getHosts method of block location get the
host names?
Thanks,
George
--
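A small sketch of where getHosts() data comes from on the client side (the
path is hypothetical): the namenode reports, per block, the datanodes holding
replicas, and getFileBlockLocations() wraps that in BlockLocation objects.
As far as I can tell, setHosts() only mutates this client-side value object;
it does not move any block.

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockHosts {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path("/user/george/file.seq"));
        BlockLocation[] blocks = fs.getFileBlockLocations(st, 0, st.getLen());
        for (BlockLocation b : blocks) {
            // Hostnames come from the namenode's replica map for each block.
            System.out.println(b.getOffset() + " -> "
                    + Arrays.toString(b.getHosts()));
        }
    }
}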
itr.nextToken() is inside the if.
On Tue, May 24, 2011 at 7:29 AM, wrote:
>while (itr.hasMoreTokens()) {
>if(count == 5)
>{
>word.set(itr.nextToken());
>output.collect(word, one);
>}
>count++;
> }
>
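The fix being pointed out, as a sketch reusing the variables from the quoted
snippet: hoist nextToken() out of the if so every iteration consumes a token;
otherwise hasMoreTokens() stays true forever and the loop never terminates.

int count = 0;
while (itr.hasMoreTokens()) {
    String token = itr.nextToken(); // consume on every pass, not only when count == 5
    if (count == 5) {
        word.set(token);
        output.collect(word, one);
    }
    count++;
}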
I am attempting to familiarize myself with Hadoop and utilize MapReduce
to process system log files. I tried to start small with a simple
map reduce program similar to the word count example provided. I wanted,
for each line read in, to grab the 5th word as my output key,
Praveenesh,
Ah yes, it would not work on the older 0.20.x releases; the command
exists in the current HBase release.
On Tue, May 24, 2011 at 5:11 PM, praveenesh kumar wrote:
> Hey harsh,
>
> I tried that... it's not working.
> I am using hbase 0.20.6.
> there is no command like bin/hbase classpath
Hey Harsh,
I tried that... it's not working.
I am using HBase 0.20.6.
There is no command like bin/hbase classpath:
hadoop@ub6:/usr/local/hadoop/hbase$ hbase
Usage: hbase <command>
where <command> is one of:
  shell            run the HBase shell
  master           run an HBase HMaster node
  regionserver     run a
Praveenesh,
On Tue, May 24, 2011 at 4:31 PM, praveenesh kumar wrote:
> Hey Harsh,
>
> Actually I mailed the HBase mailing list too, but since I wanted to get
> this done as soon as possible I mailed this group as well.
> Anyway, I will take care of this in future, although I got more
Hey Harsh,
Actually I mailed the HBase mailing list too, but since I wanted to get
this done as soon as possible I mailed this group as well. Anyway, I will
take care of this in future, although I got more responses on this
mailing list :-)
In any case, the problem is solved..
What I d
Praveenesh,
HBase has its own user mailing lists where such queries ought to go.
I am moving the discussion to u...@hbase.apache.org and bcc-ing
common-user@ here. I have also added you to cc.
Regarding your first error, going forward you can use the useful
`hbase classpath` to generate a HBase-provided
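In releases that have it, `hbase classpath` prints the full HBase classpath,
so a client like the one in this thread can be compiled and run roughly like
this (a sketch, paths aside):

$ javac -cp `hbase classpath` ExampleClient.java
$ java -cp `hbase classpath`:. ExampleClient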
Are you sure that the directory where your ExampleClient.class is located is
part of the MYCLASSPATH?
regards
Christian
---8<
Siemens AG
Corporate Technology
Corporate Research and Technologies
CT T DE IT3
Otto-Hahn-Ring 6
81739 München, Deutschland
I am simply using the HBase API, not doing any MapReduce work on it.
Following is the code I have written, simply creating a table on HBase:
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hba
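Since the code above is cut off, here is a minimal sketch of a 0.20-era
client that creates a table (the table and column family names are made up):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ExampleClient {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor("testtable");
        desc.addFamily(new HColumnDescriptor("cf")); // one column family
        if (!admin.tableExists("testtable")) {
            admin.createTable(desc);
        }
    }
}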
How do you execute the client (command line)? Do you use the java or the
hadoop command?
It seems that there is an error in your classpath when running the client job.
The classpath when compiling the classes that implement the client is different
from the classpath when your client is executed, since
Hello guys,
In case any of you are working on HBase: I just wrote a program by reading
some tutorials..
But nowhere is it mentioned how to run code on HBase. In case any of you
has done some coding on HBase, can you please tell me how to run it?
I am able to compile my code by adding hbase-co