Hi there
I have millions of rather small PDF files that I want to load into HDFS for
later analysis. I also need to re-encode them as a base64 stream to get the
MR job for parsing to work.
Is there any better/faster method than just calling the 'put' function in a
huge (bash) loop? Maybe I could
Hi Roger,
you can use Apache Flume to ingest these files into your cluster. Store them
in an HBase table for fast random access and extract the metadata on the
fly using morphlines (see:
http://kitesdk.org/docs/0.11.0/kite-morphlines/index.html). Even the
base64 conversion can be done on the fly if
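For reference, a minimal Flume agent definition for this kind of ingest might
look like the sketch below; the agent and component names, the spool directory,
and the HBase table/column family are made-up examples, and the BlobDeserializer
line assumes you want each PDF delivered as one binary event instead of being
split into text lines:

agent.sources = pdfSpool
agent.channels = mem
agent.sinks = hbaseSink

# Watch a local directory for incoming PDF files.
agent.sources.pdfSpool.type = spooldir
agent.sources.pdfSpool.spoolDir = /data/incoming-pdfs
# Deliver each file as a single binary event (class ships with the morphline sink).
agent.sources.pdfSpool.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder
agent.sources.pdfSpool.channels = mem

agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

# Write the events into an HBase table for fast random access.
agent.sinks.hbaseSink.type = hbase
agent.sinks.hbaseSink.table = pdfs
agent.sinks.hbaseSink.columnFamily = raw
agent.sinks.hbaseSink.channel = mem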
Hello all,
I'm trying to run TestDFSIO against a file system other than the configured
defaultFS, and it doesn't work for me:
$ hadoop org.apache.hadoop.fs.TestDFSIO
-Dtest.build.data=ofs://test/user/$USER/TestDFSIO -write -nrFiles 1
-fileSize 10240
14/10/02 11:24:19 INFO fs.TestDFSIO:
Hello Pramod,
This is great work! Thank you for sharing the report.
Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388
From: Pramod Biligiri pramodbilig...@gmail.com
Reply-To: user@hadoop.apache.org
Date: Thursday, October 2,
Hi
For learning purposes, I am trying to set up my own Hadoop/HDFS system at
home. I am running openSUSE 13 and Hadoop 2.5.1.
I followed the explanations in the Single Node Setup guide:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
My problem is, the data
Hi Jeff. "Wrong FS" means that your configuration doesn't know how to bind the
ofs scheme to the OrangeFS file system class.
You can debug the configuration using fs.dumpConfiguration(), and you will
likely see references to hdfs in there.
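For example, a binding like the one below in core-site.xml is what ties a URI
scheme to an implementation class; the class name here is only a placeholder,
so use whatever your OrangeFS client jar actually ships:

<property>
  <name>fs.ofs.impl</name>
  <value>org.orangefs.hadoop.fs.OrangeFileSystem</value>
  <description>FileSystem implementation backing the ofs:// scheme.</description>
</property>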
By the way, have you tried our Bigtop HCFS tests yet? We now
Hi, I am trying to get updated or newly added data from a relational database
using Sqoop. The Sqoop command works fine on its own, but when I try to
execute it through an Oozie workflow it does not work.
It gives me the error below:
Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
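For comparison, a minimal Sqoop action in an Oozie workflow typically looks
like the sketch below; the JDBC URL, table, and target directory are
placeholders. Exit code [1] usually means the launcher started but Sqoop
itself failed, so the launcher job's stdout/stderr logs are the first place
to look:

<action name="sqoop-import">
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- The entire Sqoop command line goes into a single command element. -->
    <command>import --connect jdbc:mysql://db.example.com/mydb --table orders --incremental append --check-column id --target-dir /user/demo/orders</command>
  </sqoop>
  <ok to="end"/>
  <error to="fail"/>
</action>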
In case anyone wants to know: there was trash in the /tmp dir. I stopped
all nodes, formatted HDFS, and then restarted the nodes. That seems to
have solved the problem.
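For anyone hitting the same thing: with the stock scripts that sequence is
roughly the following (note that formatting erases everything stored in HDFS):

stop-dfs.sh
hdfs namenode -format
start-dfs.sh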
2014-10-02 20:58 GMT+02:00 Roger Maillist darkchanterl...@gmail.com:
Hi folks,
to get the size of an HDFS file, the Java API has
FileSystem#getFileStatus(PATH)#getLen();
Now I am trying to use a C client to do the same thing.
For a file on the local file system, I can grab the size like this:
fseeko(file, 0, SEEK_END);
size = ftello(file);
But I can't find the SEEK_END
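In case it helps: libhdfs exposes the same file status call as the Java API,
so SEEK_END is not needed at all. A minimal sketch, assuming a default
configuration and an example path:

#include "hdfs.h"   /* libhdfs header shipped with Hadoop */
#include <stdio.h>

int main(void)
{
    /* Connect to the NameNode named in the loaded Hadoop configuration. */
    hdfsFS fs = hdfsConnect("default", 0);
    if (fs == NULL) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }
    /* hdfsGetPathInfo is the C counterpart of FileSystem#getFileStatus. */
    hdfsFileInfo *info = hdfsGetPathInfo(fs, "/user/test/some.file");
    if (info != NULL) {
        printf("size: %lld bytes\n", (long long)info->mSize);
        hdfsFreeFileInfo(info, 1);
    }
    hdfsDisconnect(fs);
    return 0;
}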
Hello,
As you suggested, I have changed the hdfs-site.xml file of the datanodes and
the namenode as below and formatted the namenode.
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt</value>
  <description>Comma separated list of paths. Use the list of directories from
$DFS_DATA_DIR.
What is the block placement policy Hadoop follows when rack awareness is not
enabled?
Does it just round-robin?
Thanks.
It’s still random.
If rack awareness is not enabled, all nodes are in “default-rack”, and random
nodes are chosen for block replication.
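(If you want to verify where replicas actually landed, hdfs fsck with the
-files -blocks -racks options prints the nodes and racks per block; the path
below is just an example: hdfs fsck /user/demo -files -blocks -racks.)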
Regards,
Yi Liu
From: SF Hadoop [mailto:sfhad...@gmail.com]
Sent: Friday, October 03, 2014 7:12 AM
To: user@hadoop.apache.org
Subject: Block placement without
It appears to be randomly chosen. I just came across this blog post from
Lars George about HBase file locality in HDFS
http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html
On Thu, Oct 2, 2014 at 4:12 PM, SF Hadoop sfhad...@gmail.com wrote:
Please see http://hadoop.apache.org/mailing_lists.html#User
On Oct 2, 2014, at 7:37 PM, Igor Gatis igorga...@gmail.com wrote:
Thanks for the info. Exactly what I needed.
Cheers.
On Thu, Oct 2, 2014 at 4:21 PM, Pradeep Gollakota pradeep...@gmail.com
wrote: