Please unsubscribe me from this mailing list!
--
Thanks & Regards,
Sugandha Naolekar
So what I can deduce from this is that isSplitable() only applies to
stream-compressed files. Right?
--
Thanks & Regards,
Sugandha Naolekar
On Wed, Feb 26, 2014 at 2:06 PM, Jeff Zhang jezh...@gopivotal.com wrote:
Hi Sugandha,
Take a gz file as an example. It is not splittable because
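A quick pure-Java illustration of why (java.util.zip, no Hadoop needed): gzip is stream compression, so decompression has to start at byte 0 of the file. A map task handed a split that begins in the middle of a .gz file has no way to synchronize, which is why FileInputFormat.isSplitable() (one "t" in the actual Hadoop method name) returns false for such codecs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipDemo {
    // Compress the whole input as one gzip stream.
    static byte[] gzip(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    // Decompression must begin at byte 0 of the stream; there is no way
    // to start reading at an arbitrary offset, which is why a .gz file
    // is handled by a single mapper.
    static byte[] gunzip(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "record1\nrecord2\nrecord3\n".getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(original);
        System.out.println(java.util.Arrays.equals(gunzip(packed), original)); // → true
    }
}
```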
Oh, OK. Thanks. So, to be on the safe side, one can always set
its value to false and keep the record data consistent. I mean, the
length of all the records should be the same.
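For intuition, here is a small pure-Java sketch (no Hadoop classes) of why equal-length records make splitting trivial: record boundaries become pure arithmetic. Hadoop later shipped a FixedLengthInputFormat built on the same idea.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedLengthRecords {
    // Slice a buffer of fixed-length records into individual records.
    // Because every record is exactly recordLen characters, record i
    // always starts at offset i * recordLen -- no scanning for
    // delimiters needed, so any split whose start is a multiple of
    // recordLen is safe.
    static List<String> slice(String data, int recordLen) {
        if (data.length() % recordLen != 0) {
            throw new IllegalArgumentException("data is not a whole number of records");
        }
        List<String> records = new ArrayList<>();
        for (int off = 0; off < data.length(); off += recordLen) {
            records.add(data.substring(off, off + recordLen));
        }
        return records;
    }

    public static void main(String[] args) {
        // Three 4-character records.
        System.out.println(slice("aaaabbbbcccc", 4)); // → [aaaa, bbbb, cccc]
    }
}
```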
--
Thanks & Regards,
Sugandha Naolekar
On Wed, Feb 26, 2014 at 3:57 PM, Dieter De Witte drdwi
Can I use SequenceFileInputFormat to do the same?
--
Thanks & Regards,
Sugandha Naolekar
On Wed, Feb 26, 2014 at 4:38 PM, Mohammad Tariq donta...@gmail.com wrote:
Since there is no OOTB feature that allows this, you have to write your
custom InputFormat to handle JSON data. Alternatively
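To give a feel for what the core of such a custom reader might do, here is a hedged, Hadoop-free Java sketch (the class and method names are invented for illustration): it groups lines into one record per balanced top-level JSON object, which is the same logic a custom RecordReader would run over the bytes of its split. It is naive in that it ignores braces inside string values.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonRecordGrouper {
    // Group input lines into records, one per top-level JSON object,
    // by tracking the running brace balance. (Naive: braces inside
    // string values would confuse it.)
    static List<String> group(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            for (char c : line.toCharArray()) {
                if (c == '{') depth++;
                if (c == '}') depth--;
            }
            current.append(line);
            if (depth == 0 && current.length() > 0) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("{\"type\":\"Feature\",", "\"id\":1}", "{\"id\":2}");
        System.out.println(group(lines).size()); // → 2
    }
}
```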
Joao Paulo,
Your suggestion is appreciated. On a side note, though: what is more
tedious, writing a custom InputFormat or changing the code which is
generating the input splits?
--
Thanks & Regards,
Sugandha Naolekar
On Wed, Feb 26, 2014 at 8:03 PM, João Paulo Forny jpfo...@gmail.com
be then
read by RecordReader in K,V format.
Please correct me if I don't make sense.
--
Thanks & Regards,
Sugandha Naolekar
On Tue, Feb 25, 2014 at 2:07 PM, Bertrand Dechoux decho...@gmail.com wrote:
The wiki (or Hadoop: The Definitive Guide) are good resources.
https://www.inkling.com
polygon
geometries.
Internally, I would like to read each block of HDFS in such a way that
each polygon geometry is fed to the map() task. Thus, 100 map()
tasks per block per machine.
--
Thanks & Regards,
Sugandha Naolekar
it accordingly by giving relevant K,V pairs to the map
function.
--
Thanks & Regards,
Sugandha Naolekar
On Wed, Feb 26, 2014 at 2:09 AM, Mohammad Tariq donta...@gmail.com wrote:
Hi Sugandha,
Please find my comments embedded below:
No. of mappers are decided as: Total_File_Size
Yes. Got it. Thanks
--
Thanks & Regards,
Sugandha Naolekar
On Tue, Feb 25, 2014 at 10:14 PM, java8964 java8...@hotmail.com wrote:
Hi, Naolekar:
The blocks in HDFS just store bytes. HDFS neither knows nor cares what
kind of data, or how many polygons, are in a block. It just stores 128M
will be considered as a single input split? How will TextInputFormat react
to it?
--
Thanks & Regards,
Sugandha Naolekar
If
the no. of mappers is more than the no. of slaves, what happens?
--
Thanks & Regards,
Sugandha Naolekar
One more thing to ask: no. of blocks = no. of mappers. Thus, the map()
function will be called that many times, right?
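Roughly, yes: one map task per input split (by default one split per block), and the split count is a simple ceiling division. Note that map() itself is then invoked once per record within each split, not once per block. A toy version of the arithmetic, using the old 64 MB default block size:

```java
public class SplitMath {
    // Ceiling division: how many splits of size splitSize are needed
    // to cover fileSize bytes.
    static long numSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;                            // 64 MB, the old default
        System.out.println(numSplits(200L * 1024 * 1024, blockSize));  // → 4
    }
}
```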
--
Thanks & Regards,
Sugandha Naolekar
On Tue, Feb 25, 2014 at 11:27 AM, Sugandha Naolekar
sugandha@gmail.com wrote:
Hello,
As per the various articles I went
the file from 2 blocks? E.g., each feature as seen in the JSON is a
combination of multiple lines. Now, can there be a possibility that
one line of the feature tag is placed in one block of one m/c and the rest of
the lines in another machine's block?
--
Thanks & Regards,
Sugandha Naolekar
18, 2010 at 11:01 AM, Sugandha Naolekar
sugandha@gmail.com wrote:
Now the jar is getting built but when I try to run it, it displays the
following.
bin/hadoop jar Sample.jar
RunJar jarFile [mainClass] args...
Please suggest something... if possible, the procedures I have followed can
I need to execute a code through the prompt of hadoop, i.e. bin/hadoop.
So, I built the jar of it using jar cfmv Jarfile_name Manifest_filename -C
directory_name/ . (in which the jars and class files are added).
After that, I simply execute the code through bin/hadoop Jarfilename.
But, I get an error!
Sugandha
On Fri, Jun 18, 2010 at 7:32 AM, Raghava Mutharaju
m.vijayaragh...@gmail.com wrote:
did you use the following?
bin/hadoop jar JarFilename fullyQualifiedClassName
Raghava.
On Thu, Jun 17, 2010 at 9:21 PM, Sugandha Naolekar
sugandha@gmail.com wrote:
I need to execute a code
, Sugandha Naolekar
sugandha@gmail.com wrote:
Following things I did::
1) I went into the hadoop directory - the path is
/home/hadoop/Softwares/hadoop0.19.0
2) Then I made a folder named Try under the above path. I added all the
jars under the lib directory and the bin folder in which my code lies
Hello!
I have developed a GUI in Java using Swing. Now, I want to perform
simple HDFS operations such as data transfer and retrieval of the same.
All these operations I want to perform on a button click. So, I have
written a simple code in which I have used the FileSystem API and invoked
Hello!
We have a cluster of 5 nodes and we are concentrating on the development of
a DFS (Distributed File System) with the incorporation of Hadoop.
Now, can I get some ideas on how to design package diagrams?
--
Regards!
Sugandha
Hello!
I have a 4 node cluster and one remote machine (3rd party app.) which is
not a part of the hadoop cluster (master-slave configuration).
Now, I want to dump the data from this remote m/c into the hadoop
cluster. But, this data dumping is to be done dynamically. Meaning, let's
say
Hello!
Can I use RMI to dump the files or data from a remote machine into the
hadoop cluster, by executing that code from the local host?
--
Regards!
Sugandha
Hello!
Running a simple MR job, and setting a replication factor of 2. Now,
after its execution, the output is split into files named part-0 and so
on. What I want to ask is: can't we avoid having these keys or key-values
printed in the output files? I mean, I am getting the output in the
Hello!
It takes a lot of time to invoke all the nodes in a cluster and run the
corresponding daemons. Why is it so? It seems to be a very tedious job!
--
Regards!
Sugandha
Hello!
I have 6 nodes and I want to configure them in racks. Below are the details
of the machines:

Name of the machine   IP            Roles played
namenode              10.20.220.30  namenode
jobsec                10.20.220.31  jobtracker and secondary NN
repository1           10.20.220.35  DN and TT-1
repository2
Hello!
I am currently using the 0.19.0 version of hadoop. If I need to upgrade to the
latest one, what am I supposed to do? It should be as painless as possible, as I
have already set up a 6 node cluster running many jobs currently!
--
Regards!
Sugandha
I want to encrypt the data that would be placed in HDFS. So I will have to
use some kind of encryption algorithm, right?
Also, this encryption is to be done on the data before placing it in HDFS. How
can this be done? Are any special APIs available in Hadoop for the above
purpose?
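Hadoop of that era (0.19.x) has no built-in HDFS encryption (transparent encryption arrived much later), so encrypting before upload means using the standard Java crypto APIs and writing the ciphertext bytes into HDFS. A minimal sketch with javax.crypto using AES-GCM; key management and storage are assumptions left to you:

```java
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class EncryptBeforeUpload {
    // Encrypt plaintext with AES-GCM; the 12-byte IV is prepended to
    // the ciphertext so decryption can recover it.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plaintext);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    static byte[] decrypt(SecretKey key, byte[] blob) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key,
               new GCMParameterSpec(128, java.util.Arrays.copyOfRange(blob, 0, 12)));
        return c.doFinal(blob, 12, blob.length - 12);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        byte[] blob = encrypt(key, "secret data".getBytes("UTF-8"));
        // blob is what you would actually write into HDFS.
        System.out.println(new String(decrypt(key, blob), "UTF-8")); // → secret data
    }
}
```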
--
Regards!
Sugandha
I am very sorry for the inconvenience caused. From next time, I will take care
to ask my questions in a precise manner.
On Mon, Aug 3, 2009 at 3:58 PM, Steve Loughran ste...@apache.org wrote:
Sugandha Naolekar wrote:
I want to encrypt the data that would be placed in HDFS. So I
Hello!
I want to know: what's the difference between zipping a file (compressing)
and actually implementing compression algorithms for compressing some sort
of data?
How much difference does it make, and which one is preferable?
I want to compress data to be placed in HDFS.
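On the difference: a zip file is an archive container (file names, directory entries, per-entry metadata) whose entries happen to be Deflate-compressed, while "implementing a compression algorithm" usually just means applying a codec such as Deflate or gzip directly to a byte stream. Both live in the JDK's java.util.zip; a minimal sketch of the raw codec:

```java
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class CodecVsContainer {
    // Raw Deflate: a pure codec, no file names or archive structure.
    // (Output buffer sized generously for this small demo.)
    static byte[] deflate(byte[] data) {
        Deflater d = new Deflater();
        d.setInput(data);
        d.finish();
        byte[] buf = new byte[data.length + 64];
        int n = d.deflate(buf);
        d.end();
        return java.util.Arrays.copyOf(buf, n);
    }

    static byte[] inflate(byte[] data, int originalLen) throws Exception {
        Inflater inf = new Inflater();
        inf.setInput(data);
        byte[] out = new byte[originalLen];
        inf.inflate(out);
        inf.end();
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "aaaaaaaaaaaaaaaaaaaaaaaa".getBytes("UTF-8");
        byte[] packed = deflate(original);
        System.out.println(java.util.Arrays.equals(inflate(packed, original.length), original)); // → true
    }
}
```

A ZipOutputStream would wrap the same Deflate bytes in per-entry headers, which is why "zipping" and "compressing" get conflated.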
Thanking You,
--
Hello!
Can any kind of testing be done on the Namenode and the hadoop cluster in order
to judge and measure the overall performance of the Hadoop cluster, to
prove its power, efficiency and all those quality factors?
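Hadoop ships benchmark jobs for exactly this, e.g. TestDFSIO (HDFS read/write throughput) in the test jar. Under the hood such benchmarks are just timed I/O plus arithmetic; a toy pure-Java sketch of the measurement, with the actual I/O replaced by a buffer fill:

```java
public class ThroughputSketch {
    // Convert a byte count and elapsed nanoseconds into MB/s. In a
    // real benchmark the bytes would be written to or read from HDFS.
    static double mbPerSec(long bytes, long nanos) {
        double mb = bytes / (1024.0 * 1024.0);
        return mb / (nanos / 1e9);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        byte[] buf = new byte[8 * 1024 * 1024];
        java.util.Arrays.fill(buf, (byte) 1);   // stand-in for real I/O
        long elapsed = System.nanoTime() - start;
        System.out.printf("%.1f MB/s%n", mbPerSec(buf.length, elapsed));
    }
}
```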
--
Regards!
Sugandha
Hello!
A few days back, I had asked about the compression of data placed in hadoop. I
did get apt replies, such as:
place the data first in HDFS and then compress it, so that the data would be
in sequence files.
But here my query is: I want to compress the data before placing it in
HDFS, so that
Hello!
Can you precisely tell me about why to use HBase?
Also, as my data is going to increase day by day, will I be able to search
for a particular file or folder present in HDFS efficiently and in a fast
manner?
I have written a small Java code using the FileSystem API of hadoop, and in
-- Forwarded message --
From: Sugandha Naolekar sugandha@gmail.com
Date: Thu, Jul 9, 2009 at 1:41 PM
Subject: how to compress..!
To: core-u...@hadoop.apache.org
Hello!
How can I compress data by using the hadoop APIs?
I want to write a Java code to compress the core files
Hello!
I have a 7 node hadoop cluster!
As of now, I am able to transfer (dump) data into HDFS from a remote
node (not a part of the hadoop cluster). And through the web UI, I am able to
download the same.
- But, if I need to restrict that web UI to a few users only, what am I
supposed to do?
- Also, if
Hello!
How to configure the machines in different racks?
I have in all 10 machines.
Now I want the hierarchy as follows:
machine1
machine2
machine3  -- these are all DN and TT
machine4
machine5  -- JT1
machine7
machine8  -- JT2
machine10 -- NN and Sec. NN
As of now I have 7
Hello!
I have a 4 node cluster of hadoop running. Now, there is a 5th machine which
is acting as a client of hadoop. It's not a part of the hadoop
cluster (master/slave config file). Now I have to write a Java code that
gets executed on this client which will simply put the client system's data
Hi Ashish!
Try the following things:
- Check the config file (hadoop-site.xml) of the namenode.
- Make sure the value you have given for the tag (dfs.datanode.address) is
correct - its IP, and the name of that machine.
- Also, check for the name added in the /etc/hosts file.
- Check for the ssh keys of
Hello!
I want to execute all my code on a machine that's remote (not a part of the
hadoop cluster).
This code includes file transfers between any nodes (remote, within the
hadoop cluster, or within the same LAN) and HDFS. I will have to
simply write a code for this.
Is it possible?
Thanks,
Hello!
Following is the code that's not working:
package data.pkg;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import
Hello!
My files are getting transferred within the cluster of 7 nodes. But if I
try to do the same thing with a host that's remote but within the same
LAN, I am not able to. Basically, how do I specify the path of the
other node in the config file so that it will come to know that
Hello!
I have a 7 node cluster, in which 3 machines are individually dedicated to
the namenode, secondary NN and jobtracker, and the other 4 are datanodes. Now, I
want to transfer files and dump them into HDFS on the click of a button. For
that purpose, I will have to write a code (preferably Java
I have a 7 node cluster.
Now if I ssh to the NN and type in hadoop fs -put /home/hadoop/Desktop/test.java
/user/hadoop -- the file gets placed in HDFS and gets replicated
automatically.
Now, suppose the same file is in one of the datanodes at the same location, and I
want to place it in HDFS through the NN,
Hello!
I am trying to transfer data from a remote node's filesystem to HDFS. But,
somehow, it's not working.!
***
I have a 7 node cluster. Its config file (hadoop-site.xml) is as follows:

<property>
  <name>fs.default.name</name>
Hello!
If I try to transfer a 5GB VDI file from a remote host (not a part of the hadoop
cluster) into HDFS, and get it back, how much time is it supposed to take?
No map-reduce involved. Simply writing files in and out of HDFS through a
simple Java code (usage of the APIs).
--
Regards!
Sugandha
file.
Secura
On Wed, Jun 10, 2009 at 11:56 AM, Sugandha Naolekar
sugandha@gmail.com wrote:
Hello!
If I try to transfer a 5GB VDI file from a remote host(not a part of
hadoop
cluster) into HDFS, and get it back, how much time is it supposed to
take?
No map-reduce involved
If I want to make the data transfer fast, what am I supposed
to do? I want to place the data in HDFS and have it replicated in a fraction of
seconds. Can that be possible, and how? Placing a 5GB file will take at least
half an hour or so... but if it's a large cluster, let's say of 7 nodes,
Hello!
I have a 7 node cluster. But there is one remote node (an 8th machine) within
the same LAN which holds some kind of data. Now, I need to place this data
into HDFS. This 8th machine is not a part of the hadoop
cluster (master/slave) config file.
So, what I have thought is:
- Will get the
make any sense, because the client does everything
for you already.
Hope this clears things up.
Alex
On Fri, Jun 5, 2009 at 12:53 AM, Sugandha Naolekar
sugandha@gmail.com wrote:
Hello!
Placing any kind of data into HDFS and then getting it back, can this
activity be fast? Also
Hello!
As far as I have read in the map-reduce forums, it is basically used to
process large amounts of data speedily, right?
But can you please give me some instances or examples wherein I can use
map-reduce?
--
Regards!
Sugandha
answered your question. Please stop reposting the same
question over and over.
Thanks
-Todd
On Mon, Jun 8, 2009 at 7:05 AM, Sugandha Naolekar sugandha@gmail.com
wrote:
Hello!
I have a 7 node cluster. Now there is an 8th machine (called the remote)
which
will be acting just as a client
Hello!
I have the following queries related to Hadoop:
- Once I place my data in HDFS, it gets replicated and chunked
automatically over the datanodes. Right? Hadoop takes care of all those
things.
- Now, if there is some third party who is not participating in the Hadoop
program. Meaning, he is
I have a 7 node cluster working as of now. I want to place data into
HDFS from a machine which is not a part of the hadoop cluster. How can I do
that? It's, in a way, a remote machine.
Will I have to use an RPC mechanism, or can I simply use the FileSystem API, do
some kind of coding, and make it
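Yes, the plain FileSystem API is enough: the client's Configuration just needs fs.default.name pointed at the namenode, and FileSystem.copyFromLocalFile() does the transfer. Since running that needs a live cluster, the sketch below keeps the Hadoop-specific calls in comments (the hdfs:// address is a made-up example) and shows only the generic client-side copy loop in plain Java:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PutSketch {
    // The generic copy loop a client runs. With Hadoop, `out` would be
    // the stream returned by FileSystem.create(...), or you would skip
    // the loop entirely:
    //   Configuration conf = new Configuration();
    //   conf.set("fs.default.name", "hdfs://namenode:9000"); // hypothetical address
    //   FileSystem fs = FileSystem.get(conf);
    //   fs.copyFromLocalFile(new Path(localSrc), new Path(hdfsDst));
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        java.io.ByteArrayInputStream in = new java.io.ByteArrayInputStream(new byte[1234]);
        java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
        System.out.println(copy(in, out)); // → 1234
    }
}
```

The client machine only needs the Hadoop jars and the cluster's config on its classpath; it does not have to appear in the master/slave files.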
Thanks a lot! Will try it out initially with the machines within the LAN and
then later on with the remote machines.
Will let you know if something gets in my way!
On Fri, Jun 5, 2009 at 3:07 PM, Usman Waheed usm...@opera.com wrote:
I have set up machines just to act as Hadoop clients, which are