Configured Memory Capacity

2011-04-25 Thread maha
n Apr 24 15:50:36 PDT 2011 Can I change the configured capacity ? or is it set up automatically by Hadoop based on available resources? Thanks, Maha

Re: Reading Records from a Sequence File

2011-04-02 Thread maha
from memory .. right? if yes, what parameter is used for the buffer size? Thank you, Maha On Mar 31, 2011, at 11:59 PM, Harsh J wrote: > On Fri, Apr 1, 2011 at 9:00 AM, maha wrote: >> Hello Everyone, >> >>As far as I know, when my java program opens a sequenc
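
A hedged note, since the reply is cut off above: in 0.20-era Hadoop, SequenceFile.Reader reads through a stream whose buffer is sized by io.file.buffer.size, so that is the parameter in question (this is an assumption, not confirmed in the truncated thread). A minimal sketch, with an illustrative path (classes from org.apache.hadoop.conf, org.apache.hadoop.fs, and org.apache.hadoop.io):

    Configuration conf = new Configuration();
    conf.setInt("io.file.buffer.size", 65536); // read buffer in bytes; the usual default is 4096
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path("/tmp/SeqFile"), conf);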

Reading Records from a Sequence File

2011-03-31 Thread maha
with input about 6 MB, but the memory allocated was 13 MB! .. which might be a fragmentation problem, but I doubt it. Thank you, Maha

Re: Map Tasks re-executing

2011-03-30 Thread maha
ase that is shown in the UI .. which I think is supposed to be ... right? Thank you, Maha > Hello, > > My map tasks are freezing after 100% .. I'm suspecting my mapper.close(). > > output is the following: > > 11/03/30 08:13:54 INFO mapred.JobClient: map 9

Map Tasks re-executing

2011-03-30 Thread maha
uce 0% ... Thank you for any thought, Maha

Re: Mappers and RecordReaders

2011-03-22 Thread maha
er get the record-by-record from memory ? Assuming the single-thread mapper class. Thanks, Maha On Mar 22, 2011, at 11:22 AM, Harsh J wrote: > NullOutputFormat

Mappers output collector

2011-03-22 Thread maha
I get: java.io.IOException: Undefined job output-path How can I tell the job configuration not to prepare an output path (or anything produced by output.collect()) ? Thank you, Maha
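
The NullOutputFormat that Harsh names in the reply above is the usual answer here; a minimal sketch (old mapred API; MyJob is a placeholder driver class):

    JobConf conf = new JobConf(MyJob.class);
    // Discards everything passed to output.collect() and skips the
    // output-path check that raises the IOException above.
    conf.setOutputFormat(org.apache.hadoop.mapred.lib.NullOutputFormat.class);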

Re: Check file is sequence?

2011-03-22 Thread maha
The Reader idea worked fine I guess :) Thanks, Maha On Mar 22, 2011, at 1:28 AM, Harsh J wrote: > I do not know of an API-side thing that does this, but basically the > first three bytes of a given sequence file would be 'S', 'E', 'Q' > (which is ch
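
A minimal sketch of that three-byte check (plain java.io; the class name and path handling are illustrative):

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class SeqCheck {
        // True if the file starts with the SequenceFile magic bytes 'S','E','Q'.
        static boolean looksLikeSequenceFile(String path) throws IOException {
            DataInputStream in = new DataInputStream(new FileInputStream(path));
            try {
                byte[] magic = new byte[3];
                in.readFully(magic);
                return magic[0] == 'S' && magic[1] == 'E' && magic[2] == 'Q';
            } finally {
                in.close();
            }
        }
    }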

Check file is sequence?

2011-03-22 Thread maha
Hello, Is there a way to check if a file foo is a Hadoop SequenceFile ? Thanks, Maha

Re: WritableName can't load class ... for custom WritableClasses

2011-03-19 Thread maha
That's absolutely correct :) thanks Simon. Maha On Mar 19, 2011, at 7:13 PM, Simon wrote: > It is hard to judge without the code. But my guess is that your > TermFreqArrayWritable > is not properly compiled or imported into your job control file. > > HTH. > Simon >

WritableName can't load class ... for custom WritableClasses

2011-03-18 Thread maha
and TermFreqArrayWritable is inside the same project under a default package. Has anyone tried their custom Writable with SequenceFiles? Thank you, Maha

Re: Writable Class with an Array

2011-03-17 Thread maha
causes a NullPointerException. > Do you mean I have to use Integer objects instead of primitive types ?? > You can fix this in CreateNewVector(), by explicitly allocating a new > twoInteger object for each location in the "vector" array, or in the > readFields() lo
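
A hedged sketch of the fix being quoted, assuming the array length is written first by write(); TwoInteger and vector are the names from the thread:

    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();          // array length written first by write()
        vector = new TwoInteger[length];
        for (int i = 0; i < length; i++) {
            vector[i] = new TwoInteger();   // explicit allocation avoids the NullPointerException
            vector[i].readFields(in);
        }
    }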

Writable Class with an Array

2011-03-17 Thread maha
Hello, I'm stuck with this for two days now ...I found a previous post discussing this, but not with arrays. I know how to write Writable class with primitive type elements but this time I'm using an ARRAY of primitive type element, here it is in a simplified version for easy readability :

Re: is a single thread allocated to a single output file ?

2011-03-15 Thread maha
I found it :) http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.html Maha On Mar 15, 2011, at 2:18 PM, maha wrote: > By the way, how do I know if my map task is single threaded (ie. one thread > executing for each record ) ? and
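
For reference, a minimal sketch of wiring up the class from that link (new mapreduce API; MyRealMapper is a placeholder). By default a map task runs its records through a single thread; MultithreadedMapper wraps the real mapper in a thread pool:

    Job job = new Job(new Configuration(), "multithreaded maps");
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, MyRealMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 4); // threads per map task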

Re: is a single thread allocated to a single output file ?

2011-03-15 Thread maha
By the way, how do I know if my map task is single threaded (ie. one thread executing for each record ) ? and how to change that into multi-threading ? Thank you, Maha On Mar 12, 2011, at 9:11 PM, Harsh J wrote: > Hello, > > On Sat, Mar 12, 2011 at 3:51 PM, Jun Young Kim wro

Re: problem in running mapreduce task

2011-03-14 Thread maha
Try running ... $bin/hadoop dfs -lsr / To view your HD-fileSystem ... do you see your input file in there? Maha On Mar 14, 2011, at 10:46 AM, vishalgoyal wrote: > hello, > > i am new user to hadoop. when i tried to run a task, it successfully > compiled my file wo

Re: Reading SequenceFileAsBinaryOutputFormat

2011-03-14 Thread maha
I'd better restate my problem as it turns out to be my SequenceFile.Writer. Thanks everyone, Maha On Mar 14, 2011, at 10:29 AM, maha wrote: > Hi, > > I'm using SequenceFileAsBinaryOutputFormat to write the job output. Both > Reduce key,value are of type BytesWritabl

Reading SequenceFileAsBinaryOutputFormat

2011-03-14 Thread maha
adoop/hadoop-0.20.2/SeqFile at 0 By the way, I "-copyToLocal" the output file then try to read it using the SequenceFile.reader. Any idea is appreciated. Maha

Re: Open HDFS in mappers

2011-03-10 Thread maha
rcome this problem, which I don't appreciate :( If you have any other idea, let me know. Thank you, Maha On Mar 10, 2011, at 6:44 PM, Harsh J wrote: > Once you have a JobConf/Configuration conf object in your Mapper (via > setup/configure methods), you can do the following to get the
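
A minimal sketch of what Harsh describes (old mapred API): build the FileSystem from the JobConf handed to configure(), so the mapper sees the same fs.default.name as the driver:

    public void configure(JobConf conf) {
        try {
            FileSystem fs = FileSystem.get(conf); // same HDFS as in main(), not a local default
            // ... open side files on HDFS here ...
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }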

Open HDFS in mappers

2011-03-10 Thread maha
?The answer is NO. After checking, I realized that my mapper HDFS isn't the same as the hdfs in my main function. How can I open the same HDFS in maps as the one used in main? Thank you, Maha

Re: Binary Input Files

2011-03-09 Thread Maha
So you're suggesting that using HBase will be an alternative to creating my own stuff?!! By the way, why don't you use Binary inputs? Do you think it's not going to have a great effect on performance? Thanks Mike. On Mar 9, 2011, at 5:27 PM, Michael Segel wrote: > > &

Binary Input Files

2011-03-09 Thread maha
eliminate the benefits of using Binary files. If I decided to write my own InputFormat that defines Splits based on my binary protocol, and a RecordReader also based on my binary protocol, will that interfere with the streaming stuff? Or is it doable? Thank you, Maha

Re: Using SequenceFile instead of TextFiles

2011-03-04 Thread maha
Thanks again Harsh, I actually got the book 2 days ago, but didn't have time to read it yet. Maha On Mar 4, 2011, at 7:54 PM, Harsh J wrote: > Hi, > > On Sat, Mar 5, 2011 at 9:03 AM, maha wrote: >> Hi, >> >> I have 2 questions: >> >> 1) Is a Se

Using SequenceFile instead of TextFiles

2011-03-04 Thread maha
InputFormat, Do I need to stick to the header protocol defined in http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html ? Thanks everyone, Maha

Do Mappers run on different machines?

2011-03-03 Thread maha
Hi, Using 3 Machines, each has an input-File ' f ' in its local disk in addition to HDFS , assuming my program spawns a mapper/file . Does that mean that mappers will be running on different machines? Thank you, Maha

Re: Specify File Name to mappers

2011-03-03 Thread maha
"map.input.file"); } :) Maha On Mar 2, 2011, at 9:03 PM, maha wrote: > Thanks Harsh!!! but you don't think there is another way for the mappers to > get configuration and access 'map.input.file' because I didn't write my > recordReader. > > I truly apprecia

Re: Specify File Name to mappers

2011-03-02 Thread maha
Thanks Harsh!!! but you don't think there is another way for the mappers to get configuration and access 'map.input.file' because I didn't write my recordReader. I truly appreciate it. Maha On Mar 2, 2011, at 8:09 PM, Harsh J wrote: > The property 'map.input
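
A minimal sketch of reading that property without writing a RecordReader (old mapred API), via the mapper's configure() hook:

    private String inputFile;

    public void configure(JobConf job) {
        inputFile = job.get("map.input.file"); // full path of the file this map task is reading
    }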

Specify File Name to mappers

2011-03-02 Thread maha
Hi, If FileInputFormat is used with File.splitable(false) then each mapper will be getting a full file. I want the mapper to also know the path or at least name of the file it's assigned to. Please help, any ideas are appreciated. Thank you, Maha

Re: ToolRunner run function

2011-03-02 Thread maha
On a pseudo distributed mode, it actually just "move" the copy and not reproduce it :) Thanks anyways, Maha On Mar 2, 2011, at 1:04 PM, maha wrote: > Thanks Mike :) > > I was also wondering what if: > > hdfs.CopyToLocal( src-file, dst-file) ; // is executed

Re: ToolRunner run function

2011-03-02 Thread maha
just move that copy to dst-file path ? OR Will hdfs go ahead with the copy and hence node N will have two copies of the src-file? (ie. one on HDFS namespace and another in the local file system) Thanks, Maha On Mar 2, 2011, at 12:38 PM, Michael Segel wrote: > > > Run is local to

ToolRunner run function

2011-03-02 Thread maha
Hi, Assuming my program implements the ToolRunner, my question is where does the "run" function execute? ie. which daemon (DataNode/TT) ? or is it on the local machine where it is run? Thank you, Maha

why are they different?

2011-02-27 Thread maha
Hi, Is it right that Map output bytes and Map FILE_BYTES_WRITTEN differ a little because of the serialization used to store them in sequence files? Thank you, Maha

Re: Bug in my configurations, help!

2011-02-26 Thread maha
0 9 9 Thanks again, Maha On Feb 26, 2011, at 8:45 PM, maha wrote: > Ok got this point, thanks Harsh. But my experiment now is to eliminate # of > spilled records for this small light job. > > This part of the map log: > 2011-02-26 16:05:35,307 INFO org.apache.had

Re: Bug in my configurations, help!

2011-02-26 Thread maha
rsh J wrote: > Hello, > > On Sun, Feb 27, 2011 at 9:30 AM, maha wrote: >> 2011-02-26 16:05:35,571 INFO org.apache.hadoop.mapred.MapTask: Finished >> spill 0 <--- WHY IS THIS ZERO WHEN FINAL JOB COUNTER >> SAYS IT'S 9 SPILLED RECORDS FROM MA

Bug in my configurations, help!

2011-02-26 Thread maha
the factor and the sort.mb parameters but no way. Is that how it's supposed to be ??? Please any idea would be helpful. Thank you, Maha

Re: Lost in HDFS_BYTES_READ/WRITTEN

2011-02-25 Thread maha
ance created is distributed. But the jobCounters never use it for intermediate results (e.g. for reducers to read map outputs). So if you can answer my question further, I truly appreciate it! Maha On Feb 25, 2011, at 12:00 PM, Harsh J wrote: > From what I could gather, all FileSystem instanc

Lost in HDFS_BYTES_READ/WRITTEN

2011-02-25 Thread maha
user to see. Thank you in advance, Maha

Re: File size shown in HDFS using "-lsr"

2011-02-24 Thread maha
and other times they're different (nothing else was changed). Please any explanation is appreciated ! Thank you, Maha On Feb 24, 2011, at 11:00 AM, maha wrote: > Silly question.. > > > bin/hadoop dfs -

File size shown in HDFS using "-lsr"

2011-02-24 Thread maha
has a size of 83 bytes?? Thanks, Maha

Re: Current available Memory

2011-02-24 Thread maha
Hi Yang, The problem could be solved using the following link: http://www.roseindia.net/java/java-get-example/get-memory-usage.shtml You need to use other memory managers like the Garbage collector and its finalize method to measure memory accurately. Good Luck, Maha On Feb 23, 2011
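
A hedged sketch of the tweak being discussed: hint a garbage collection before each sample so freeMemory() is less noisy. It is still only approximate; MyObject is a placeholder:

    Runtime rt = Runtime.getRuntime();
    rt.gc();
    long before = rt.totalMemory() - rt.freeMemory();
    MyObject obj = new MyObject();
    rt.gc();
    long after = rt.totalMemory() - rt.freeMemory();
    System.out.println("approximate bytes used: " + (after - before));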

Re: Current available Memory

2011-02-23 Thread maha
Based on the Java function documentation, it gives approximately the available memory, so I need to tweak it with other functions. So it's a Java issue not Hadoop. Thanks anyways, Maha On Feb 23, 2011, at 6:31 PM, maha wrote: > Hello Everyone, > > I'm using &

Current available Memory

2011-02-23 Thread maha
Hello Everyone, I'm using " Runtime.getRuntime().freeMemory()" to see current memory available before and after creation of an object, but this doesn't seem to work well with Hadoop? Why? and is there another alternative? Thank you, Maha

Re: Spilled Records

2011-02-22 Thread maha
Thanks a bunch Saurabh! I'd better start optimizing my code then :) Maha On Feb 22, 2011, at 3:26 PM, Saurabh Dutta wrote: > Even if you have 4 GB RAM you should be able to optimize spills. I don't > think it should be an issue. What you need to do is write the program &

Re: Spilled Records

2011-02-21 Thread maha
mory being 4GB ?? I'm using the pseudo distributed mode. Thank you, Maha On Feb 21, 2011, at 7:46 PM, Saurabh Dutta wrote: > Hi Maha, > > The spilled record has to do with the transient data during the map and > reduce operations. Note that it's not just the

Spilled Records

2011-02-21 Thread maha
? Might changing io.sort.record.percent to .9 instead of .8 produce unexpected exceptions? Thank you, Maha
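
For context, a sketch of the 0.20-era knobs this thread is tuning, shown with their usual defaults (the values here are illustrative, not a recommendation):

    conf.setInt("io.sort.mb", 100);                 // map-side sort buffer, in MB
    conf.setFloat("io.sort.record.percent", 0.05f); // share of the buffer reserved for record metadata
    conf.setFloat("io.sort.spill.percent", 0.80f);  // fill level that triggers a spill to disk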

Re: Quick question

2011-02-21 Thread maha
How then can I produce an output/file per mapper, not per map-task? Thank you, Maha On Feb 20, 2011, at 10:22 PM, Ted Dunning wrote: > This is the most important thing that you have said. The map function > is called once per unit of input but the mapper object persists for > many input

Re: Quick question

2011-02-21 Thread maha
Thanks for your answers Ted and Jim :) Maha On Feb 21, 2011, at 6:41 AM, Jim Falgout wrote: > You're scenario matches the capability of NLineInputFormat exactly, so that > looks to be the best solution. If you wrote your own input format, it would > have to basically do what NL

Re: Quick question

2011-02-20 Thread maha
Yet the map-function was processed 16 times as described by the NLineInputSplit. I want the map-function to be one for the whole inputSplit of 5 Lines and not for each of the 16 lines. Any ideas other than building my own inputFormat? Thank you, Maha On Feb 20, 2011, at 11:59 AM, maha

Re: Quick question

2011-02-20 Thread maha
setInt("mapred.line.input.format.linespermap", 5); // # of lines per mapper = 5 If you have any thoughts on whether the above solution is worse than writing my own InputSplit of about 5 lines, let me know. Thanks everyone! Maha On Feb 20, 2011, at 11:47 AM, maha wrote: > Hi again J
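
The full configuration that snippet belongs to looks roughly like this (old mapred API; MyJob is a placeholder driver class):

    JobConf conf = new JobConf(MyJob.class);
    conf.setInputFormat(org.apache.hadoop.mapred.lib.NLineInputFormat.class);
    conf.setInt("mapred.line.input.format.linespermap", 5); // 5 lines per map task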

Re: Quick question

2011-02-20 Thread maha
like map1 has 8 lines and map2 has 8 lines. So first question: is there a difference between Mappers and maps ? Second: Does that mean I need to write my own inputFormat to make the InputSplit equal to multipleLines ??? Thank you, Maha On Feb 18, 2011, at 11:55 AM, Jim Falgout wrote

Re: Quick question

2011-02-18 Thread maha
Thanks Ted and Jim :) Maha On Feb 18, 2011, at 11:55 AM, Jim Falgout wrote: > That's right. The TextInputFormat handles situations where records cross > split boundaries. What your mapper will see is "whole" records. > > -Original Message- > From: ma

Quick question

2011-02-18 Thread maha
that right? Thank you, Maha

custom designed profiling

2011-02-17 Thread maha
opening a file. Is there a faster way to do that such as background loggers saving mappers output ?? Thank you, Maha

Check/Compare mappers output

2011-02-14 Thread maha
'+word1.substring(word1.indexOf(',')+1, word1.indexOf('>'))+'#')); Yet the intermediate output still includes "d1": #d1##1# #d1##2# #d1##1# #d1##5# #d1##3# .. I put '#' to see if there was a space or newline included. Any ideas? Thank you, Maha

Re: Quick Question: LineSplit or BlockSplit

2011-02-07 Thread maha
Thanks Ted. Then I have to write my own InputFormat to read a block of lines per mapper. NLineInputFormat didn't work for me; any working example of it would be appreciated. Thanks again, Maha On Feb 7, 2011, at 6:32 PM, Mark Kerzner wrote: > Thanks! > Mark > > On Mo

Quick Question: LineSplit or BlockSplit

2011-02-07 Thread maha
) will be slower because of scheduling time ? Thank you, Maha

Re: Mappers reading from a Global inverted Index

2011-02-07 Thread maha
Thanks Ted, I needed to know that there is no way I can make my program less IO-intensive. Maha On Feb 7, 2011, at 12:04 PM, Ted Dunning wrote: > That isn't going to happen. > > Remember that all of the mappers are running in different JVM's on > (typically) different

Re: Mappers reading from a Global inverted Index

2011-02-07 Thread maha
My question is simply how to have a global variable (eg. HashTable) in hadoop ? To be available for all mappers. Please help, Thank you, Maha On Feb 7, 2011, at 11:21 AM, maha wrote: > Thanks Vijay, now my question is how can I build one inverted index and have > it ready to be acces
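
There is no shared variable across mapper JVMs, as the reply above says; a common substitute (not from this thread) is to ship a read-only copy to every task with DistributedCache and rebuild the table in configure(). A hedged sketch with an illustrative path:

    // Driver side: register the file (needs java.net.URI).
    DistributedCache.addCacheFile(new URI("/user/maha/index.dat"), conf);

    // Mapper side, in configure(): read the local copy once per task JVM.
    Path[] cached = DistributedCache.getLocalCacheFiles(conf);
    // ... parse cached[0] into the in-memory table here ...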

Mappers reading from a Global inverted Index

2011-02-07 Thread maha
Null. Any help is appreciated , Maha Depending on the scale of data, between the two, it would be best stored in hdfs , and use the built-in InputFormat-s , as that is more scalable. If necessary, (depending on how the data is stored), build a custom InputFormat, as per the API and set it

Mapper reading from local directory or global variable?

2011-02-06 Thread maha
until now is around 1000 mappers. Appreciate any thought :) Thank you, Maha

Namenode capabilities

2011-01-18 Thread maha
is still vague, is there a way to skip reading a specific disk-block ? Thanks, Maha

Re: Optimization

2011-01-15 Thread maha
Forgot to mention that, the benchmark is for hadoop so any parallel system optimization provided by hadoop is appreciated. Maha On Jan 15, 2011, at 11:25 AM, maha wrote: > Hi, > > I'm preparing a benchmark and would like to know how to best optimize my > java program (ign

Optimization

2011-01-15 Thread maha
Hi, I'm preparing a benchmark and would like to know how to best optimize my java program (ignoring the IO/time). Any links to read from? Or has anyone tried the Java-Optimizer-and-Decompile-Environment (JODE)? Thanks in advance, Maha

Re: ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Datanode state: LV = -19 CTime = 1294051643891 is newer than the namespace state: LV = -19 CTime = 0

2011-01-09 Thread maha
I also use another solution for the namespace incompatibility which is to run : rm -Rf /tmp/hadoop-/ * then format the namenode. Hope that helps, Maha On Jan 9, 2011, at 9:08 PM, Adarsh Sharma wrote: > Shuja Rehman wrote: >> hi >> >> i have format the name node a

Re: Accessing HDFS

2011-01-07 Thread maha
Nice ! I'd better try that. So the trick is only to add "hdfs" to the path to access that namespace. Thanks a ton :) Maha On Jan 7, 2011, at 1:55 PM, Jacob R Rideout wrote: >> I'm wondering if there is a way to doing the following commands to HDFS >>

Accessing HDFS

2011-01-07 Thread maha
Hi everyone, I'm wondering if there is a way to doing the following commands to HDFS ... File LocalinputDir = new File ("/user/maha/inputDir"); String[] file = LocalinputDir.list(); I'm given Hadoop and input directory with files {f1,f2 ..}. I wo
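
The HDFS equivalent of that File.list() call is FileSystem.listStatus(); a minimal sketch:

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FileStatus[] files = fs.listStatus(new Path("/user/maha/inputDir"));
    for (FileStatus status : files) {
        System.out.println(status.getPath().getName()); // f1, f2, ...
    }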

Re: where can I see those email answers?

2011-01-03 Thread maha
Never mind. I just saw the left tags on the side of the page in question found in "Search Hadoop" site. Thanks all, Maha On Jan 3, 2011, at 11:29 AM, maha wrote: > Hi, > > I remember discussing the following error one time, but when I searched for > it I can

where can I see those email answers?

2011-01-03 Thread maha
object heap Could not create the Java virtual machine. Thank you, Maha

Re: Flow of control

2010-12-30 Thread maha
Very helpful :) thanks Ping. Maha On Dec 30, 2010, at 6:13 PM, li ping wrote: > On Fri, Dec 31, 2010 at 9:28 AM, maha wrote: > >> Hi, >> >> (1) I declared a global variable in my hadoop mainClass which gets >> initialized in the 'run' function of this

Flow of control

2010-12-30 Thread maha
nning before the maps. My question is in which node? The JobTracker node? Thank you, Maha

Re: Retrying connect to server

2010-12-30 Thread maha
Hi Cavus, Please check that hadoop JobTracker and other daemons are running by typing "jps". If you see one of (JobTracker,TaskTracker,namenode,datanode) missing then you need to 'stop-all' then format the namenode and start-all again. Maha On Dec 30, 2010, at 7:52 A

Re: UI doesn't work

2010-12-28 Thread maha
hadoop daemons. Isn't this a clean start?? Maha On Dec 28, 2010, at 6:02 PM, Sudhir Vallamkondu wrote: > I recently had this issue. UI links were working for some nodes meaning when > I go to dfsHealth.jsp page and following some cluster data node links some > would work and some wo

Re: help for using mapreduce to run different code?

2010-12-28 Thread maha
Hi Jander, You mean write Map in another language? like python or C, then yes. Check this http://hadoop.apache.org/common/docs/r0.18.0/streaming.html for Hadoop Streaming. Maha On Dec 28, 2010, at 2:53 PM, Jander g wrote: > Hi, all > > Whether Hadoop supports the map functio

Re: UI doesn't work

2010-12-28 Thread maha
adoop.mapred.JobTracker: Initializing job_201012281415_0001 2010-12-28 14:18:29,386 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201012281415_0001 2010-12-28 14:18:29,585 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201012281415_0001 = 459393. Number

Re: UI doesn't work

2010-12-28 Thread maha
Hi James, I'm accessing ---> http://speed.cs.ucsb.edu:50030/ for the job tracker and port 50070 for the name node, just like the Hadoop quick start. Did you mean to change the port in my mapred-site.xml file? (mapred.job.tracker = speed.cs.ucsb.edu:9001) Maha On Dec

RE: UI doesn't work

2010-12-28 Thread maha
that? Harsh said: Did you do any ant operation on your release copy of Hadoop prior to starting it, by the way? NO, I get the following error: BUILD FAILED /cs/sandbox/student/maha/hadoop-0.20.2/build.xml:316: Unable to find a javac compiler; com.sun.tools.javac.Main is not o

UI doesn't work

2010-12-27 Thread maha
acker speed.cs.ucsb.edu:9001 when I try to open: http://speed.cs.ucsb.edu:50070/ I get the 404 Error. Any ideas? Thank you, Maha

Re: Needs a simple answer

2010-12-18 Thread maha
tions: FileSystem fs = FileSystem.get(new Configuration())// it worked ! Any reason for that? Thank you, Maha On Dec 17, 2010, at 2:59 PM, Peng, Wei wrote: > > You can put your local file to distributed file system by hadoop fs -put > localfile DFSfile. >
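
Putting the two messages of this thread together, the working form is roughly:

    FileSystem hdfs = FileSystem.get(new Configuration()); // the variant that worked
    hdfs.copyFromLocalFile(new Path("/Users/file"), new Path("/tmp/file"));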

Re: Needs a simple answer

2010-12-17 Thread maha
( eg. split1: /tmp/f1, split2:/tmp/f2 split4: /tmp/f4) instead I want -> ( split1: content of file1 , ). Thank you, Maha On Dec 16, 2010, at 2:49 PM, Ted Dunning wrote: > Maha, > > Remember that the mapper is not running on the same machine as the main > clas

Needs a simple answer

2010-12-16 Thread maha
unt.myconf); hdfs.copyFromLocalFile(new Path("/Users/file"), new Path("/tmp/file")); }catch(Exception e) { System.err.print("\nError");} Also, the print statement will never print on console unless it's in my run function.. Appreciate it :) Maha

Re: Deprecated ... damaged?

2010-12-15 Thread maha
Hi Allen, and thanks for responding .. Your answer actually gave me another clue: I set numSplits = numFiles*100; in myInputFormat and it worked :D ... Do you think there are side effects to doing that? Thank you, Maha On Dec 15, 2010, at 12:16 PM, Allen Wittenauer

Re: Deprecated ... damaged?

2010-12-15 Thread maha
Actually, I just realized that numSplits can't be modified "definitely". Even if I write numSplits = 5, it's just a hint. Then how come MultiFileInputFormat claims to use MultiFileSplit to contain one file/split ?? or is that also just a hint? Maha On Dec 15, 2010, at

Deprecated ... damaged?

2010-12-15 Thread maha
new myRecordReader((MultiFileSplit) split)); } Yet, in myRecordReader, for example one split has the following; " /tmp/input/file1:0+300 /tmp/input/file2:0+199 " instead of each line in its own split. Why? Any clues? Thank you, Maha

Re: Error: ... It is indirectly referenced from required .class files - implements

2010-12-12 Thread maha
Thanks for the advice Harsh! This worked :) Maha On Dec 11, 2010, at 8:48 PM, Harsh J wrote: > Try adding the commons-logging jar to your build path. It is available > in the lib/ folder of your Hadoop distribution. > > If you use the MapReduce eclipse plugin which comes wit

Re: Error: ... It is indirectly referenced from required .class files - implements

2010-12-11 Thread maha
ugh, I still appreciate your thoughts, thanks, Maha What I do is create a new pro On Dec 11, 2010, at 6:27 PM, li ping wrote: > Can you try to add the jar file in your Hadoop lib directory. > > On Sun, Dec 12, 2010 at 8:00 AM, Maha A. Alabduljalil > wrote: > >> >>

Error: ... It is indirectly referenced from required .class files - implements

2010-12-11 Thread Maha A. Alabduljalil
return (new LineRecordReader()); } } Can someone guide me to how to solve this in a different way (ie.Make each input file unSplittable) ... or how to add the required missing log.class? Thank you so much, Maha
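
A minimal sketch of the "make each input file unsplittable" route (old mapred API); the class name is illustrative:

    public class WholeFileTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(FileSystem fs, Path file) {
            return false; // each mapper receives one whole file
        }
    }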

Re: InputSplit is confusing me .. Any clarifications ??

2010-11-27 Thread maha
der. > > How can I change this property to be FileInputSpilt and Record is the whole > File ? > > something like JobConf.set ("File.input.format","FileInptSplit"); > > Is there such way? > > Thanks in advance, > Maha >

Re: InputSplit is confusing me .. Any clarifications ??

2010-11-27 Thread maha
"File.input.format","FileInptSplit"); Is there such a way? Thanks in advance, Maha On Nov 26, 2010, at 9:09 PM, li ping wrote: > org.apache.hadoop.mapreduce.lib.input.TextInputForma

InputSplit is confusing me .. Any clarifications ??

2010-11-26 Thread maha
(); How did we know that map in this case is taking a line and not the whole input document ? Happy Thanksgiving everyone, Maha

Re: ask problem

2010-11-26 Thread maha
A much easier way is to use the open-source wordcount.java example and give it an input directory including all the text files. This will output one text file containing all the words and their frequencies from all the files. Maha On Nov 25, 2010, at 1:31 PM, Tri Doan wrote: > Thursday 25

Re: repeat a job for different files

2010-11-18 Thread maha
FileOutputFormat) ? Maha On Nov 17, 2010, at 10:11 PM, Alex Baranau wrote: > In case you need to process the files separately, use one MR job for each > file. You can add a single file as input. I believe you'll need to iterate > over all files in input dir and start job instance for

repeat a job for different files

2010-11-17 Thread maha
for fileN.txt Thanks, Maha

Re: JobConf

2010-11-15 Thread maha
That is exactly what I needed :) Thanks again Alex, Maha On Nov 14, 2010, at 9:54 PM, Alex Baranau wrote: > You might find this search tool valuable: http://search-hadoop.com. You can > do search in sources and javadocs separately. > > Alex Baranau > > Sematext :: h

Re: JobConf

2010-11-14 Thread maha
Never mind Jeff ... I guess your answer would be to read Hadoop manual pages and to keep practicing Java programming! Because I'm trying to write a Hadoop program and it's taking me time to know which class to use for myPurpose. So thanks anyways, Maha On Nov 11, 2010,

Re: JobConf

2010-11-11 Thread maha
Thanks Jeff :) How could you recall all possible readInputFile methods from different classes? Is there a special way to search the Java APIs? Maha On Nov 10, 2010, at 5:13 PM, Jeff Zhang wrote: > Use FileInputFormat.setInputPaths > > > > On Thu, Nov 11, 2010 at 5:45
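
A minimal sketch of Jeff's FileInputFormat.setInputPaths suggestion (old mapred API; MyJob and the paths are placeholders):

    JobConf conf = new JobConf(MyJob.class);
    FileInputFormat.setInputPaths(conf, new Path("/user/maha/input"));
    FileInputFormat.addInputPath(conf, new Path("/user/maha/more-input")); // append another path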

JobConf

2010-11-10 Thread maha
can I add it to the list ? I couldn't even edit the JobConf.class because the source code is unavailable. any link to where is this issue handled ? Thanks, Maha

intermediate values of Maps

2010-11-09 Thread maha
intermediate values? Thanks, Maha

Re: Not able to execute MaxTemperature example

2010-10-19 Thread Maha A. Alabduljalil
Hi Rohit, I really learned a lot from this link: http://www.infosci.cornell.edu/hadoop/windows.html Maha Quoting Rohit Mishra : I need clarification on how to run a Hadoop program. I am getting a ClassNotFoundException error when I try to run the test example given in the book [Ch 2]. Do

Re: Question on distributed hadoop setup

2010-10-13 Thread maha
That's exactly what I needed to know! Thanks for the thorough explanation, HJ :) I'll try this today without the ssh and see how it goes. Maha On Oct 13, 2010, at 9:56 AM, Harsh J wrote: > Do the 12 hosts have no identity/address known? AFAIK, you need to > install Hadoop to

Re: Question on distributed hadoop setup

2010-10-13 Thread maha
mputer? is it through the hadoop.tmp.dir by including 'snoopy.cs.ucsb.edu' and 'booboo.cs.ucsb.edu' as hosts? master and slave? Thanks, Maha On Oct 12, 2010, at 9:04 PM, Medha Atre wrote: > http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_
