Re: Non utf-8 chars in input

2012-09-11 Thread Joshi, Rekha
Hi Ajay, Try SequenceFileAsBinaryInputFormat ? Thanks Rekha On 11/09/12 11:24 AM, "Ajay Srivastava" wrote: >Hi, > >I am using default inputFormat class for reading input from text files >but the input file has some non utf-8 characters. >I guess that TextInputFormat class is default inputForm

what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube
Hi, What happens when an existing (not new) datanode rejoins a cluster for following scenarios: 1. Some of the blocks it was managing are deleted/modified? 2. The size of the blocks are now modified say from 64MB to 128MB? 3. What if the block replication factor was one (yea

Re: what happens when a datanode rejoins?

2012-09-11 Thread George Datskos
Hi Mehul Some of the blocks it was managing are deleted/modified? The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes. The size of the blocks are now modified say from

Re: what happens when a datanode rejoins?

2012-09-11 Thread George Datskos
Mehul, Let me make an addition. Some of the blocks it was managing are deleted/modified? Blocks that are deleted in the interim will be deleted on the rejoining node as well, after it rejoins. Regarding the "modified," I'd advise against modifying blocks after they have been fully written.

Re: Non utf-8 chars in input

2012-09-11 Thread Joshi, Rekha
Actually even if that works, it does not seem an ideal solution. I think format and encoding are distinct, and enforcing a format must not enforce an encoding. So that means there must be a possibility to pass the encoding as a user choice on construction, e.g. TextInputFormat("your-encoding"). But I do
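The idea can be sketched in plain Java (nothing Hadoop-specific; a custom record reader would do the equivalent when building each value). The class name and the ISO-8859-1 example are assumptions for illustration only:

```java
import java.nio.charset.Charset;

// Sketch: Text is hard-wired to UTF-8, but a record reader could decode
// each line's raw bytes with a caller-supplied charset instead.
public class EncodingAwareDecode {

    public static String decode(byte[] raw, String charsetName) {
        // Decode with the user's chosen encoding, not a hard-coded UTF-8.
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0xE9 is 'é' in ISO-8859-1 but an invalid lone byte in UTF-8,
        // which is exactly how non-UTF-8 input gets mangled by Text.
        byte[] latin1 = new byte[] { (byte) 0xE9 };
        String s = decode(latin1, "ISO-8859-1");
        System.out.println(s.equals("\u00E9")); // true: decoded to U+00E9
    }
}
```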

Re: what happens when a datanode rejoins?

2012-09-11 Thread Harsh J
George has answered most of these. I'll just add on: On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube wrote: > 1. Some of the blocks it was managing are deleted/modified? A DN runs a block report upon start, and sends the list of blocks to the NN. NN validates them and if it finds any files

Re: how to make different mappers execute different processing on same data ?

2012-09-11 Thread Narasingu Ramesh
Hi Jason, What Mehmet said is exactly correct: without reducers we cannot increase performance. You can add mappers and reducers to any data processing and get good output and performance. Thanks & Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 9:31 AM, Mehmet Tepedelenliogl

Re: Non utf-8 chars in input

2012-09-11 Thread Ajay Srivastava
Rekha, I guess the problem is that the Text class uses utf-8 encoding and one cannot set another encoding for this class. I have not seen any other Text-like class which supports other encodings, so I will have to write my own custom input format class. Thanks for your inputs. Regards, Ajay Srivastava

Re: configure hadoop-0.22 fairscheduler

2012-09-11 Thread Jameson Li
Hi Harsh, Thanks for your reply. And I am sorry for my unclear description. As I mentioned previous, I think I configured the fairsheduler correctly in hadoop-0.22.0. But when I commit lots of the jobs: many big jobs (map number and reduce number is bigger than the map/reduce slot) commit fir
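In case it helps cross-check the setup, a minimal MR1 fair scheduler allocations file looks roughly like the sketch below. The pool name and limits are invented examples, not Jameson's actual values:

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Hypothetical pool: guarantee some slots to large jobs and cap
       how many of them run at once, so later submissions still start. -->
  <pool name="bigjobs">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <maxRunningJobs>3</maxRunningJobs>
  </pool>
  <!-- Default per-user cap on concurrently running jobs. -->
  <userMaxJobsDefault>5</userMaxJobsDefault>
</allocations>
```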

what happens when a datanode rejoins?

2012-09-11 Thread mehul choube
Hi, What happens when an existing (not new) datanode rejoins a cluster for following scenarios: a) Some of the blocks it was managing are deleted/modified? b) The size of the blocks are now modified say from 64MB to 128MB? c) What if the block replication factor was one (yea not in most deplo

RE: build failure - trying to build hadoop trunk checkout

2012-09-11 Thread Tony Burton
Changing the hostname to lowercase fixed this particular problem - thanks for your replies. The build is failing elsewhere now, I'll post a new thread for that. Tony From: Tony Burton [mailto:tbur...@sportingindex.com] Sent: 10 September 2012 10:44 To: user@hadoop.apache.org Subject: RE: bui

Re: what happens when a datanode rejoins?

2012-09-11 Thread shashwat shriparv
Yes, the cluster will be rebalanced. On Tue, Sep 11, 2012 at 2:09 PM, mehul choube wrote: > Hi, > > > What happens when an existing (not new) datanode rejoins a cluster for > following scenarios: > > > a) Some of the blocks it was managing are deleted/modified? > > b) The size of the blocks are

Re: what happens when a datanode rejoins?

2012-09-11 Thread Harsh J
Hi Mehul, Please do not send multiple mails with the same questions. We've already answered this at your other post, follow thread at: http://mail-archives.apache.org/mod_mbox/hadoop-user/201209.mbox/%3ce884ec9cd547324b8976a5d37317ac566d11c7f...@apj1xchevspin30.symc.symantec.com%3e On Tue, Sep 11

hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Hi, I've checked out the hadoop trunk, and I'm running "mvn test" on the codebase as part of following the "How To Contribute" guide at http://wiki.apache.org/hadoop/HowToContribute. The tests are currently failing in hadoop-mapreduce-client-jobclient, the failure message is below - something

Re: Can't run PI example on hadoop 0.23.1

2012-09-11 Thread Narasingu Ramesh
Hi Vinod, Please check that the input file location and output file location are not the same. Please first put your input file into HDFS and then run the MR job; it works fine. Thanks & Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 4:23 AM, Vinod Kumar Vavilapalli < vino...@horton

RE: Undeliverable messages

2012-09-11 Thread Tony Burton
Thanks Harsh. By the looks of Stanislav's linkedin profile, he's moved on from Sungard, so Outlook's filtering rules will look after his list bounce messages from now on :) -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: 10 September 2012 12:11 To: user@hadoop.apache

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Narasingu Ramesh
Hi, Please find i think one command is there then only build the all applications. Thanks & Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 2:28 PM, Tony Burton wrote: > Hi, > > ** ** > > I’ve checked out the hadoop trunk, and I’m running “mvn test” on the > codebase as part of fo

RE: what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube
> The namenode will asynchronously replicate the blocks to other datanodes in > order to maintain the replication factor after a datanode has not been in > contact for 10 minutes. What happens when the datanode rejoins after the namenode has already re-replicated the blocks it was managing? Will name

RE: what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube
I apologize for this :( I thought the earlier mail didn't go through -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Tuesday, September 11, 2012 2:31 PM To: user@hadoop.apache.org Subject: Re: what happens when a datanode rejoins? Hi Mehul, Please do not send multiple

Re: what happens when a datanode rejoins?

2012-09-11 Thread Narasingu Ramesh
Hi Mehul, DataNode rejoins take care of only NameNode. Thanks & Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 2:36 PM, Mehul Choube wrote: > > The namenode will asynchronously replicate the blocks to other > datanodes in order to maintain the replication factor after a datanode h

Re: what happens when a datanode rejoins?

2012-09-11 Thread Harsh J
Hi, Inline. On Tue, Sep 11, 2012 at 2:36 PM, Mehul Choube wrote: >> The namenode will asynchronously replicate the blocks to other datanodes >> in order to maintain the replication factor after a datanode has not been in >> contact for 10 minutes. > > What happens when the datanode rejoins after

RE: what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube
>DataNode rejoins take care of only NameNode. Sorry didn't get this From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com] Sent: Tuesday, September 11, 2012 2:38 PM To: user@hadoop.apache.org Subject: Re: what happens when a datanode rejoins? Hi Mehul, DataNode rejoins take care

Re: Undeliverable messages

2012-09-11 Thread Harsh J
Ha, good sleuthing. I just moved it to INFRA, as no one from our side has gotten to this yet. I guess we can only moderate, not administrate. So the ticket now awaits action from INFRA on ejecting it out. On Tue, Sep 11, 2012 at 2:34 PM, Tony Burton wrote: > Thanks Harsh. By the looks of Stanisl

Re: FileSystem.get(Uri,Configuration,String) caching issue

2012-09-11 Thread Himanshu Gupta
Thanks Harsh and Daryn! Daryn, I tried the option you suggested and changed the getFileSystem(String user) implementation as follows:

    private static FileSystem getFileSystem(String user) throws Exception {
        final Configuration conf = new Configuration();
        conf.set("fs.default.

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Hi Ramesh Thanks for the quick reply, but I'm having trouble following your English. Are you saying that there is one command to build everything? If so, can you tell me what it is? Tony From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com] Sent: 11 September 2012 10:06 To: user@hadoop

Re: Undeliverable messages

2012-09-11 Thread Harsh J
And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony! On Tue, Sep 11, 2012 at 2:44 PM, Harsh J wrote: > Ha, good sleuthing. > > I just moved it to INFRA, as no one from our side has gotten to this > yet. I guess we can only moderate, not administrate. So the ticket now >

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Steve Loughran
It's probably some maven thing - in particular Maven's habit of grabbing the online nightly snapshots off apache rather than local. Try mvn clean install -DskipTests -offline to force in all the artifacts, then run the MR tests. Tony - why not get on the mapreduce-dev mailing list, as this is the p

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Thanks Steve, I’ll try the mvn command you suggest. All the snapshots I can see came from repository.apache.org though. How do I run the MR tests only? Thanks for the mapreduce-dev mailing list suggestion, I thought all lists had merged into one though – did I get the wrong end of the stick? T

RE: Undeliverable messages

2012-09-11 Thread Tony Burton
No problem! I'll remove that Outlook filter now... :) -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: 11 September 2012 10:34 To: user@hadoop.apache.org Subject: Re: Undeliverable messages And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony!

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
I tried this: cd into hadoop-mapreduce-project, then ant test, and got further build errors - I'll try mapreduce-dev. From: Tony Burton [mailto:tbur...@sportingindex.com] Sent: 11 September 2012 10:55 To: user@hadoop.apache.org Subject: RE: hadoop trunk build failure - yarn, surefire related? Thanks

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Harsh J
Tony, What I do is: $ cd hadoop/; mvn install -DskipTests; cd hadoop-mapreduce-project/; mvn test This seems to work in running without any missing dependencies at least. The user lists are all merged, but the developer lists remain separate as that works better. On Tue, Sep 11, 2012 at 3:24 P

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Ok - thanks Harsh. Following Steve's earlier advice I tried the mvn install, which worked fine, then ant test in hadoop-mapreduce-project which failed. I was mid-email to mapreduce-dev@hadoop, now I'll try mvn test in hadoop-mapreduce-project and report back. Tony -Original Message-

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Harsh J
Ah, there is no longer any need to run ant on trunk. Ignore the leftover files in the base MR project directory - those should be cleaned up soon. All of the functional pieces of the build right now definitely use Maven. On Tue, Sep 11, 2012 at 4:01 PM, Tony Burton wrote: > Ok - thanks Harsh

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
That's good to know - is there a more up to date guide than http://wiki.apache.org/hadoop/HowToContribute which still makes many references to ant builds? -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: 11 September 2012 11:36 To: user@hadoop.apache.org Subject: Re

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Harsh J
I guess we'll need to clean that guide and divide it in two - For branch-1 maintenance contributors, and for trunk contributors. I had another page that serves a slightly different purpose, but may help you just the same: http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk On Tue, Sep

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Hemanth Yamijala
Also, for the maven based builds, BUILDING.txt in the root folder of hadoop source does get one started. Thanks hemanth On Tue, Sep 11, 2012 at 4:28 PM, Harsh J wrote: > I guess we'll need to clean that guide and divide it in two - For > branch-1 maintenance contributors, and for trunk contribu

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Good suggestions Harsh and Hemanth. When I was asked to submit a patch for hadoop 1.0.3, I thought it a good exercise to work through the build process to become familiar even though the patch is documentation-only. Maybe the requests for patches could come with a list of suggested reading as w

what's the default reducer number?

2012-09-11 Thread Jason Yang
Hi, all I was wondering what's the default number of reducer if I don't set it in configuration? Will it change dynamically according to the output volume of Mapper? -- YANG, Lin

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks
Hi Lin The default value for the number of reducers is 1:

<property>
  <name>mapred.reduce.tasks</name>
  <value>1</value>
</property>

It is not determined by data volume. You need to specify the number of reducers for your mapreduce jobs as per your data volume. Regards Bejoy KS On Tue, Sep 11, 2012 at 4:53 PM, Jason Yang wrote: > Hi, all > > I was

Re: what's the default reducer number?

2012-09-11 Thread Jason Yang
Hi, Bejoy Thanks for your reply. Where could I find the default value of mapred.reduce.tasks? I have checked the core-site.xml, hdfs-site.xml and mapred-site.xml, but I haven't found it. 2012/9/11 Bejoy Ks > Hi Lin > > The default value for number of reducers is 1 > > mapred.reduce.tasks > 1

Re: FW: Doubts Reg

2012-09-11 Thread sudha sadhasivam
Dear Madam, I am attaching relevant screenshots of running Hive. Profile.png shows the contents of /etc/profile; hive_common_lib.png shows hive_common*.jar is already in $HIVE_HOME/lib, where $HIVE_HOME is /home/yahoo/hive/build/dist, as evident from classpath_err.png. Yours Truly G Sud

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks
Hi Lin The default values for all the properties are in core-default.xml hdfs-default.xml and mapred-default.xml Regards Bejoy KS On Tue, Sep 11, 2012 at 5:06 PM, Jason Yang wrote: > Hi, Bejoy > > Thanks for you reply. > > where could I find the default value of mapred.reduce.tasks ? I have >

Re: what's the default reducer number?

2012-09-11 Thread Jagat Singh
Just to add, that name is deprecated in new Hadoop. Try to find mapreduce.job.reduces On Tue, Sep 11, 2012 at 9:43 PM, Bejoy Ks wrote: > Hi Lin > > The default values for all the properties are in > core-default.xml > hdfs-default.xml and > mapred-default.xml > > > Regards > Bejoy KS > > > On Tu

Re: what's the default reducer number?

2012-09-11 Thread Jason Yang
All right, I got it. Thank you very much~ 2012/9/11 Jagat Singh > Just to add the name is depreciated in new Hadoop > Try to find > mapreduce.job.reduces > > > > > On Tue, Sep 11, 2012 at 9:43 PM, Bejoy Ks wrote: > >> Hi Lin >> >> The default values for all the properties are in >> core-defaul

Some general questions about DBInputFormat

2012-09-11 Thread Yaron Gonen
Hi, After reviewing the class's (not very complicated) code, I have some questions I hope someone can answer: - (more general question) Are there many use-cases for using DBInputFormat? Do most Hadoop jobs take their input from files or DBs? - What happens when the database is updated dur

Re: Some general questions about DBInputFormat

2012-09-11 Thread Bejoy KS
Hi Yaron Sqoop uses a similar implementation. You can get some details there. Replies inline • (more general question) Are there many use-cases for using DBInputFormat? Do most Hadoop jobs take their input from files or DBs? > From my small experience, most MR jobs have data in hdfs. It is usefu

Re: Some general questions about DBInputFormat

2012-09-11 Thread Nick Jones
Hi Yaron Replies inline below. On 09/11/2012 07:41 AM, Yaron Gonen wrote: Hi, After reviewing the class's (not very complicated) code, I have some questions I hope someone can answer: * (more general question) Are there many use-cases for using DBInputFormat? Do most Hadoop jobs take t

Question about the task assignment strategy

2012-09-11 Thread Hiroyuki Yamada
Hi, I want to check whether my understanding of task assignment in hadoop is correct. When scanning a file with multiple tasktrackers, I am wondering how a task is assigned to each tasktracker. Is it based on the block sequence or data locality? Let me explain my question by example. The

Re: Some general questions about DBInputFormat

2012-09-11 Thread Yaron Gonen
Thanks for the fast response. Nick, regarding locking a table: as far as I understood from the code, each mapper opens its own connection to the DB. I didn't see any code such that the job creates a transaction and passes it to the mapper. Did I miss something? again, thanks! On Tue, Sep 11, 2012

Re: Question about the task assignment strategy

2012-09-11 Thread Hemanth Yamijala
Hi, Task assignment takes data locality into account first and not block sequence. In hadoop, tasktrackers ask the jobtracker to be assigned tasks. When such a request comes to the jobtracker, it will try to look for an unassigned task which needs data that is close to the tasktracker and will ass
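Hemanth's rule can be modelled in a few lines of plain Java (a toy illustration, not Hadoop source): given the still-unassigned tasks and the hosts holding each task's input block, a requesting tracker is handed a data-local task when one exists, and only falls back to any remaining task otherwise.

```java
import java.util.*;

// Toy model of locality-first assignment: the jobtracker prefers a task
// whose input block lives on the requesting tasktracker's host, and
// falls back to any remaining task (rack-local/off-rack) otherwise.
public class LocalityFirst {

    // taskLocations: task id -> hosts holding that task's input block
    public static String assign(String trackerHost,
                                Map<String, List<String>> taskLocations) {
        for (Map.Entry<String, List<String>> e : taskLocations.entrySet()) {
            if (e.getValue().contains(trackerHost)) {
                return e.getKey(); // data-local task wins over sequence order
            }
        }
        // no local data: hand out any unassigned task
        return taskLocations.isEmpty() ? null
                : taskLocations.keySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, List<String>> tasks = new LinkedHashMap<>();
        tasks.put("task1", Arrays.asList("node01"));
        tasks.put("task2", Arrays.asList("node02"));
        // node02 gets task2 even though task1 comes first in sequence
        System.out.println(assign("node02", tasks)); // task2
    }
}
```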

Issue in access static object in MapReduce

2012-09-11 Thread Stuti Awasthi
Hi, I have a configuration JSON file which is accessed by the MR job for every input. So, I created a class with a static block that loads the JSON file into a static instance variable, so that every time my mapper or reducer wants to access the configuration it can use this instance variable. But on a single node cl

Re: Issue in access static object in MapReduce

2012-09-11 Thread Bejoy Ks
Hi Stuti You can pass the JSON object as a configuration property from your main class, then initialize this static JSON object in the configure() method. Every instance of a map or reduce task will have this configure() method executed once, before the map()/reduce() function. So all the executions

Implementing a grouping comparator with avro

2012-09-11 Thread Frank Kootte
I need to implement secondary sort within an avro based MR sequence. However, I find little to no documentation or examples online. I would like to implement this by overriding the 'int compare(AvroWrapper x, AvroWrapper y)' method but I fail to have it invoked. Does anybody have experience implementi

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton
Another "mvn test" caused the build to fail slightly further down the road. As my Jira issue is documentation-only, I've submitted the patch anyway. Is this multiple-failure scenario typical for trying to build hadoop from the trunk? It's sure putting me off submitting code in future. Is there a

Error in : hadoop fsck /

2012-09-11 Thread yogesh dhari
Hi all, I am running hadoop-0.20.2 on a single node cluster. I run the command hadoop fsck / and it shows this error:

Exception in thread "main" java.net.UnknownHostException: http
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
    at java.net.SocksSocketImpl.connect

Re: Error in : hadoop fsck /

2012-09-11 Thread Hemanth Yamijala
Could you please review your configuration to see if you are pointing to the right namenode address ? (This will be in core-site.xml) Please paste it here so we can look for clues. Thanks hemanth On Tue, Sep 11, 2012 at 9:25 PM, yogesh dhari wrote: > Hi all, > > I am running hadoop-0.20.2 on s

RE: Error in : hadoop fsck /

2012-09-11 Thread yogesh dhari
Hi Hemanth, This is the content of core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop-0.20.2/hadoop_temporary_dirr</value>
  <description>A base for other temporary directories.</description>
</property>

Regards Yogesh Kumar Date: Tue, 11 Sep 2012 21:29:36 +0530 Subject: Re:

Re: Question about the task assignment strategy

2012-09-11 Thread Hiroyuki Yamada
Hi, thank you for the comment. > Task assignment takes data locality into account first and not block sequence. Does it work like that when the replication factor is set to 1? I just ran an experiment to check the behavior. There are 14 nodes (node01 to node14) and there are 14 datanodes and 14 tasktrac

Re: Error in : hadoop fsck /

2012-09-11 Thread Arpit Gupta
Yogesh try this hadoop fsck -Ddfs.http.address=localhost:50070 / 50070 is the default http port that the namenode runs on. The property dfs.http.address should be set in your hdfs-site.xml -- Arpit Gupta Hortonworks Inc. http://hortonworks.com/ On Sep 11, 2012, at 9:03 AM, yogesh dhari wrote

Re: Error in : hadoop fsck /

2012-09-11 Thread Harsh J
Atop what Arpit has said, the format of dfs.http.address is a simple host:port, and should not be http://host:port (which you may have set instead). On Tue, Sep 11, 2012 at 10:14 PM, Arpit Gupta wrote: > Yogesh > > try this > > hadoop fsck -Ddfs.http.address=localhost:50070 / > > 50070 is the def
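For a permanent fix, the same setting can live in hdfs-site.xml; this sketch assumes the stock port from Arpit's reply, and per Harsh's note the value is a bare host:port with no scheme:

```xml
<property>
  <name>dfs.http.address</name>
  <!-- host:port only; a value like http://localhost:50070 would
       reproduce the "UnknownHostException: http" seen above. -->
  <value>localhost:50070</value>
</property>
```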

how to specify the root directory of hadoop on slave node?

2012-09-11 Thread Richard Tang
Hi, All I need to set up a hadoop/hdfs cluster with one namenode on a machine and two datanodes on two other machines. But after setting the datanode machines in the conf/slaves file, running bin/start-dfs.sh cannot start hdfs normally. I am aware that I have not specified the root directory hadoop is inst

security-in-HADOOP

2012-09-11 Thread nisha
How is security maintained in hadoop? Is it maintained by giving folder/file permissions in hadoop? How can I make sure that somebody else doesn't write into my hdfs file system?

removing datanodes from clustes.

2012-09-11 Thread yogesh dhari
Hello all, I am not clear on how to remove a datanode from the cluster. Please explain the decommissioning steps with an example, like how to create the exclude files and the other steps involved. Thanks & regards Yogesh Kumar

Re: security-in-HADOOP

2012-09-11 Thread Bertrand Dechoux
By reading the documentation, like the following http://hadoop.apache.org/docs/r1.0.3/hdfs_permissions_guide.html On Tue, Sep 11, 2012 at 8:14 PM, nisha wrote: > How security is maintained in hadoop, is it maintained by giving > folder/file permissions in hadoop > how can i make sure that somebo

How to remove datanode from cluster..

2012-09-11 Thread yogesh dhari
Hello all, I am not clear on how to remove a datanode from the cluster. Please explain the decommissioning steps with an example, like how to create the exclude files and the other steps involved. Thanks & regards Yogesh Kumar

Re: How to remove datanode from cluster..

2012-09-11 Thread abhishek dodda
hi yogesh, Hope this helps. To remove nodes from the cluster:
1. Add the network addresses of the nodes to be decommissioned to the exclude file. Do not update the include file at this point.
2. Update the namenode with the new set of permitted datanodes, with this command: % hadoop dfsadmin -r
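To sketch the wiring behind step 1 (the path below is a made-up example; any path the namenode can read works): the exclude file is plain text with one hostname per line, and the namenode finds it through dfs.hosts.exclude in hdfs-site.xml:

```xml
<!-- hdfs-site.xml: where the namenode looks for excluded datanodes. -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/opt/hadoop/conf/excludes</value>
</property>
```

After editing the exclude file, hadoop dfsadmin -refreshNodes (step 2 above) makes the namenode re-read it; the nodes then show as decommissioning in the web UI until their blocks are copied off.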

Re: How to remove datanode from cluster..

2012-09-11 Thread Bejoy Ks
Hi Yogesh The detailed steps are available in hadoop wiki on FAQ page http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F Regrads Bejoy KS On Wed, Sep 12, 2012 at 12:14 AM, yogesh dhari wrote: > Hel

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Visioner Sadak
Any hints, experts? At least tell me whether I'm on the right track, or can't we use hftp at all because the browser won't understand it? On Mon, Sep 10, 2012 at 1:58 PM, Visioner Sadak wrote: > or should I use the datanode ip for accessing images using hftp > > ftp://localhost:50075/Comb/java1.jpg"/> > > my datanode is

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Arpit Gupta
your browser does not know what an hftp file system is so it wont work. If you use WebHDFS it has rest api's that you can use to read data from hdfs. I would suggest look at those and try them out. http://hadoop.apache.org/docs/stable/webhdfs.html -- Arpit Gupta Hortonworks Inc. http://hortonwo
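To illustrate the WebHDFS REST shape: file reads are plain HTTP GETs with op=OPEN under the /webhdfs/v1 prefix. A tiny sketch that only builds the URL (the host, port, and image path are assumptions carried over from the thread):

```java
// Sketch: constructing a WebHDFS read URL. A browser or any HTTP client
// can GET this URL directly, unlike an hftp:// address.
public class WebHdfsUrl {

    public static String openUrl(String host, int port, String path) {
        return "http://" + host + ":" + port + "/webhdfs/v1" + path + "?op=OPEN";
    }

    public static void main(String[] args) {
        System.out.println(openUrl("localhost", 50070, "/Comb/java1.jpg"));
        // http://localhost:50070/webhdfs/v1/Comb/java1.jpg?op=OPEN
    }
}
```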

Re: Some general questions about DBInputFormat

2012-09-11 Thread Nick Jones
Hi Yaron, I haven't looked at/used it in a while but I seem to remember that each mapper's SQL request was wrapped in a transaction to prevent the number of rows changing. DBInputFormat uses Connection.TRANSACTION_SERIALIZABLE from java.sql.Connection to prevent changes in the number of rows

Unsubscribe

2012-09-11 Thread Kunaal

RE: Unsubscribe

2012-09-11 Thread jcfolsom
FAIL Original Message Subject: Unsubscribe From: Kunaal Date: Tue, September 11, 2012 8:01 pm To: user@hadoop.apache.org

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Bharath Mundlapudi
You should never expose internal host names in the Javascript/HTML. The flow can be Browser --> Tomcat --(REST, HDFS Client)--> HDFS Your web app can make REST requests to HDFS and you could use JAX-RS impl for REST talk in your web app. I must warn that user experience will suffer by any of th

How to not output the key

2012-09-11 Thread Nataraj Rashmi - rnatar
Hello, I have a simple map/reduce program to merge input files into one big output file. My question is, is there a way not to output the key from the reducer to the output file? I only want the value, not the key, for each record. Thanks

Re: Question about the task assignment strategy

2012-09-11 Thread Hiroyuki Yamada
I figured out the cause. HDFS block size is 128MB, but I specify mapred.min.split.size as 512MB, and data local I/O processing goes wrong for some reason. When I remove the mapred.min.split.size configuration, tasktrackers pick data-local tasks. Why does it happen ? It seems like a bug. Split is a
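Whatever the underlying bug, the arithmetic alone shows why oversized splits hurt locality: with 128 MB blocks and mapred.min.split.size = 512 MB, every split spans four blocks, so at most one of the four can sit on the node running the task. A quick check in plain Java:

```java
// Back-of-envelope check (not Hadoop source): how many HDFS blocks a
// single input split covers, via ceiling division.
public class SplitLocality {

    public static long blocksPerSplit(long splitBytes, long blockBytes) {
        return (splitBytes + blockBytes - 1) / blockBytes; // ceil(split/block)
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 512 MB split over 128 MB blocks: 4 blocks per split, so at
        // best 1 in 4 blocks of the split is local to the task's node.
        System.out.println(blocksPerSplit(512 * mb, 128 * mb)); // 4
    }
}
```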

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Michael Segel
Here's one... Write a Java program which can be accessed on the server side to pull the picture from HDFS and display it on your JSP. On Sep 11, 2012, at 3:48 PM, Visioner Sadak wrote: > any hints experts atleast if i m on the right track or we cant use hftp at > all coz the browser wont u

How to split a sequence file

2012-09-11 Thread Jason Yang
Hi, I have a sequence file written by SequenceFileOutputFormat with key/value type of <Text, BytesWritable>, like below:

Text     BytesWritable
id_A_01  7F2B3C687F2B3C687F2B3C68
id_A_02  2F2B3C687F2B3C687F2B3C686AB23C68D73C68D7
id_A

Re: How to not output the key

2012-09-11 Thread Manoj Babu
Hi, You have to specify the reducer key out type as NullWritable. Cheers! Manoj. On Wed, Sep 12, 2012 at 7:43 AM, Nataraj Rashmi - rnatar < rashmi.nata...@acxiom.com> wrote: > Hello, > > ** ** > > I have simple map/reduce program to merge input files into one big output > files. My quest
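For context on why NullWritable works: TextOutputFormat writes key, then a tab separator, then value, but skips both the key and the separator when the key is null or a NullWritable. A toy model of that line format in plain Java (the class here is illustrative, not Hadoop code):

```java
// Toy model of TextOutputFormat's line layout: a null key stands in for
// NullWritable, and only the value reaches the output line.
public class LineFormat {

    public static String format(String key, String value) {
        return (key == null) ? value : key + "\t" + value;
    }

    public static void main(String[] args) {
        System.out.println(format("k1", "record1")); // k1<TAB>record1
        System.out.println(format(null, "record1")); // record1
    }
}
```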

Re: How to split a sequence file

2012-09-11 Thread Harsh J
Hey Jason, Is the file pre-sorted? You could override the OutputFormat's #getSplits method to return InputSplits at identified key boundaries, as one solution - this would require reading the file up-front (at submit-time) and building the input splits out of it. On Wed, Sep 12, 2012 at 8:45 AM,

Re: how to specify the root directory of hadoop on slave node?

2012-09-11 Thread Hemanth Yamijala
Hi Richard, If you have installed the hadoop software on the same locations on all machines and if you have a common user on all the machines, then there should be no explicit need to specify anything more on the slaves. Can you tell us whether the above two conditions are true ? If yes, some mor

RE: Issue in access static object in MapReduce

2012-09-11 Thread Stuti Awasthi
Thanks Bejoy, I will try to implement it and if I face any issues will let you know. Thanks Stuti From: Bejoy Ks [mailto:bejoy.had...@gmail.com] Sent: Tuesday, September 11, 2012 8:39 PM To: user@hadoop.apache.org Subject: Re: Issue in access static object in MapReduce Hi Stuti You can pass the json obje

RE: removing datanodes from clustes.

2012-09-11 Thread Brahma Reddy Battula
Hi Yogesh.. FYI. Please go through following.. http://tech.zhenhua.info/2011/04/how-to-decommission-nodesblacklist.html http://hadoop-karma.blogspot.in/2011/01/hadoop-cookbook-how-to-decommission.html From: yogesh dhari [yogeshdh...@live.com] Sent: Wednesday,

Re: Issue in access static object in MapReduce

2012-09-11 Thread Kunaal
Have you looked at Terracotta or any other distributed caching system? Kunal -- Sent while mobile -- On Sep 11, 2012, at 9:30 PM, Stuti Awasthi wrote: > Thanks Bejoy, > I try to implement and if face any issues will let you know. > > Thanks > Stuti > > From: Bejoy Ks [mailto:bejoy.had...@gm

Re: How to split a sequence file

2012-09-11 Thread Robert Dyer
If the file is pre-sorted, why not just make multiple sequence files - 1 for each split? Then you don't have to compute InputSplits because the physical files are already split. On Tue, Sep 11, 2012 at 11:00 PM, Harsh J wrote: > Hey Jason, > > Is the file pre-sorted? You could override the Outpu

Re: How to split a sequence file

2012-09-11 Thread Ajay Srivastava
Hi Jason, I am wondering about the use case of distributing records by key to mappers. If possible, could you please share your scenario? Is it a map-only job? Why not distribute records using a partitioner and do the processing in reducers? Regards, Ajay Srivastava On 12-Sep-2012, at

Re: Question about the task assignment strategy

2012-09-11 Thread Hemanth Yamijala
Hi, I tried a similar experiment as yours but couldn't replicate the issue. I generated 64 MB files and added them to my DFS - one file from every machine, with a replication factor of 1, like you did. My block size was 64MB. I verified the blocks were located on the same machine as where I adde

Re: How to split a sequence file

2012-09-11 Thread Jason Yang
hey guys, Thanks for all your suggestions. To wrap up, there are two ways to achieve this: 1. use multiple sequence files, then write a WholeFileInputFormat which uses each file as a split by overriding isSplitable(); 2. distribute records using a partitioner and do the processing in reducers,

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Visioner Sadak
Thanks a ton guys for showing the right direction. I was so wrong with hftp; I will try out WebHDFS. Is an HDFS FUSE mount a good approach? By using that I will have to just mount my existing local java uploads into hdfs, but can I access Har files using this or will I have to create a symlink for ac