Re: Suggestions required for learning Hadoop

2012-09-13 Thread Jay Vyas
/ @marcosluis2186 http://twitter.com/marcosluis2186 ** http://www.uci.cu/ -- Jay Vyas MMSB/UCHC

Re: Pseudo distributed mode : How to increase no of concurrent map task

2012-09-29 Thread Jay Vyas
Hmmm... I always make this mistake on my Hadoop VM -- trying to set parameters which require XML settings in the conf.setInt(...) API at runtime, which sometimes has no effect. How can we know (without having to individually troubleshoot a parameter) which parameters CAN versus CANNOT be set

Re: Lib conflicts

2012-10-03 Thread Jay Vyas
VM. Replacing the 1.4 jar with the 1.7 does seem to fix the problem but this doesn't seem too sane. Hopefully there is a better alternative. Thanks! -- Harsh J -- Jay Vyas MMSB/UCHC

Re: Why they recommend this (CPU) ?

2012-10-11 Thread Jay Vyas
Presumably, if you have a reasonable number of cores - speeding the cores up will be better than forking a task into smaller and smaller chunks - because at some point the overhead of multiple processes would be a bottleneck - maybe due to streaming reads and writes? I'm sure each and every

Re: Suitability of HDFS for live file store

2012-10-15 Thread Jay Vyas
analysis rather than any processing on the files themselves. In other words, what I really want is a distributed, resilient, scalable filesystem. Is Hadoop suitable if we just use this facility, or would I be misusing it and inviting grief? M -- Harsh J -- Jay Vyas MMSB/UCHC

Re: Hadoop counter

2012-10-19 Thread Jay Vyas
and Reducers in a job? - What are the performance implications and best practices of using Hadoop counters? I am not sure whether using Hadoop counters too heavily will degrade the performance of the whole job. regards, Lin -- Bertrand Dechoux -- Jay Vyas http://jayunit100

Re: Subscription to the mailing list

2012-10-22 Thread Jay Vyas
as MapReduce coding. I would like to subscribe to the mailing list. Is it possible that I get mails only when a response is provided to the queries I post? Thanks and regards ! -- Jay Vyas http://jayunit100.blogspot.com

Re: Interview questions..

2012-10-26 Thread Jay Vyas
Hive: Know SQL internals - how joins work, data structures and disk algorithms, etc. And how those would be implemented in MapReduce. Know what a projection, aggregation, etc. is. Hadoop: Know how terasort works, know how word count works, and know why Java serialization is non-ideal.

Re: backup of hdfs data

2012-11-05 Thread Jay Vyas
Amazon has a really cheap, large-scale backup solution called Glacier, which is good if you're just backing up for the sake of archival in emergencies. If you need the archive to be performant, then you might want to just consider a higher replication factor.

Re: Hadoop processing

2012-11-08 Thread Jay Vyas
-- Jay Vyas http://jayunit100.blogspot.com

Re: hadoop - running examples

2012-11-08 Thread Jay Vyas
What do you mean by immutable? Do you mean non-modifiable, maybe? Immutable implies that they can't be deleted. Jay Vyas MMSB UCHC On Nov 8, 2012, at 5:28 PM, Mohammad Tariq donta...@gmail.com wrote: Files are immutable, once written into HDFS. And touchz creates a file of 0 length

Re: Can dynamic Classes be accessible to Mappers/Reducers?

2012-11-13 Thread Jay Vyas
Wow, that's an awesome trick! Okay, thanks. Jay Vyas MMSB UCHC On Nov 13, 2012, at 3:56 AM, Bertrand Dechoux decho...@gmail.com wrote: You should look at the job conf file. You will see that indeed the classes for the mapper and reducer are explicitly written. So if you generate the class

Re: mapred.tasktracker.map.tasks.maximum

2012-11-13 Thread Jay Vyas
Hmmm. What do you mean by wrong configuration file? How could that ever happen? Jay Vyas On Nov 13, 2012, at 10:25 AM, Mark Kerzner mark.kerz...@shmsoft.com wrote: Exactly! I found the right one, and it is 80. Thank you, Mark On Tue, Nov 13, 2012 at 10:23 AM, Serge Blazhiyevskyy

Optimizing Disk I/O - does HDFS do anything ?

2012-11-13 Thread Jay Vyas
cluster. Maybe, for example, data nodes could log the amount of time spent on I/O for certain files as a way of reporting whether or not defragmentation needed to be run on a particular node in a cluster. -- Jay Vyas http://jayunit100.blogspot.com

Re: Input splits for sequence file input

2012-12-02 Thread Jay Vyas
This question is fundamentally flawed: it assumes that a mapper will ask for anything. The mapper class's run method reads from a record reader. The question you really should ask is: How does a RecordReader read records across block boundaries? Jay Vyas http://jayunit100.blogspot.com
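One common answer to that question, for line-oriented input, can be sketched without any Hadoop classes. The sketch below is an assumption-laden illustration, not Hadoop's actual reader: the class name `SplitLineReaderSketch` and its methods are hypothetical, and the data lives in a byte array instead of a file system stream. It shows the usual convention: a split that does not start at byte 0 discards its first line, and every split keeps reading until it has finished the line that begins at or before its end offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, JDK-only illustration of reading lines across split boundaries.
class SplitLineReaderSketch {

    /** Returns the lines owned by the split [start, start + length) of data. */
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = start + length;
        int pos = start;
        // A split that does not begin at byte 0 skips up to and including the
        // first newline: that line is read by the previous split.
        if (start != 0) {
            pos = nextLineStart(data, pos);
        }
        List<String> lines = new ArrayList<>();
        // Read every line that *starts* at or before the split end, even if
        // its bytes run past the boundary into the next block.
        while (pos <= end && pos < data.length) {
            int next = nextLineStart(data, pos);
            int lineEnd = (next > pos && data[next - 1] == '\n') ? next - 1 : next;
            lines.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return lines;
    }

    /** Index just past the next '\n' at or after from, or data.length if none. */
    static int nextLineStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\ndelta\n".getBytes(StandardCharsets.UTF_8);
        // Three 9-byte "blocks"; the boundaries fall in the middle of lines.
        for (int start = 0; start < data.length; start += 9) {
            int length = Math.min(9, data.length - start);
            System.out.println(start + ": " + readSplit(data, start, length));
        }
    }
}
```

With these rules each line is produced by exactly one split, whichever split contains the byte just before the line starts, so no coordination between readers is needed.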

Re: DFS and the RecordReader

2012-12-06 Thread Jay Vyas
Hmm... so when a record reader calls fs.open(...), I guess I'm looking for an example of how the input stream is created...?

Re: What is the preferred way to pass a small number of configuration parameters to a mapper or reducer

2012-12-28 Thread Jay Vyas
...@gmail.com wrote: Yes. another big data, data scientist, no ops, devops, cloud computing specialist is born. Thank goodness we have multiple choice tests to identify the best coders and administrators. -- Jay Vyas http://jayunit100.blogspot.com

Re: FileSystem.workingDir vs mapred.local.dir

2013-01-15 Thread Jay Vyas
of the FileSystem class simply seems to indicate the working directory for a given filesystem as set by applications. They don't seem very related per se, unless I am missing something ? Thanks Hemanth On Tue, Jan 15, 2013 at 2:54 AM, Jay Vyas jayunit...@gmail.com wrote: Hi guys: What

Re: FileSystem.workingDir vs mapred.local.dir

2013-01-15 Thread Jay Vyas
Ah, okay. So in default Hadoop DFS, the workingDir is (I believe) /user/hadoop/, because as I recall, when putting a file into HDFS, that seems to be where the files naturally end up if there is no path specified.

Re: Configuration object not loading parameters in unit tests

2013-01-16 Thread Jay Vyas
Is there a way to have the Configuration report whether or not it was able to find the default configuration resources? At the moment it looks like it simply prints out all the resources it *wants* to find, but it doesn't actually report the files which it *did* find on the classpath.

Re: core-site.xml file is being ignored by new Configuration()

2013-01-16 Thread Jay Vyas
()? Thanks, Tom On Wed, Jan 16, 2013 at 4:33 PM, Jay Vyas jayunit...@gmail.com wrote: Hi guys: I've finally extracted my problem of loading a special filesystem into a unit test. Below, clearly, I'm creating a raw configuration and adding a single resource to it (core-site.xml

Re: core-site.xml file is being ignored by new Configuration()

2013-01-17 Thread Jay Vyas
...@ezako.com wrote: conf.addResource(file.getAbsoluteFile().toURI().toURL()); -- Jay Vyas http://jayunit100.blogspot.com
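The one-liner quoted above turns a `java.io.File` into a URL before handing it to the configuration. As far as I know, Hadoop's `Configuration.addResource(String)` treats its argument as a classpath resource name, while the `URL` overload points at one specific location, which would explain why the conversion matters here. The sketch below covers only the JDK part of that call; the class name `ResourceUrlSketch` and the temporary file are hypothetical, and the Hadoop call appears only as a comment because it needs Hadoop on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical, JDK-only illustration of the File -> URL conversion.
class ResourceUrlSketch {

    /** Converts a relative or absolute file into an absolute file: URL. */
    static URL toResourceUrl(File file) throws IOException {
        return file.getAbsoluteFile().toURI().toURL();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real core-site.xml; the content is a placeholder.
        File file = File.createTempFile("core-site", ".xml");
        file.deleteOnExit();
        Files.write(file.toPath(), "<configuration/>".getBytes(StandardCharsets.UTF_8));

        // Checking exists() first rules out a wrong path before blaming the loader.
        System.out.println("exists: " + file.exists());
        URL url = toResourceUrl(file);
        System.out.println("protocol: " + url.getProtocol());

        // With Hadoop available, the quoted call would then be:
        //   conf.addResource(url);
    }
}
```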

Re: core-site.xml file is being ignored by new Configuration()

2013-01-17 Thread Jay Vyas
). Julien 2013/1/17 Jay Vyas jayunit...@gmail.com Good catch with that string.length() - you're right, that was a silly mistake. --- sorry - im not sure what i was thinking. it was a late night :) In any case, the same code with file.exists() fails... i've validated that path many ways

Re: Program trying to read from local instead of hdfs

2013-01-17 Thread Jay Vyas
local location.. it is pointing to hdfs.. What am I doing wrong? For reference, I am trying to run this code: http://kickstarthadoop.blogspot.com/2011/09/joins-with-plain-map-reduce.html Thanks -- Jay Vyas http://jayunit100.blogspot.com

Re: Sorting huge text files in Hadoop

2013-02-15 Thread Jay Vyas
I don't think you can do an embarrassingly parallel sort of a randomly ordered file without merging results. However, if you know that the file is pseudo-ordered: 1123 1232 1000 19991019 20200222 30111 3000 Then you can (maybe) sort the individual blocks in mappers using
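The point about merging can be made concrete with a small, Hadoop-free sketch: each chunk can be sorted independently (the part that parallelizes), but a final merge step is still needed to get one globally ordered result. The class name `ChunkSortMergeSketch` and the chunk size are assumptions for illustration; the sample values are the ones quoted in the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical, JDK-only illustration of "sort chunks, then merge".
class ChunkSortMergeSketch {

    /** Sorts fixed-size chunks independently, then k-way merges them. */
    static long[] sortByChunks(long[] input, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> chunks = new ArrayList<>();
        for (int from = 0; from < input.length; from += chunkSize) {
            int to = Math.min(from + chunkSize, input.length);
            long[] chunk = Arrays.copyOfRange(input, from, to);
            Arrays.sort(chunk); // independent per chunk, so it could run in parallel
            chunks.add(chunk);
        }
        return merge(chunks);
    }

    /** K-way merge of already sorted chunks using a min-heap of cursors. */
    static long[] merge(List<long[]> chunks) {
        // Each heap entry is {chunkIndex, offsetInChunk}, ordered by the value it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> Long.compare(chunks.get(a[0])[a[1]], chunks.get(b[0])[b[1]]));
        int total = 0;
        for (int c = 0; c < chunks.size(); c++) {
            total += chunks.get(c).length;
            if (chunks.get(c).length > 0) {
                heap.add(new int[] {c, 0});
            }
        }
        long[] out = new long[total];
        int n = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            long[] chunk = chunks.get(head[0]);
            out[n++] = chunk[head[1]];
            if (head[1] + 1 < chunk.length) {
                heap.add(new int[] {head[0], head[1] + 1});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] sample = {1123, 1232, 1000, 19991019, 20200222, 30111, 3000};
        System.out.println(Arrays.toString(sortByChunks(sample, 3)));
    }
}
```

If the input were known to be ordered between chunks (every value in chunk i no larger than every value in chunk i+1), the merge would reduce to plain concatenation; the sample above does not satisfy that, which is why the heap is still needed.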

Re: Sorting huge text files in Hadoop

2013-02-15 Thread Jay Vyas
well.. ok... i guess you could have a 1TB block do an in place sort on the file, write it to a tmp directory, and then spill the records in order or something. at that point might as well not use hadoop.

Re: Sorting huge text files in Hadoop

2013-02-15 Thread Jay Vyas
michael_se...@hotmail.com wrote: Why do you need a 1TB block? On Feb 15, 2013, at 1:29 PM, Jay Vyas jayunit...@gmail.com wrote: well.. ok... i guess you could have a 1TB block do an in place sort on the file, write it to a tmp directory, and then spill the records in order or something

Re: How do _you_ document your hadoop jobs?

2013-02-25 Thread Jay Vyas
Wow, that's very heavyweight and difficult to modify. Why not graphviz or generating the diagrams from some Or text format? On Feb 25, 2013, at 4:11 AM, David Parks davidpark...@yahoo.com wrote: We’ve taken to documenting our Hadoop jobs in a simple visual manner using PPT (attached

cannot find /usr/lib/hadoop/mapred/

2013-03-06 Thread Jay Vyas
:1522) at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3821) -- Jay Vyas http://jayunit100.blogspot.com

Re: cannot find /usr/lib/hadoop/mapred/

2013-03-06 Thread Jay Vyas
message and, also, maybe I shouldn't be using /usr/lib/hadoop/mapred as an actively written directory. On Wed, Mar 6, 2013 at 11:21 AM, Jay Vyas jayunit...@gmail.com wrote: Hi guys: I'm getting an odd error involving a file called toBeDeleted. I've never seen this - somehow it's blocking my task

MVN repository for hadoop trunk

2013-04-06 Thread Jay Vyas
Hi guys: Is there a mvn repo for hadoop's 3.0.0 trunk build? Clearly the hadoop pom.xml allows us to build hadoop from scratch and installs it as 3.0.0-SNAPSHOT -- but it's not clear whether there is a published version of this snapshot jar somewhere. -- Jay Vyas http://jayunit100.blogspot.com

Re: MVN repository for hadoop trunk

2013-04-06 Thread Jay Vyas
: https://repository.apache.org/content/groups/snapshots -Giri On Sat, Apr 6, 2013 at 2:00 PM, Harsh J ha...@cloudera.com wrote: I don't think we publish nightly or rolling jars anywhere on maven central from trunk builds. On Sun, Apr 7, 2013 at 2:17 AM, Jay Vyas jayunit

Re: Distributed cache: how big is too big?

2013-04-09 Thread Jay Vyas
Hmmm... maybe I'm missing something, but (@bjorn) why would you use HDFS as a replacement for the distributed cache? After all - the distributed cache is just a file with replication over the whole cluster, which isn't in HDFS. Can't you just make the cache size big and store the file there?

The Job.xml file

2013-04-09 Thread Jay Vyas
into individual tasks So, my (related) questions are: Is there a way to start a job directly from a job.xml file? What components depend on and read the job.xml file? Where is the job.xml defined/documented (if anywhere)? -- Jay Vyas http://jayunit100.blogspot.com

Re: No bulid.xml when to build FUSE

2013-04-10 Thread Jay Vyas
the wrong codes, or are there any other ways to build fuse-dfs? Please guide me. Thanks, regards -- Jay Vyas http://jayunit100.blogspot.com

Re: Copy Vs DistCP

2013-04-10 Thread Jay Vyas
/some_location/file /new_location/ Thanks, your responses are appreciated. -- Kay -- Jay Vyas http://jayunit100.blogspot.com

Re: Writing intermediate key,value pairs to file and read it again

2013-04-20 Thread Jay Vyas
How many intermediate keys? If small enough, you can keep them in memory. If large, you can just wait for the job to finish and siphon them into your job as input with the MultipleInputs API. On Apr 20, 2013, at 10:43 AM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hello, Can anyone help

Re: Append MR output file to an exitsted HDFS file

2013-04-21 Thread Jay Vyas
OutputFormat is going to have to find the corresponding file to append. On Sun, Apr 21, 2013 at 10:54 PM, YouPeng Yang yypvsxf19870...@gmail.com wrote: Hi All Can I append a MR output file to an existing file on HDFS? I'm using CDH4.1.2 vs MRv2 Regards -- Jay Vyas http

Re: Maven dependency

2013-04-24 Thread Jay Vyas
reduce dependencies and I am not sure which to pick. Are there other dependencies needed (such as JobConf)? What are the imports needed? During the construction of the configuration what heuristics are used to find the configuration for the Hadoop cluster? Thank you. -- Jay Vyas

partition as block?

2013-04-30 Thread Jay Vyas
traffic... but maybe (1) or (2) will be a precise way to use partitions as a poor man's block. Just a thought - not sure if anyone has tried (1) or (2) before in order to simulate blocks and increase locality by utilizing the partition API. -- Jay Vyas http://jayunit100.blogspot.com

Re: partition as block?

2013-04-30 Thread Jay Vyas
/ cloudfront.blogspot.com On Wed, May 1, 2013 at 12:16 AM, Jay Vyas jayunit...@gmail.com wrote: Hi guys: I'm wondering - if I'm running mapreduce jobs on a cluster with large block sizes - can I increase performance with either: 1) A custom FileInputFormat 2) A custom partitioner 3) -DnumReducers Clearly

Re: partition as block?

2013-04-30 Thread Jay Vyas
it be a bottleneck from 'disk' point of view??Are you not going away from the distributed paradigm?? Am I taking it in the correct way. Please correct me if I am getting it wrong. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Wed, May 1, 2013 at 12:34 AM, Jay Vyas jayunit

Re: partition as block?

2013-04-30 Thread Jay Vyas
it considerably high will definitely give you some boost. But it'll require a high level tinkering. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Wed, May 1, 2013 at 1:29 AM, Jay Vyas jayunit...@gmail.com wrote: Yes it is a problem at the first stage. What I'm wondering

Re: Configuring SSH - is it required? for a psedo distriburted mode?

2013-05-16 Thread Jay Vyas
to implement on a single node: cat ~/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys On Thu, May 16, 2013 at 11:31 AM, Jay Vyas jayunit...@gmail.com wrote: Yes it is required -- in pseudo-distributed mode the jobtracker is not necessarily aware that the task trackers / data nodes are on the same

Re: understanding souce code structure

2013-05-27 Thread Jay Vyas
Hi! a few weeks ago I had the same question... Tried a first iteration at documenting this by going through the classes starting with key/value pairs in the blog post below. http://jayunit100.blogspot.com/2013/04/the-kv-pair-salmon-run-in-mapreduce-hdfs.html Note it's not perfect yet but I

Re: What else can be built on top of YARN.

2013-05-30 Thread Jay Vyas
efficiently using MRv2 jobs. thanks, Rahul -- Jay Vyas http://jayunit100.blogspot.com

Re: Install hadoop on multiple VMs in 1 laptop like a cluster

2013-05-31 Thread Jay Vyas
...@yahoo.in wrote: Just wondering if anyone has any documentation or references to any articles how to simulate a multi node cluster setup in 1 laptop with hadoop running on multiple ubuntu VMs. any help is appreciated. Thanks Sai -- Jay Vyas http://jayunit100.blogspot.com

Re: HDFS interfaces

2013-06-04 Thread Jay Vyas
the data is located. * What are these interfaces and where they are in the source code? Is there any manual for the interfaces? Regards, Mahmood -- Jay Vyas http://jayunit100.blogspot.com

starting Hadoop, the new way

2013-07-05 Thread Jay Vyas
is and if I need to set any particular env variables when doing so. -- Jay Vyas http://jayunit100.blogspot.com

Data node EPERM not permitted.

2013-07-06 Thread Jay Vyas
(DataNode.java:1575) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1598) at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1751) at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1772) -- Jay Vyas http

Re: CompositeInputFormat

2013-07-11 Thread Jay Vyas
someone help me out? It would be much appreciated. Thanks in advance, Andrew -- Jay Vyas http://jayunit100.blogspot.com

Staging directory ENOTDIR error.

2013-07-11 Thread Jay Vyas
) at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116) -- Jay Vyas http://jayunit100.blogspot.com

Re: Staging directory ENOTDIR error.

2013-07-12 Thread Jay Vyas
” configuration in client with the HDFS. Thanks Devaraj k From: Jay Vyas [mailto:jayunit...@gmail.com] Sent: 12 July 2013 04:12 To: common-u...@hadoop.apache.org Subject: Staging directory ENOTDIR error. Hi, I'm getting an ungoogleable

Re: solr -Reg

2013-07-28 Thread Jay Vyas
True that it deserves some posting on solr, but I think it's still partially relevant... The SolrInputFormat and SolrOutputFormat handle this for you and will be used in your map reduce jobs. They will output one core per reducer, where each reducer corresponds to a core. This is

Re: Why LineRecordWriter.write(..) is synchronized

2013-08-08 Thread Jay Vyas
Then is this a bug? Synchronization in the absence of any race condition is normally considered bad. In any case I'd like to know why this writer is synchronized whereas the other ones are not. That is, I think, the point at issue: either other writers should be synchronized or else this one

Re: e-Science app on Hadoop

2013-08-16 Thread Jay Vyas
an e-Science application to run on Hadoop? Thanks. Felipe -- *-- -- Felipe Oliveira Gutierrez -- felipe.o.gutier...@gmail.com -- https://sites.google.com/site/lipe82/Home/diaadia* -- Jay Vyas http://jayunit100.blogspot.com

examples of HADOOP REST API

2013-08-20 Thread Jay Vyas
between this and the Ambari REST services, but not sure where to start digging. I want to run some REST calls at the end of some jobs to query how many tasks failed, etc... Hopefully, I could get this in JSON rather than scraping HTML. Thanks! -- Jay Vyas http://jayunit100.blogspot.com

Re: hadoop cares about /etc/hosts ?

2013-09-09 Thread Jay Vyas
-- Jay Vyas http://jayunit100.blogspot.com

Re: Concatenate multiple sequence files into 1 big sequence file

2013-09-10 Thread Jay Vyas
IIRC, sequence files can be concatenated as is and read as one large file, but maybe I'm forgetting something.

Re: Extending DFSInputStream class

2013-09-26 Thread Jay Vyas
inaccessible for developers, or am I missing something? regards tmp -- Jay Vyas http://jayunit100.blogspot.com

Re: Extending DFSInputStream class

2013-09-26 Thread Jay Vyas
The way we have gotten around this in the past is extending and then copying the private code and creating a brand new implementation. On Thu, Sep 26, 2013 at 10:50 AM, Jay Vyas jayunit...@gmail.com wrote: This is actually somewhat common in some of the hadoop core classes: Private

Re: Retrieve and compute input splits

2013-09-27 Thread Jay Vyas
the input splits. Any help please. Thanks Sai -- Jay Vyas http://jayunit100.blogspot.com

Re: Uploading a file to HDFS

2013-10-01 Thread Jay Vyas
-- Jay Vyas http

YARN And NTP

2013-10-24 Thread Jay Vyas
, but depending on underlying filesystem the semantics of this last modified time might vary. Any thoughts on this? -- Jay Vyas http://jayunit100.blogspot.com

Re: how to stream the video from hdfs

2013-11-13 Thread Jay Vyas
I believe there is a FUSE mount for HDFS which will allow you to open files normally in your streaming app rather than requiring the Java API. Also consider that for media and highly available binary data for a front end, I would guess that HDFS might be overkill because of the

Hadoop Test libraries: Where did they go ?

2013-11-21 Thread Jay Vyas
is packaged into a jar anymore... but i fear it is not. -- Jay Vyas http://jayunit100.blogspot.com

Re: how to prevent JAVA HEAP OOM happen in shuffle process in a MR job?

2013-12-02 Thread Jay Vyas
version is really important here... - If 1.x, then where (NN, JT, TT?) - If 2.x, then where (AM, NM, ...?) -- probably less likely here, since the resources are ephemeral. I know that some older 1.x versions had an issue with the jobtracker having an ever-expanding hashmap or something like

Re: multiusers in hadoop through LDAP

2013-12-10 Thread Jay Vyas
OS? So that if a user is authenticated by the LDAP ,who will also access the HDFS directory? Regards -- Jay Vyas http://jayunit100.blogspot.com

Pluggable distribute cache impl

2013-12-15 Thread Jay Vyas
Are there any ways to plug in an alternate distributed cache implementation (i.e. when nodes of a cluster already have an NFS mount or other local data service...)?

Re: Debug Hadoop Junit Test in Eclipse

2013-12-16 Thread Jay Vyas
-- Jay Vyas http://jayunit100

Re: Debug Hadoop Junit Test in Eclipse

2013-12-16 Thread Jay Vyas
it uploads files. So I am only looking to trace fs commands through the DFS shell. I believe this should require less work in debugging than actually going to mapred VMs! -- Best Regards, Karim Ahmed Awara On Mon, Dec 16, 2013 at 5:57 PM, Jay Vyas jayunit...@gmail.com wrote: Excellent

Re: Ways to manage user accounts on hadoop cluster when using kerberos security

2014-01-08 Thread Jay Vyas
I recently found a pretty simple and easy way to set LDAP up for my machines on RHEL and wrote it up using jumpbox and authconfig. If you are in the cloud and only need a quick easy ldap idh and nsswitch setup, this is I think the easiest / cheapest way to do it. I know RHEL and Fedora come

Re: What is the difference between Hdfs and DistributedFileSystem?

2014-01-13 Thread Jay Vyas
印 liyin.lian...@aliyun-inc.com wrote: What is the difference between Hdfs.java and DistributedFileSystem.java in Hadoop2? Best Regards, Liyin Liang Tel: 78233 Email: liyin.lian...@alibaba-inc.com -- Jay Vyas http://jayunit100.blogspot.com

Re: Shutdown hook for FileSystems

2014-01-21 Thread Jay Vyas
What is happening when you remove the shutdown hook? Is that supposed to trigger an exception?

Strange rpc exception in Yarn

2014-01-27 Thread Jay Vyas
Hi folks: At the **end** of a successful job, I'm getting some strange stack traces when using Pig; however, it doesn't seem to be Pig-specific from the stack trace. Rather, it appears that the job client is attempting to do something funny. Anyone ever see this sort of exception in

Re: Hadoop-2.2.0 and Pig-0.12.0 - error IBM_JAVA

2014-01-28 Thread Jay Vyas
issue with hadoop and pig. I'm using Java version - *1.6.0_31* Please help me out. -- Regards, Viswa.J -- Jay Vyas http://jayunit100.blogspot.com

performance of hadoop fs -put

2014-01-28 Thread Jay Vyas
into this issue. ** Is hadoop fs -put inherently slower than a unix cp action, regardless of filesystem -- and if so, why? ** -- Jay Vyas http://jayunit100.blogspot.com

Re: performance of hadoop fs -put

2014-01-29 Thread Jay Vyas
No, I'm using a glob pattern; it's all done in one put statement On Tue, Jan 28, 2014 at 9:22 PM, Harsh J ha...@cloudera.com wrote: Are you calling one command per file? That's bound to be slow as it invokes a new JVM each time. On Jan 29, 2014 7:15 AM, Jay Vyas jayunit...@gmail.com wrote

Re: DistributedCache deprecated

2014-01-29 Thread Jay Vyas
) Add a file to be localized and it works fine. The same way you were using DC before.. Well I am not sure what would be the best answer, but if you are trying to use DC , I was able to do it with Job class itself. Regards Prav On Wed, Jan 29, 2014 at 9:27 PM, Jay Vyas jayunit

Re: Passing data from Client to AM

2014-01-29 Thread Jay Vyas
-- Jay Vyas http://jayunit100.blogspot.com

YARN FSDownload: How did Mr1 do it ?

2014-02-11 Thread Jay Vyas
I'm noticing that resource localization is much more complex in YARN than in MR1; in particular, the timestamps need to be identical, or else an exception is thrown. I never saw that in MR1. How did MR1 JobTrackers handle resource localization differently than MR2 App Masters? -- Jay Vyas http

Re: Test hadoop code on the cloud

2014-02-12 Thread Jay Vyas
is the simplest way to do this on the cloud? Is there any way to do it for free? Thanks in advance -- Jay Vyas http://jayunit100.blogspot.com

How to ascertain why LinuxContainer dies?

2014-02-13 Thread Jay Vyas
) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) where can i find the root cause of the non-zero exit code ? -- Jay Vyas http://jayunit100

Re: How to ascertain why LinuxContainer dies?

2014-02-14 Thread Jay Vyas
that you can check under the container's work directory after it fails? On Fri, Feb 14, 2014 at 9:46 AM, Jay Vyas jayunit...@gmail.com wrote: I have a linux container that dies. The nodemanager logs only say: WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception

Re: How to ascertain why LinuxContainer dies?

2014-02-14 Thread Jay Vyas
Feb 4 17:27 .. drwx--x--- 2 htf htf 4096 Feb 4 17:27 . -rw-rw-r-- 1 htf htf 50471 Feb 4 17:31 syslog Regards ./g -Original Message- From: Jay Vyas [mailto:jayunit...@gmail.com] Sent: Friday, February 14, 2014 7:02 AM To: user@hadoop.apache.org Cc: user

YARN job exits fast without failure, but does nothing

2014-02-17 Thread Jay Vyas
-- Jay Vyas http://jayunit100.blogspot.com

Re: What if file format is dependent upon first few lines?

2014-02-27 Thread Jay Vyas
is a must to parse each log line. It means the log file could NOT be simply split, otherwise the second split would lose the file format information. How could each mapper get the first few lines in the file? -- Harsh J -- Jay Vyas http://jayunit100.blogspot.com

Re: Doubt

2014-03-19 Thread Jay Vyas
of few things, but as far as installation is concerned, it should be easily doable. Regards Prav On Wed, Mar 19, 2014 at 3:41 PM, sri harsha rsharsh...@gmail.com wrote: Hi all, is it possible to install Mongodb on the same VM which consists hadoop? -- amiable harsha -- Jay Vyas http

Jobs fail immediately in local mode ?

2014-03-29 Thread Jay Vyas
. -- Jay Vyas http://jayunit100.blogspot.com

Re: Hadoop Serialization mechanisms

2014-03-31 Thread Jay Vyas
see a gain in using a more efficient data serialisation format for data files. On Sun, Mar 30, 2014 at 9:09 PM, Jay Vyas jayunit...@gmail.com wrote: Those are all great questions, and mostly difficult to answer. I haven't played with serialization APIs in some time, but let me try to give some

Re: MapReduce for complex key/value pairs?

2014-04-08 Thread Jay Vyas
using Java, btw. Thank you, Natalia Connolly -- Harsh J -- Jay Vyas http://jayunit100.blogspot.com

Re: Shuffle Error after enabling Kerberos authentication

2014-04-19 Thread Jay Vyas
works fine if Kerberos authentication is disabled. Any idea what the problem could be? Thanks, Terance. -- Jay Vyas http://jayunit100.blogspot.com

Re: Strange error in Hadoop 2.2.0: FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/

2014-04-22 Thread Jay Vyas
of? Perhaps some permissions issues? Thank you, Natalia -- Jay Vyas http://jayunit100.blogspot.com

Yarn hangs @Scheduled

2014-04-24 Thread Jay Vyas
attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_01 State change from SUBMITTED to SCHEDULED -- Jay Vyas http://jayunit100.blogspot.com

Re: Yarn hangs @Scheduled

2014-04-24 Thread Jay Vyas
appattempt_1398370674313_0004_01 to scheduler from user: yarn 14/04/24 16:20:33 INFO attempt.RMAppAttemptImpl: appattempt_1398370674313_0004_01 State change from SUBMITTED to SCHEDULED -- Jay Vyas http://jayunit100.blogspot.com

Re: No job can run in YARN (Hadoop-2.2)

2014-05-11 Thread Jay Vyas
Sounds odd... So (1) you got a FileNotFoundException and (2) you fixed it by commenting out memory-specific config parameters? Not sure how that would work... Any other details or am I missing something else? On May 11, 2014, at 4:16 AM, Tao Xiao xiaotao.cs@gmail.com wrote: I'm sure

Re: Hadoop with SAN

2014-06-15 Thread Jay Vyas
You can either use SAN to back your datanodes, or implement a custom FileSystem over your SAN storage. Either would have different drawbacks depending on your requirements.

Re: Hadoop virtual machine

2014-07-06 Thread jay vyas
helpful material are appreciated. Manar, -- -- André Kelpe an...@concurrentinc.com http://concurrentinc.com -- jay vyas

Re: clarification on HBASE functionality

2014-07-15 Thread Jay Vyas
HBase is not hardcoded to HDFS: it works on any file system that implements the file system interface; we've run it on glusterfs, for example. I assume some have also run it on s3 and other alternative file systems. ** However ** For best performance, direct block I/O hooks on HDFS can boost

Re: Bench-marking Hadoop Performance

2014-07-22 Thread jay vyas
options? Also, I have searched safari books online including rough cuts, but not seeing books for the 2.4 release. If you know of a book for this release, please share. Thank you. -- jay vyas
