e just javadocs. I'm talking about the full documentation
> (see original post).
>
>
> On Tue, Jul 29, 2014 at 2:17 PM, Harsh J wrote:
>
>> Precompiled docs are available in the archived tarballs of these
>> releases, which you can find on:
>> https://archive
1, 2.2.0, 2.4.1, 0.23.11
--
Harsh J
roup.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to cdh-user+unsubscr...@cloudera.org.
> For more options, visit https://groups.google.com/a/cloudera.org/d/optout.
--
Harsh J
er how can i do this?
Remove the configuration override, and it will always go back to the
default FIFO based scheduler, the same whose source has been linked
above.
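For reference, a minimal check of that fallback, assuming the Hadoop 1.x property name (mapred.jobtracker.taskScheduler) is the override in question; it prints the scheduler class the JobTracker will use once the override is gone:

import org.apache.hadoop.mapred.JobConf;

public class SchedulerCheck {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // With no override left in mapred-site.xml, this falls back to the FIFO scheduler.
    System.out.println(conf.get("mapred.jobtracker.taskScheduler",
        "org.apache.hadoop.mapred.JobQueueTaskScheduler"));
  }
}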
> I have been struggling for 4 months to get help on Apache Hadoop??
Are you unsure about this?
--
Harsh J
roup:drwxrwx---
>
> to me, it seems that i have already disabled permission checking, so i
> shouldn't get that AccessControlException.
>
> any ideas?
--
Harsh J
049.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
--
Harsh J
adoop-0.23.9 /opt
> 3. ln -s /opt/hadoop-0.23.9 /opt/hadoop
> 4. export HADOOP_HOME=/opt/hadoop
> 5. export JAVA_HOME=/opt/java
> 6. export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${PATH}
>
> any help is appreciated.
--
Harsh J
icStore*" related problem in Eclipse on Mac. I want to run this code
> from eclipse. Please let me know if anyone knows the trick to resolve this
> problem on Mac. This is a really annoying problem.
>
> --
> Thanks & Regards,
> Anil Gupta
--
Harsh J
t quite sure about
> the details
>
>
> On Thu, May 23, 2013 at 4:37 PM, Harsh J wrote:
>
>> Your problem seems to surround available memory and over-subscription. If
>> you're using a 0.20.x or 1.x version of Apache Hadoop, you probably want to
>> use the
his limit for every job.
>
> Are there plans to fix this??
>
> --
>
--
Harsh J
but I get permission issues
>
>
>
> --
>
--
Harsh J
ted to connect to the namenode. The full pathname of the
> file must be specified. If the value is empty, no hosts are
> excluded.
>
>
>
> <property>
>   <name>dfs.webhdfs.enabled</name>
>   <value>false</value>
>   <description>Enable or disable webhdfs. Defaults to false</description>
> </property>
>
> <property>
>   <name>dfs.support.append</name>
>   <value>true</value>
>   <description>Enable o
d to avoid Access Exceptions
>
> How do I do this, and if I can only access certain directories, how do I do
> that?
>
> Also are there some directories my code MUST be able to access outside
> those for my user only?
>
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
--
Harsh J
hwiley.com
> music.keithwiley.com
>
> "I used to be with it, but then they changed what it was. Now, what I'm with
> isn't it, and what's it seems weird and scary to me."
>-- Abe (Grandpa) Simpson
>
>
--
Harsh J
From your email header:
List-Unsubscribe: <mailto:common-user-unsubscr...@hadoop.apache.org>
On Wed, Mar 13, 2013 at 10:42 AM, Alex Luya wrote:
> can't find a way to unsubscribe from this list.
--
Harsh J
egards
> Abhishek
>
>
> On Feb 22, 2013, at 1:03 AM, Harsh J wrote:
>
>> HDFS does not have such a client-side feature, but your applications
>> can use Apache Zookeeper to coordinate and implement this on their own
>> - it can be used to achieve distributed l
very easy.
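A minimal sketch of that idea, assuming the Apache Curator recipes library is on the classpath (any ZooKeeper lock recipe works just as well); the connect string and lock path below are placeholders:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class HdfsFileLockSketch {
  public static void main(String[] args) throws Exception {
    CuratorFramework zk = CuratorFrameworkFactory.newClient(
        "zkhost:2181", new ExponentialBackoffRetry(1000, 3));
    zk.start();
    InterProcessMutex lock = new InterProcessMutex(zk, "/locks/myfile");
    lock.acquire();              // only one client holds this lock at a time
    try {
      // read or write the HDFS file here
    } finally {
      lock.release();
      zk.close();
    }
  }
}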
On Fri, Feb 22, 2013 at 5:17 AM, abhishek wrote:
>
>> Hello,
>
>> How can I impose read lock, for a file in HDFS
>>
>> So that only one user (or) one application , can access file in hdfs at any
>> point of time.
>>
>> Regards
>> Abhi
>
> --
>
>
>
--
Harsh J
ory structure as output?
>
> Thanks,
> Max Lebedev
>
>
>
> --
> View this message in context:
> http://hadoop.6.n7.nabble.com/Running-hadoop-on-directory-structure-tp67904.html
> Sent from the common-user mailing list archive at Nabble.com.
--
Harsh J
o quickly.
>
> The full source code is attached as there's nothing sensitive in it.
> Coding wouldn't be my strong point so apologies in advance if it looks a
> mess.
>
> Thanks
>
>
> On Sat, Feb 9, 2013 at 6:09 PM, Harsh J wrote:
>>
>> Whatever "
e data into the mapper and processing it
> public static class MapClass extends Mapper VectorWritable> {
> public void map (LongWritable key, Text value,Context context)
> throws IOException, InterruptedException {
>
> Would anyone have any clues as to what would be wrong with the arguments
> being passed to the Mapper?
>
> Any help would be appreciated,
>
> Thanks.
--
Harsh J
Thanks
>
>
>
>
> --
> View this message in context:
> http://hadoop-common.472056.n3.nabble.com/no-jobtracker-to-stop-no-namenode-to-stop-tp34874p4006830.html
> Sent from the Users mailing list archive at Nabble.com.
>
--
Harsh J
> branch-0.21 (also in trunk), say HADOOP-4012 and MAPREDUCE-830, but not
> integrated/migrated into branch-1, so I guess we don't support concatenated
> bzip2 in branch-1, correct? If so, is there any special reason? Many thanks!
>
> --
> Best Regards,
> Li Yu
--
Harsh J
enables
> running multiple
> tasks in a single JVM. It does not say anything about setup() or cleanup().
>
> Even if I set "mapred.job.reuse.jvm.num.tasks" to -1, do setup() and
> cleanup() get called every single time a task is launched?
>
> Best,
> Ed
--
Harsh J
> filesystem, just for the hell of it - for fast unit tests, that simulated
> lookups and stuff.
>
> So - if the interface is abstract and decoupled enough from any real world
> filesystem, I think this could definitely work.
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
--
Harsh J
possible for reducers to start (not just copying, but actually)
> "reducing" before all mappers are done, speculatively?
>
> In particular I'm asking this because I'm curious about the internals of how
> the shuffle and sort might (or might not :)) be able to support this.
--
Harsh J
way, the
> hardlinks should still point to them. The reading job reads from the
> hardlinks and cleans them up when done. If the hardlinks are placed in a
> directory with the reading job-id then garbage collection should be
> possible for crashed jobs if normal cleanup fails.
>
> Gr
d jar file.
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
u are better off asking the
u...@hive.apache.org lists than the Hadoop user lists here.
--
Harsh J
borate ?
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
really need a lot of message passing, then you might be
>> > better off using an inherently more integrated tool like GridGain... which
>> > allows for sophisticated message passing between asynchronously running
>> > processes, i.e.
>> >
>> http://gridgaintech.wordpress.com/2011/01/26/distributed-actors-in-gridgain/
>> .
>> >
>> >
>> > It seems like there might not be a reliable way to implement a
>> > sophisticated message passing architecture in hadoop, because the system
>> is
>> > inherently so dynamic, and is built for rapid streaming reads/writes,
>> which
>> > would be stifled by significant communication overhead.
>>
>
>
>
> --
> Bertrand Dechoux
--
Harsh J
hat i agree with largely) against this approach.
>>>
>>> in general, my question is this: has anyone tried to implement an
>>> algorithm using mapreduce where mappers required cross-communications?
>>> how did you solve this limitation of mapreduce?
>>>
>>> thanks,
>>>
>>> jane.
>>>
--
Harsh J
--
Harsh J
Do we
> decompress first , and then deserialize? Or do them both at the same time
> ? Thanks!
>
> PS I've added an issue to github here
> https://github.com/matteobertozzi/Hadoop/issues/5, for a python
> SequenceFile reader. If I get some helpful hints on this thread maybe I
> can directly implement an example on matteobertozzi's python hadoop trunk.
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
> > > > I am going to process video analytics using hadoop
>> > > > > I am very interested in the CPU+GPU architecture, especially using
>> > CUDA
>> > > (
>> > > > > http://www.nvidia.com/object/cuda_home_new.html) and JCUDA (
>> > > > > http://jcuda.org/)
>> > > > > Does using HADOOP and CPU+GPU architecture bring significant
>> > > performance
>> > > > > improvement, and has someone succeeded in implementing it in
>> production
>> > > > > quality?
>> > > > >
>> > > > > I didn't find any projects / examples using such technology.
>> > > > > If someone could give me a link to best practices and example using
>> > > > > CUDA/JCUDA + hadoop that would be great.
>> > > > > Thanks in advance
>> > > > > Oleg.
>> > > > >
>> > > >
>> > >
>> >
>>
--
Harsh J
someone wait after leaving the safemode?
> Why is it recommended not to set it to 0 instead of 3 (30 seconds)?
>
> Regards
>
> Bertrand
--
Harsh J
t if the path (arg[3]) is on local file system like
> /tmpFolder/myfile, then the above code reports the file as not existing
> even though the file is definitely there. What am I doing wrong?
--
Harsh J
String.valueOf(submitTime) ,
> jobConfPath}
> );
>
> }catch(IOException e){
> LOG.error("Failed creating job history log file, disabling
> history", e);
> *disableHistory = true; *
> }
> }
>
>
> Thanks,
--
Harsh J
:
> Thanks for your immediate reply. Are there some other ways that do not
> interrupt the running of the TaskTracker?
>
> On Mon, Sep 10, 2012 at 11:25 AM, Harsh J wrote:
>
>> You could just restart the specific tasktracker that has filled this
>> directory; there
t's about 32 G and I must
> restart the cluster to clean it.
>
> Thanks & Best Regards
>
> hong
--
Harsh J
re the newline and I would like to get the full records as part of
> 'tail' (not truncated versions).
>
> Thanks,
> -Sukhendu
--
Harsh J
take in effect? Is there anything else
> I need to do to make the change to be applied?
You will need to restart the JobTracker JVM for the new heap limit to
get used. You can run "hadoop-daemon.sh stop jobtracker" followed by
"hadoop-daemon.sh start jobtracker" to restart just the JobTracker
daemon (run the command on the JT node).
--
Harsh J
>>Thanks in advance
>>
>>
>>
>>--
>>View this message in context:
>>http://lucene.472066.n3.nabble.com/doubt-about-reduce-tasks-and-block-writes-tp4003185.html
>>Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>>
>>
>>
--
Harsh J
ifferent directory structure.. Of course, I'm
> probably either not following the M/R Paradigm - or just doing it wrong.
>
> The FilealreadyExistsException was applicable to my "/foo/bar" directory
> which had very little to do with my "genuine" output.
>
>
>
ccurrs when a
>> duplicate directory is discovered by an OutputFormat,
>> Is there a hadoop property that is accessible by the client to disable
>> this behavior?
>>
>> IE, disable.file.already.exists.behaviour=true
>>
>> Thank You
>> Daniel G. Hoffman
>>
>
>
>
> --
> Bertrand Dechoux
--
Harsh J
Abhishek,
Moving this to the u...@sqoop.apache.org lists.
On Mon, Aug 20, 2012 at 9:03 PM, sudeep tokala wrote:
> hi all,
>
> what are all security concerns with sqoop.
>
> Regards
> Abhishek
--
Harsh J
HDFS and implement further .
>
>
>
> Kind Regard
> Sujit Dhamale
--
Harsh J
No, you will need to restart the TaskTracker to have it in effect.
On Sat, Aug 18, 2012 at 8:46 PM, Jay Vyas wrote:
> hmmm I wonder if there is a way to push conf/*xml parameters out to all
> the slaves, maybe at runtime ?
>
> On Sat, Aug 18, 2012 at 4:06 PM, Harsh J wro
d "
>
> I've confirmed in my job that the counter parameter is correct, when the
> job starts... However... somehow the "120 limit exceeded" exception is
> still thrown.
>
> This is in elastic map reduce, hadoop .20.205
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
me
> , step by step?
> Thank you very much.
> --
> Hoàng Minh Hương
> GOYOH VIETNAM 44-Trần Cung-Hà Nội-Viet NAM
> Tel: 0915318789
--
Harsh J
f simply storing Json
> in hdfs for processing. I see there is Avro that does a similar thing but
> most likely stores it in a more optimized format. I wanted to get users'
> opinion on which one is better.
--
Harsh J
, abhiTowson cal
wrote:
> hi all,
>
> can log data be converted into Avro when data is sent from source to sink?
>
> Regards
> Abhishek
--
Harsh J
nside a mapreduce job. Is it okay to do this? Or does a class's
> ability to acquire local resources change in the mapper/reducer JVMs?
I believe this should work.
--
Harsh J
8101040_0003_m_21_0/taskjvm.sh
> /mnt/DP_disk4/raymond/hdfs/mapred/ttprivate/taskTracker/raymond/jobcache/job_201208101040_0003/attempt_201208101040_0003_m_21_0/taskjvm.sh
>
> So, Is there anything I am still missing?
>
>
> Best Regards,
> Raymond Liu
>
--
Harsh J
Yes, singular JVM (The test JVM itself) and the latter approach (no
TT/JT daemons).
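A minimal sketch of that setup, assuming the Hadoop 1.x property names (they differ on newer releases); everything runs inside the test JVM via LocalJobRunner, with no daemons:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LocalModeTestSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "local");  // use LocalJobRunner, no JT/TT
    conf.set("fs.default.name", "file:///");  // local filesystem, no HDFS
    Job job = new Job(conf, "local-test");    // the job runs in this JVM
    // set mapper/reducer/input/output here, then job.waitForCompletion(true);
  }
}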
On Wed, Aug 8, 2012 at 4:50 AM, Mohit Anchlia wrote:
> On Tue, Aug 7, 2012 at 2:08 PM, Harsh J wrote:
>
>> It used the local mode of operation:
>> org.apache.hadoop.mapred.LocalJobRunner
>
org.apache.hadoop.mapred.JobClient [main]: Combine output
> records=0
> INFO org.apache.hadoop.mapred.JobClient [main]: Reduce output records=1
> INFO org.apache.hadoop.mapred.JobClient [main]: Map output records=2
> INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [main]: Inside
> reduce
> INFO com.i.cg.services.dp.analytics.hadoop.mapred.GeoLookup [main]: Outsid
> e reduce
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.547 sec
> Results :
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
--
Harsh J
If you instantiate the JobConf with your existing conf object, then
you needn't have that fear.
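A minimal sketch of that point ("some.custom.key" is a hypothetical property, purely for illustration): constructing the JobConf from the existing Configuration carries over whatever was already set on it.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

public class JobConfFromConf {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("some.custom.key", "value");   // hypothetical setting made via setConf(conf)
    JobConf jobConf = new JobConf(conf);    // copies everything from conf
    System.out.println(jobConf.get("some.custom.key"));  // prints "value"
  }
}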
On Wed, Aug 8, 2012 at 1:40 AM, Mohit Anchlia wrote:
> On Tue, Aug 7, 2012 at 12:50 PM, Harsh J wrote:
>
>> What is GeoLookupConfigRunner and how do you utilize the setConf(conf)
>
e(output, true);
>
> log.info("Here");
> GeoLookupConfigRunner configRunner = new GeoLookupConfigRunner();
> configRunner.setConf(conf);
> int exitCode = configRunner.run(new String[]{input.toString(),
> output.toString()});
> Assert.assertEquals(exitCode, 0);
> }
--
Harsh J
did
>
> Text zip = new Text();
> zip.set("9099");
> collector.write(zip,value);
> zip.set("9099");
> collector.write(zip,value1);
>
> Should I expect to receive both values in reducer or just one?
--
Harsh J
Welcome! We look forward to learning from you too! :)
On Fri, Aug 3, 2012 at 10:58 PM, Harit Himanshu <
harit.subscripti...@gmail.com> wrote:
> first message - I have just joined this group looking forward to learning from
> everyone
>
--
Harsh J
ty.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> I asked this question on
fornia
>> Software Engineer, Infrastructure | Loki Studios
>> fb.me/marco.gallotta (http://fb.me/marco.gallotta) | twitter.com/marcog
>> (http://twitter.com/marcog)
>> ma...@gallotta.co.za (mailto:ma...@gallotta.co.za) | +1 (650) 417-3313
>>
>> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
--
Harsh J
ies :
>
> <property>
>   <name>fs.default.name</name>
>   <value>s3://BUCKET</value>
> </property>
>
> <property>
>   <name>fs.s3.awsAccessKeyId</name>
>   <value>ID</value>
> </property>
>
> <property>
>   <name>fs.s3.awsSecretAccessKey</name>
>   <value>SECRET</value>
> </property>
>
> hdfs-site.xml is empty!
>
> Namenode log says it's trying to connect to local HDFS, not S3.
> Am I missing anything?
>
> Regards,
> Alok
--
Harsh J
th
> have no access to counters.
>
> Is there really no way to increment counters inside of a RecordReader or
> InputFormat in the mapreduce api?
--
Harsh J
rom source I no longer see the issue.
>
>
> On Jul 20, 2012, at 8:48 PM, Harsh J wrote:
>
>> Prashant,
>>
>> Can you add in some context on how these files were written, etc.?
>> Perhaps open a JIRA with a sample file and test-case to reproduce
>> this? O
Physical memory (bytes)
> snapshot=0
> 12/07/29 13:36:02 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
> 12/07/29 13:36:02 INFO mapred.JobClient: Total committed heap
> usage (bytes)=124715008
> 12/07/29 13:36:02 INFO mapred.JobClient:
> org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
>
> Regards
> Abhishek
--
Harsh J
> Abhishek
>
> Sent from my iPhone
>
> On Jul 28, 2012, at 12:57 AM, Harsh J wrote:
>
>> Hi Abhishek,
>>
>> Easy on the caps mate. Can you pastebin.com-paste your NM logs and RM logs?
>>
>> On Sat, Jul 28, 2012 at 8:45 AM, abhiTowson cal
>> wrote:
>
nning
>
> Resource manager is also running
>
> WHEN I CHECK LOG FILES,IT SAYS CONNECTION REFUSED ERROR
>
> Regards
> abhishek
--
Harsh J
t approach for
> my cluster environment?
> Also, on a side note, shouldn't the NodeManager throw an error on this kind
> of memory problem? Should i file a JIRA for this? It just sat quietly over
> there.
>
> Thanks a lot,
> Anil Gupta
>
> On Fri, Jul 27, 2012 at 3
ase let me
> know.
>
> -Sean
--
Harsh J
uce.jobhistory.done-dir
> /disk/mapred/jobhistory/done
>
>
>
> yarn.web-proxy.address
> ihub-an-l1:
>
>
> yarn.app.mapreduce.am.staging-dir
> /user
>
>
> *
> Amount of physical memory, in MB, that can be allocated
>
>
>> > If it is one index file per reducer, can we rely on HDFS append to change
>> > the index write behavior and build one index file from all the
>> > reducers by basically making all the parallel reducers append to
>> > one index file? Data files do not matter.
>> >
>>
>
>
>
> --
> Bertrand Dechoux
--
Harsh J
nding sequence files has been ongoing
though: https://issues.apache.org/jira/browse/HADOOP-7139. Maybe you
can take a look and help enable MapFiles do the same somehow?
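For what the API from that JIRA looks like, a hedged sketch (assuming a release that already carries HADOOP-7139; the path and key/value classes below are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class AppendSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(new Path("/data/part-00000.seq")),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(Text.class),
        SequenceFile.Writer.appendIfExists(true));  // reopen and append if present
    writer.append(new Text("k"), new Text("v"));
    writer.close();
  }
}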
--
Harsh J
.class
> 12/07/27 09:38:27 WARN conf.Configuration: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
> 12/07/27 09:38:27 INFO mapred.ResourceMgrDelegate: Submitted application
> application_1343365114818_0002 to ResourceManager at ihub-an-l1/
> 172.31.192.151:8040
> 12/07/27 09:38:27 INFO mapreduce.Job: The url to track the job:
> http://ihub-an-l1:/proxy/application_1343365114818_0002/
> 12/07/27 09:38:27 INFO mapreduce.Job: Running job: job_1343365114818_0002
>
> No Map-Reduce tasks are started by the cluster. I don't see any errors
> anywhere in the application. Please help me in resolving this problem.
>
> Thanks,
> Anil Gupta
--
Harsh J
using multiple threads in a map task or
> reduce task?
> Is it a good way to use multithreading in a map task?
> --
> View this message in context:
> http://old.nabble.com/Hadoop-Multithread-MapReduce-tp34213534p34213534.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
--
Harsh J
07-20 00:12:34,271 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(DN01:50010,
> storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:290)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:334)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:398)
> at
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:577)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:494)
> at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
>
--
Harsh J
05)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
> Thanks,
> Prashant
--
Harsh J
d the hadoop main job can naturally come to a close.
>
> However, when I run "hadoop job kill-attempt / fail-attempt ", the
> jobtracker seems to simply relaunch
> the same tasks with new ids.
>
> How can I tell the jobtracker to give up on redelegating?
--
Harsh J
2012-07-20 00:12:34,271 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(DN01:50010,
> storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
>
> On 16/07/2012 22:49, Mike S wrote:
>
>> Strictly from speed and performance perspective, is Avro as fast as
>> protocol buffer?
>>
>
--
Harsh J
ant sudo just to interact with our hadoop cluster. I
> suppose I need to read up on user authentication and authorization in
> hadoop before doing something like that.
>
> Thanks
>
> -Original Message-
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Wednesday,
suppose this would do the trick but I was hoping
> we could just issue hadoop fs commands against our cluster directly from a
> remote client yet override the username that's being sent to the cluster.
>
> Thanks
>
> On Jul 18, 2012, at 11:54 AM, Harsh J wrote:
>
> >
--
Harsh J
ock when the same table is simultaneously
> being hit with a read query and a write query
--
Harsh J
e Task Trackers not seeing
> JobTracker.
> Checking the JobTracker web interface, it always states there is only 1
> node available.
>
> I've checked the 5 troubleshooting steps provided but it all looks to be ok
> in my environment.
>
> Would anyone have any ideas of what could be causing this?
> Any help would be appreciated.
>
> Cheers,
> Ronan
--
Harsh J
everything with
> SequenceFiles and almost forgot :)
>
> my text output actually has tabs in it... So, I'm not sure what the default
> separator is, and whether or not there is a smart way to find the value.
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
11, 2012 at 10:12 PM, Harsh J wrote:
>> Are you sure you've raised the limits for your user, and have
>> re-logged in to the machine?
>>
>> Logged in as the user you run eclipse as, what do you get as the
>> output if you run "ulimit -n"?
>>
>>
to be modified?
Yes, configure fs.checkpoint.dir to SSD/dfs/namesecondary, for the SNN
to use that. Use the hdfs-site.xml.
After configuring these, you may ignore hadoop.tmp.dir, as it
shouldn't be used for anything else.
--
Harsh J
SUCCESS [0.974s]
> [INFO] Apache Hadoop Common ...... FAILURE [1.548s]
> [INFO] Apache Hadoop Common Project .. SKIPPED
> Thanks,
> Su
--
Harsh J
Er, sorry I meant mapred.map.tasks = 1
On Thu, Jul 12, 2012 at 10:44 AM, Harsh J wrote:
> Try passing mapred.map.tasks = 0 or set a higher min-split size?
>
> On Thu, Jul 12, 2012 at 10:36 AM, Yang wrote:
>> Thanks Harsh
>>
>> I see
>>
>> then there s
>
> A = LOAD 'myinput.txt' ;
>
> supposedly it should generate at most 1 mapper.
>
> but in reality , it seems that pig generated 3 mappers, and basically fed
> empty input to 2 of the mappers
>
>
> Thanks
> Yang
>
> On Wed, Jul 11, 2012 at 10:
java io.
>
> I think the main reason is that I am using a MultipleTextOutputFormat
> and my reducer could create many output files based on my multi-
> output logic. Is there a way to make Hadoop not open so many
> files at once? If not, can I control when the reducer closes a file?
--
Harsh J
d correctly because I'm trying to debug the
> mappers through eclipse,
> but if more than 1 mapper process is fired, they all try to connect to the
> same debugger port, and the end result is that nobody is able to
> hook to the debugger.
>
>
> Thanks
> Yang
--
Harsh J
job-acls.xml
>
> so 3 attempts were indeed fired ??
>
> I have to get this controlled correctly because I'm trying to debug the
> mappers through eclipse,
> but if more than 1 mapper process is fired, they all try to connect to the
> same debugger port, and the end result is that nobody is able to
> hook to the debugger.
>
>
> Thanks
> Yang
--
Harsh J
apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
> at org.apache.hadoop.fs.FileSystem.deleteOnExit(FileSystem.java:615)
> at
> org.apache.hadoop.hive.shims.Hadoop20Shims.fileSystemDeleteOnExit(Hadoop20Shims.java:68)
> at
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:451)
> ... 12 more
--
Harsh J
sive to the hadoop community.
>
> --
> Jay Vyas
> MMSB/UCHC
--
Harsh J
option. It's
> taking a very long time to move data into trash. Can you please help me
> stop this deletion process and restart it with skip
> trash??
--
Harsh J
s.
>
> Hadoop 1.0.3
> hive 0.9.0
> flume 1.2.0
> Hbase 0.92.1
> sqoop 1.4.1
>
> my questions are.
>
> 1. Are the above tools compatible with each other at these versions?
>
> 2. Does any tool need a version change?
>
> 3. Please list all the tools with compatible versions.
>
> Please suggest on this?
--
Harsh J
>
>> I have one MBP with 10.7.4 and one laptop with Ubuntu 12.04. Is it possible
>> to set up a hadoop cluster in such a mixed environment?
>>
>> Best Regards,
>>
>> --
>> Welcome to my ET Blog http://www.jdxyw.com
>>
--
Harsh J
ing
>>> to make the split size a multiple of 180 but was wondering if
>>> there is anything else that I can do? Please note that my files are
>>> not sequence files, just a custom binary format.
>>>
>>
>> --
>> Kai Voigt
>> k...@123.org
>>
>>
>>
>>
--
Harsh J
client decompressing the file for me?
--
Harsh J