Re: rack awareness unexpected behaviour
I've checked it out and it does work like that. The problem is that if the two racks do not have the same capacity, one will have its disk space filled up much faster than the other (which is what I'm seeing). If one rack (rack A) has 2 servers with 8 cores and 4 reduce slots each, and the other rack (rack B) has 2 servers with 16 cores and 8 reduce slots each, rack A gets filled up faster because rack B is writing more (it has more reduce slots). Could a solution be to modify the bash script used to decide which rack a block replica is written to? It would use probability and give rack B double the chance of receiving the write. -- View this message in context: http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4093270.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
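For illustration, the weighted-probability scheme being proposed could be sketched like this. This is a hypothetical toy in Python, not HDFS's actual placement code; the weights (reduce-slot counts per rack) and the `pick_rack` helper are assumptions for the example:

```python
import random

def pick_rack(rack_slots, rng=random.random):
    """Weighted random choice: a rack with twice the reduce slots gets
    twice the probability of receiving the replica."""
    total = sum(rack_slots.values())
    r = rng() * total
    acc = 0.0
    for rack, slots in rack_slots.items():
        acc += slots
        if r < acc:
            return rack
    return rack  # fallback for floating-point edge cases

# Rack B has double the reduce slots of rack A, so double the chance.
racks = {"rackA": 8, "rackB": 16}
```

Over many draws, rack B would receive roughly two thirds of the writes, matching its share of the reduce slots.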
Re: rack awareness unexpected behaviour
Doing that will balance the block writing, but I think you then lose the concept of physical rack awareness. Let's say you have 2 physical racks, one with 2 servers and one with 4. If you artificially tell Hadoop that each rack has 3 servers, you are losing the concept of rack awareness: you're no longer guaranteeing that each physical rack contains at least one replica of each block. So if you have 2 racks with a different number of servers, it isn't possible to do proper rack awareness without first filling the disks of the rack with fewer servers. Am I right, or am I missing something? -- View this message in context: http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4093337.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
rack awareness unexpected behaviour
Hey there, I've set up rack awareness on my Hadoop cluster with replication 3. I have 2 racks and each contains 50% of the nodes. I can see that the blocks are spread across the 2 racks; the problem is that all nodes in one rack store 2 replicas while the nodes of the other rack store just one. If I launch the Hadoop balancer script, it properly spreads the replicas across the 2 racks, leaving all nodes with exactly the same available disk space, but after jobs have been running for hours the data becomes unbalanced again (every node in rack1 has less free disk space than every node in rack2). Any clue what's going on? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/rack-awarness-unexpected-behaviour-tp4086029.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: rack awareness unexpected behaviour
Jobs run on the whole cluster. After rebalancing, everything is properly allocated. Then I start running jobs using all the slots of the 2 racks and the problem starts to happen. Maybe I'm missing something: when using rack awareness, do you have to tell the jobs to run in slots from both racks and not just one? (I guess not) -- View this message in context: http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4086038.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: rack awareness unexpected behaviour
I'm on cdh3u4 (0.20.2); I'm going to read a bit about this bug -- View this message in context: http://lucene.472066.n3.nabble.com/rack-awareness-unexpected-behaviour-tp4086029p4086049.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: doubt about reduce tasks and block writes
Thanks Raj, you got exactly my point. I wanted to confirm this assumption, as I was wondering whether a shared HDFS cluster with MR and HBase like this would make sense: http://old.nabble.com/HBase-User-f34655.html -- View this message in context: http://lucene.472066.n3.nabble.com/doubt-about-reduce-tasks-and-block-writes-tp4003185p4003211.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
doubt about reduce tasks and block writes
Hey there, I have a doubt about reduce tasks and block writes. Does a reduce task always write its first replica to HDFS on the node where it is running (with the blocks then replicated to other nodes)? If yes: say I have a cluster of 5 nodes, 4 of them running a DN and a TT and one (node A) running just a DN. When running MR jobs, would map tasks never read from node A? This would be because maps have data locality, and if reduce tasks write first to the node where they live, one replica of each block would always be on a node that has a TT. Node A would only contain blocks created by the framework's replication, as no reduce task would write there directly. Is this correct? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/doubt-about-reduce-tasks-and-block-writes-tp4003185.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: LZO exception decompressing (returned -8)
I tried, but I'm still getting the error with 0.4.15. I'm really lost with this. My Hadoop release is 0.20.2, from more than a year ago. Could this be related to the problem? -- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3792484.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: LZO exception decompressing (returned -8)
Yes. The steps I followed were:

1. Install lzo 2.06 on a machine with the same kernel as my nodes.
2. Compile hadoop-lzo 0.4.15 there (in /lib I replaced the cdh3u3 jar with the one for my Hadoop 0.20.2 release).
3. Replace hadoop-lzo-0.4.9.jar with the newly compiled hadoop-lzo-0.4.15.jar in the Hadoop lib directory of all my nodes and the master.
4. Put the generated native files in the native lib directory of all the nodes and the master.
5. In my job jar, replace the library hadoop-lzo-0.4.9.jar with hadoop-lzo-0.4.15.jar.

And sometimes when a job is running I get (4 times, so the job gets killed):

...org.apache.hadoop.mapred.ReduceTask: Shuffling 3188320 bytes (1025174 raw bytes) into RAM from attempt_201202291221_1501_m_000480_0
2012-03-02 02:32:55,496 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201202291221_1501_r_000105_0: Failed fetch #1 from attempt_201202291221_1501_m_46_0
2012-03-02 02:32:55,496 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201202291221_1501_r_000105_0 adding host hadoop-01.backend to penalty box, next contact in 4 seconds
2012-03-02 02:32:55,496 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201202291221_1501_r_000105_0: Got 1 map-outputs from previous failures
2012-03-02 02:32:55,497 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201202291221_1501_r_000105_0 : Map output copy failure : java.lang.InternalError: lzo1x_decompress returned: -8
    at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
    at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:305)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:76)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1553)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)

-- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3792505.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: LZO exception decompressing (returned -8)
I used to have 2.05, but as I said, I have now installed 2.06 -- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3792511.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: LZO exception decompressing (returned -8)
Absolutely. If I don't find the root of the problem soon, I'll definitely try it. -- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3792531.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
LZO exception decompressing (returned -8)
Hey there, I've been running a cluster for over a year and was getting an LZO decompression exception less than once a month. Suddenly it happens almost once per day. Any idea what could be causing it? I'm on Hadoop 0.20.2. I've thought about moving to Snappy, but I'd like to know why this happens more often now.

The exception always happens when the reducer gets data from the map, and looks like:

Error: java.lang.InternalError: lzo1x_decompress returned: -8
    at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
    at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:305)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:76)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1553)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)

Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3783652.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: LZO exception decompressing (returned -8)
I'm on 0.4.9 (I think it's the latest) -- View this message in context: http://lucene.472066.n3.nabble.com/LZO-exception-decompressing-returned-8-tp3783652p3783927.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
multioutput dfs.datanode.max.xcievers and too many open files
Hey there, I've been running a cluster for about a year (about 20 machines). I've run many concurrent jobs there, some of them with MultipleOutputs, and never had any problem (those MultipleOutputs were creating just 3 or 4 different outputs). Now I have a job with MultipleOutputs that creates 100 different outputs, and it always ends up with errors. Tasks start throwing these errors:

java.io.IOException: Bad connect ack with firstBadLink 10.2.0.154:50010
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2963)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)

or:

java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:250)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
    at org.apache.hadoop.io.Text.readString(Text.java:400)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2961)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)

Checking the datanode log, I see this error hundreds of times:

2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_336844604470452_29464903
2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_336844604470452_29464903 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
    at sun.nio.ch.Net.socket0(Native Method)
    at sun.nio.ch.Net.socket(Net.java:97)
    at sun.nio.ch.SocketChannelImpl.init(SocketChannelImpl.java:84)
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
    at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
2012-02-23 14:22:56,034 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-2698946892792040969_29464904 src: /10.2.0.156:40969 dest: /10.2.0.156:50010
2012-02-23 14:22:56,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-2698946892792040969_29464904 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,035 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
    at sun.nio.ch.Net.socket0(Native Method)
    at sun.nio.ch.Net.socket(Net.java:97)
    at sun.nio.ch.SocketChannelImpl.init(SocketChannelImpl.java:84)
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
    at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)

I've always had this configured in hdfs-site.xml:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

But I think that's no longer enough to handle that many MultipleOutputs. If I increase max.xcievers even more, what are the side effects? Which value should be considered the maximum? (I suppose it depends on the CPU and RAM, but approximately.) Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/multioutput-dfs-datanode-max-xcievers-and-too-many-open-files-tp3770024p3770024.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
cross product of 2 data sets
Hey there, I would like to do the cross product of two data sets, neither of which fits in memory. I've seen that Pig has the CROSS operation. Can someone please explain to me how it is implemented? -- View this message in context: http://lucene.472066.n3.nabble.com/cross-product-of-2-data-sets-tp3302160p3302160.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
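The general partitioning trick used for a MapReduce cross product can be simulated in a few lines. This is a hedged toy in Python (the function names and the in-process "reducers" are illustrative, not Pig's actual code): each record of A is replicated to one row of an m-by-n grid of reducers, each record of B to one column, so every (a, b) pair meets in exactly one cell and no reducer ever holds more than |A|/m + |B|/n records:

```python
import random
from collections import defaultdict

def map_record(record, side, m, n):
    # A record of A picks a random row i and is replicated to all n
    # columns; a record of B picks a random column j and is replicated
    # to all m rows. Every (a, b) pair meets in exactly one cell (i, j).
    if side == "A":
        i = random.randrange(m)
        return [((i, j), ("A", record)) for j in range(n)]
    else:
        j = random.randrange(n)
        return [((i, j), ("B", record)) for i in range(m)]

def reduce_cell(values):
    # Each reducer holds only one A-chunk and one B-chunk in memory
    # and emits their local cross product.
    a_side = [r for side, r in values if side == "A"]
    b_side = [r for side, r in values if side == "B"]
    return [(a, b) for a in a_side for b in b_side]

def cross(A, B, m=3, n=2):
    cells = defaultdict(list)
    for rec in A:
        for key, val in map_record(rec, "A", m, n):
            cells[key].append(val)
    for rec in B:
        for key, val in map_record(rec, "B", m, n):
            cells[key].append(val)
    out = []
    for values in cells.values():
        out.extend(reduce_cell(values))
    return out
```

The replication cost is n copies of A plus m copies of B, which is the price paid for never needing either full data set in one place.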
Re: Why use Reverse Timestamp as the Row Key?
This is normally useful for lots of web apps. Sorting in HBase is done at insert time, not when scanning. Using a reversed timestamp ensures the most recent activity of the user is returned first. -- View this message in context: http://lucene.472066.n3.nabble.com/Why-use-Reverse-Timestamp-as-the-Row-Key-tp3190719p3190906.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
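The trick can be shown with plain string sorting, which is how HBase orders row keys. A small illustrative sketch (the key layout and helper name are made up for the example; the `Long.MAX_VALUE - timestamp` subtraction is the commonly used form of the trick):

```python
LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE

def reverse_ts_key(user_id, ts_millis):
    # Zero-pad so lexicographic (byte-wise) order matches numeric order,
    # the way HBase compares row keys.
    return "%s-%019d" % (user_id, LONG_MAX - ts_millis)

events = [1000, 3000, 2000]  # insertion timestamps in arrival order
keys = sorted(reverse_ts_key("user42", t) for t in events)
# A forward scan over the sorted keys yields the newest event first.
```

Because a larger timestamp produces a smaller reversed number, the newest row sorts first and a plain scan from the start of the user's key range returns recent activity without any client-side sorting.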
check namenode, jobtracker, datanodes and tasktracker status
Hey there, I've written some scripts to check DFS disk space, the number of datanodes, the number of tasktrackers, heap in use... I'm on Hadoop 0.20.2, and to do that I use the DFSClient and JobClient APIs. I do things like:

JobClient jc = new JobClient(socketJT, conf);
ClusterStatus clusterStatus = jc.getClusterStatus(true);
clusterStatus.getTaskTrackers();
...
jc.close();

DFSClient client = new DFSClient(socketNN, conf);
DatanodeInfo[] dni = client.datanodeReport(DatanodeReportType.ALL);
...
client.close();

FileSystem fs = FileSystem.get(new URI("hdfs://" + host + "/"), conf);
fs.getStatus().getCapacity();
...
fs.close();

It's working well, but I'm worried that running the script continuously could be harmful to the cluster (resource consuming). Is it all right, for example, to run it every 10 or 15 minutes? If not, what is a good practice for monitoring the cluster? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/check-namenode-jobtracker-datanodes-and-tasktracker-status-tp3152565p3152565.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
So many unexpected Lost task tracker errors making the job to be killed
Hey there, I have a small cluster running 0.20.2. Everything is fine, but once in a while, when a job with a lot of map tasks is running, I start getting the error: Lost task tracker: tracker_cluster1:localhost.localdomain/127.0.0.1:x. Before getting the error, the task attempt has been running for 7h (when it normally takes 46 seconds to complete). Sometimes another task attempt is launched in parallel, takes 50 seconds to complete, and so the first one gets killed (the second one can even be launched on the same tasktracker and work). But in the end I get so many "Lost task tracker" errors that the job gets killed. The job ends up with some of the tasktrackers blacklisted. If I kill the zombie tasks, remove the jobtracker and tasktracker pid files, remove the userlogs and stop/start mapred, everything works fine again, but some days later the error happens again. Any idea why this happens? Could it somehow be related to having too many attempt folders in the userlogs (even though there is space left on the device)? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/So-many-unexpected-Lost-task-tracker-errors-making-the-job-to-be-killed-Options-tp2917961p2917961.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
check if a sequenceFile is corrupted
Is there any way to check whether a SequenceFile is corrupted without iterating over all its keys/values until it crashes? I've seen that I can get an IOException when opening it, or an IOException when reading the Xth key/value (depending on where it was corrupted). Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/check-if-a-sequenceFile-is-corrupted-tp2693230p2693230.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Tasks seem to fail randomly with nonzero status of 1
Hey there, my cluster was working fine, but suddenly lots and lots of tasks started failing like:

java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:472)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:459)

I restarted the whole cluster, but since it happened once, it breaks every time I run a job. Any clue or advice? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Tasks-seem-to-fail-randomly-with-nonzero-status-of-1-tp2612433p2612433.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: Tasks seem to fail randomly with nonzero status of 1
Well, I've been running these jobs for days. It has only been happening since last night, and now the error keeps happening even if I restart. I'm the only one using the cluster -- View this message in context: http://lucene.472066.n3.nabble.com/Tasks-seem-to-fail-randomly-with-nonzero-status-of-1-tp2612433p2612509.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Check lzo is working on intermediate data
Hey there, I am using Hadoop 0.20.2. I've successfully installed LZO compression following these steps: https://github.com/kevinweil/hadoop-lzo I have some MR jobs written with the new API and I want to compress intermediate data. I'm not sure whether my mapred-site.xml should have these properties:

<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

or these:

<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

How can I check that the compression is being applied? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/Check-lzo-is-working-on-intermediate-data-tp2567704p2567704.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
JobConf.setQueueName(xxx) with the new api using hadoop 0.20.2
I'm trying to use the fair scheduler. I have jobs written using the new API and Hadoop 0.20.2. I've seen that to associate a job with a queue you have to call JobConf.setQueueName(), but the Job class of the new API doesn't have this method. How can I do that? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/JobConf-setQueueName-xxx-with-the-new-api-using-hadoop-0-20-2-tp2553042p2553042.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: JobConf.setQueueName(xxx) with the new api using hadoop 0.20.2
Thanks. To be exact, it's mapreduce.job.queuename: http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html -- View this message in context: http://lucene.472066.n3.nabble.com/JobConf-setQueueName-xxx-with-the-new-api-using-hadoop-0-20-2-tp2553042p2553352.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
LocalDirAllocator and getLocalPathForWrite
I have a doubt about how this works. The API documentation says that the class LocalDirAllocator is "an implementation of a round-robin scheme for disk allocation for creating files". I'm wondering: is the disk allocation done in the constructor? Let's say I have a cluster of just 1 node with 4 disks, and inside a reducer I do:

LocalDirAllocator localDirAlloc = new LocalDirAllocator("mapred.local.dir");
Path pathA = localDirAlloc.getLocalPathForWrite("a");
Path pathB = localDirAlloc.getLocalPathForWrite("b");

Will the local paths pathA and pathB be on the same local disk for sure, because the disk was allocated by new LocalDirAllocator("mapred.local.dir")? Or is it getLocalPathForWrite that picks the disk, so the two paths might not be on the same disk (as I have 4 disks)? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/LocalDirAllocator-and-getLocalPathForWrite-tp2199517p2199517.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
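To make the two possible semantics concrete, here is a toy model in Python of the second interpretation, where the candidate directories are fixed at construction but the directory for each file is chosen per write call. This is an assumption-laden illustration of the question being asked, not Hadoop's actual LocalDirAllocator code:

```python
class RoundRobinAllocator:
    """Toy model: the candidate dirs are fixed at construction, but the
    dir for each new file is picked per get_local_path_for_write() call."""
    def __init__(self, local_dirs):
        self.dirs = list(local_dirs)
        self.next_idx = 0

    def get_local_path_for_write(self, filename):
        d = self.dirs[self.next_idx]
        self.next_idx = (self.next_idx + 1) % len(self.dirs)
        return f"{d}/{filename}"

alloc = RoundRobinAllocator(["/disk1", "/disk2", "/disk3", "/disk4"])
path_a = alloc.get_local_path_for_write("a")  # lands on /disk1
path_b = alloc.get_local_path_for_write("b")  # lands on /disk2, a different disk
```

Under these semantics, two consecutive writes from the same allocator can land on different disks, which is exactly the scenario the question is about.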
Re: LocalDirAllocator and getLocalPathForWrite
Hey Todd,

> LocalDirAllocator is an internal-facing API and you shouldn't be using it from user code. If you write into mapred.local.dir like this, you will end up with conflicts between different tasks running from the same node

I know it's a bit odd usage, but the thing is that I need to create files in the local file system, work with them there, and after that upload them to HDFS (I use the OutputCommitter). To avoid the conflicts you mention, I create a folder that looks like mapred.local.dir/taskId/attemptId and work there, and apparently I'm having no problems.

> and there isn't usually a good reason to write to multiple drives from within a task

When I said I had a cluster of one node, it was just to clarify my doubt and explain the example. My cluster is actually bigger than that, and each node has more than one physical disk. Running multiple tasks at the same time is what I do. I would like each task to write to just a single local disk, but I don't know how to do that.

> The working directory of your MR task is already within one of the drives

Is there a way to get a working directory on the local disk from the reducer? Could I do something similar to:

FileSystem fs = FileSystem.get(conf);
LocalFileSystem localFs = fs.getLocal(conf);
Path path = localFs.getWorkingDirectory();

I would appreciate it if you could tell me a bit more about this. I need to deal with these files locally and want them copied to HDFS only when I finish working with them. Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/LocalDirAllocator-and-getLocalPathForWrite-tp2199517p2202221.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: How to create hadoop-0.21.0-core.jar ?
I was able to compile and build the commons .jar, but got errors building the hdfs .jar (and so couldn't build the mapred .jar). Here is the thread; no answers at the moment: http://lucene.472066.n3.nabble.com/Building-hadoop-0-21-0-from-the-source-td1647589.html -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-create-hadoop-0-21-0-core-jar-tp2196434p2202327.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
SequenceFiles and streaming or hdfs thrift api
Hey there, I need to write a file to an HDFS cluster from PHP. I know I can do that with the HDFS Thrift API: http://wiki.apache.org/hadoop/HDFS-APIs The thing is, I want this file to be a SequenceFile, where the key should be a Text and the value a Thrift-serialized object. Is it possible to achieve that? Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/SequenceFiles-and-streaming-or-hdfs-thrift-api-tp2193101p2193101.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
small files and number of mappers
Hey there, I am doing some tests and wondering what the best practices are for dealing with very small files that are continuously being generated (1 MB or even less). I see that if I have hundreds of small files in HDFS, Hadoop will automatically create A LOT of map tasks to consume them, and each map task takes 10 seconds or less... I don't know if it's possible to change the number of map tasks from Java code using the new API (I know it can be done with the old one). I would like to do something like NumMapTasksCalculatedByHadoop * 0.3; this way, fewer map tasks would be instantiated and each would work for more time. I have had a look at Hadoop archives as well, but I don't think they can help me here. Any advice or similar experience? Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/small-files-and-number-of-mappers-tp1989598p1989598.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
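The usual way around the one-mapper-per-small-file behaviour is to pack many small files into each split, which is the idea behind combining input formats. A hedged toy sketch of the packing step in Python (the function name and the greedy strategy are illustrative assumptions, not Hadoop code):

```python
def pack_into_splits(file_sizes, target_split_bytes):
    """Greedy packing: group small files into combined splits so each
    mapper processes roughly target_split_bytes instead of one tiny file."""
    splits, current, current_size = [], [], 0
    for name, size in file_sizes:
        if current and current_size + size > target_split_bytes:
            splits.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        splits.append(current)
    return splits

# 300 files of ~1 MB with a 64 MB target: a handful of mappers instead of 300.
files = [("part-%05d" % i, 1_000_000) for i in range(300)]
splits = pack_into_splits(files, 64_000_000)
```

Each resulting split then feeds a single map task that reads all the files it contains, amortizing the per-task startup cost over much more input.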
Re: How to Transmit and Append Indexes
You could implement some scripts to send the index updates to the slave using rsync, doing something similar to what Solr does: http://wiki.apache.org/solr/CollectionDistribution However, if what you want is a total merge of the indexes, you can do it easily with Lucene: http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexWriter.html#addIndexesNoOptimize(org.apache.lucene.store.Directory...) -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-Transmit-and-Append-Indexes-tp1931444p1931881.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: MultipleInputs and org.apache.hadoop.mapred package in 0.20.2
Thanks for the advice, I will do that. This way I will be able to use MarkableIterator as well. -- View this message in context: http://lucene.472066.n3.nabble.com/MultipleInputs-and-org-apache-hadoop-mapred-package-in-0-20-2-tp1643587p1647012.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
MultipleInputs and org.apache.hadoop.mapred package in 0.20.2
I'm working with Hadoop 0.20.2 using the new API contained in the package org.apache.hadoop.mapreduce. I have noticed that MultipleInputs is under org.apache.hadoop.mapred, and when setting a path it asks for a JobConf:

addInputPath(JobConf conf, Path path, Class<? extends InputFormat> inputFormatClass, Class<? extends Mapper> mapperClass)

but JobConf is deprecated in 0.20.2 (so all my jobs use Job instead of JobConf). I have also noticed that in 0.21.0 there's no such problem, as MultipleInputs exists in org.apache.hadoop.mapreduce and when setting a path it asks for a Job:

addInputPath(Job job, org.apache.hadoop.fs.Path path, Class<? extends InputFormat> inputFormatClass, Class<? extends Mapper> mapperClass)

I was actually using 0.21.0, but had to downgrade to 0.20.2 because 0.21.0 can't report progress in the reduce context (and this feature is a must for me): https://issues.apache.org/jira/browse/MAPREDUCE-1905 So, which would be the best way to use MultipleInputs? Should I change all my code to use the org.apache.hadoop.mapred classes instead of the org.apache.hadoop.mapreduce ones? I am really confused here, thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/MultipleInputs-and-org-apache-hadoop-mapred-package-in-0-20-2-tp1643587p1643587.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: can not report progress from reducer context with hadoop 0.21
Thanks, I was going mad with this. It's working properly with 0.20.2. Once the patch is fully done, I will apply it so I can keep using MarkableIterator, as it simplifies many of my MapReduce jobs. -- View this message in context: http://lucene.472066.n3.nabble.com/can-not-report-progress-from-reducer-context-with-hadoop-0-21-tp1534700p1555486.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
weird exception when running mapreduce jobs with hadoop 0.21.0
Hey there, I have set up a cluster which is supposed to work properly. I can add data files and read them from a Java app, but when I execute a MapReduce job I get this exception:

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.util

Any idea why this is happening? -- View this message in context: http://lucene.472066.n3.nabble.com/weird-exception-when-running-mapreduce-jobs-with-hadoop-0-21-0-tp1488154p1488154.html Sent from the Hadoop lucene-users mailing list archive at Nabble.com.