Re: rack awareness unexpected behaviour

2013-10-03 Thread Marc Sturlese
I've checked it out and it does work like that. The problem is that if the two racks
don't have the same capacity, one will have its disk space filled up much
faster than the other (which is what I'm seeing).
If one rack (rack A) has 2 servers of 8 cores with 4 reduce slots each, and
the other rack (rack B) has 2 servers of 16 cores with 8 reduce slots each,
rack A will fill up faster, as rack B is writing more (because it has more
reduce slots).

Could a solution be to modify the bash script used to decide where a block
replica is written? It would use probability and give rack B double the
chance of receiving the write.
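
A minimal Java sketch of that weighted-choice idea, with hypothetical rack names and weights, just to make the proposal concrete (in HDFS the replica targets are actually chosen by the NameNode's block placement policy; the topology script only maps hosts to racks):

import java.util.Random;

// Hypothetical illustration: pick a rack with probability proportional to its weight
// (rack B weighted 2, rack A weighted 1).
public class WeightedRackPick {
    private static final String[] RACKS   = {"/rackA", "/rackB"}; // hypothetical rack ids
    private static final int[]    WEIGHTS = {1, 2};               // rack B gets double the chance
    private static final Random RND = new Random();

    static String pickRack() {
        int total = 0;
        for (int w : WEIGHTS) total += w;
        int r = RND.nextInt(total);            // uniform in [0, total)
        for (int i = 0; i < RACKS.length; i++) {
            r -= WEIGHTS[i];
            if (r < 0) return RACKS[i];        // falls inside this rack's weight band
        }
        return RACKS[RACKS.length - 1];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) System.out.println(pickRack());
    }
}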






Re: rack awareness unexpected behaviour

2013-10-03 Thread Marc Sturlese
Doing that will balance the block writing, but I think you then lose the
concept of physical rack awareness.
Let's say you have 2 physical racks, one with 2 servers and one with 4. If
you artificially tell hadoop that each rack has 3 servers, you are losing the
concept of rack awareness: you're no longer guaranteeing that each physical
rack contains at least one replica of each block.

So if you have 2 racks with a different number of servers, it's not possible
to do proper rack awareness without first filling up the disks of the rack
with fewer servers. Am I right, or am I missing something?





rack awareness unexpected behaviour

2013-08-22 Thread Marc Sturlese
Hey there,
I've set up rack awareness on my hadoop cluster with replication 3. I have 2
racks and each contains 50% of the nodes.
I can see that the blocks are spread across the 2 racks; the problem is that all
nodes from one rack are storing 2 replicas and the nodes of the other rack
just one. If I launch the hadoop balancer script, it will properly spread
the replicas across the 2 racks, leaving all nodes with exactly the same
available disk space, but after jobs have been running for hours the data is
unbalanced again (rack1 having all its nodes with less free disk space than
all the nodes from rack2).

Any clue what's going on?
Thanks in advance
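
A minimal sketch of one way to inspect where a file's replicas ended up, assuming a hypothetical file path (getFileBlockLocations reports, per block, the datanodes holding its replicas; those hosts can then be matched against the rack mapping to see the skew):

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: list, per block of a file, the datanodes that hold its replicas.
// The file path below is hypothetical.
public class BlockPlacementReport {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(new Path("/user/marc/somefile"));
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (int i = 0; i < blocks.length; i++) {
            // getHosts() returns the datanodes storing this block's replicas
            System.out.println("block " + i + " -> " + Arrays.toString(blocks[i].getHosts()));
        }
        fs.close();
    }
}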





Re: rack awareness unexpected behaviour

2013-08-22 Thread Marc Sturlese
Jobs run on the whole cluster. After rebalancing, everything is properly
allocated. Then I start running jobs using all the slots of the 2 racks and
the problem starts happening.
Maybe I'm missing something. When using rack awareness, do you have to tell
the jobs to run in slots from both racks and not just one? (I guess not.)





Re: rack awareness unexpected behaviour

2013-08-22 Thread Marc Sturlese
I'm on cdh3u4 (0.20.2); I'm going to read up a bit on this bug.





Re: doubt about reduce tasks and block writes

2012-08-25 Thread Marc Sturlese
Thanks, Raj, you got exactly my point. I wanted to confirm this assumption, as
I was wondering whether a shared HDFS cluster with MR and HBase like this
would make sense:
http://old.nabble.com/HBase-User-f34655.html





doubt about reduce tasks and block writes

2012-08-24 Thread Marc Sturlese
Hey there,
I have a doubt about reduce tasks and block writes. Does a reduce task always
write its first replica to HDFS on the node where it is running (with those
blocks then being replicated to other nodes)?
If so, say I have a cluster of 5 nodes, 4 of them running a DN and a TT and one
(node A) running just a DN: when running MR jobs, map tasks would never read
from node A? This would be because maps have data locality, and if the reduce
tasks write first to the node where they live, one replica of each block
would always be on a node that has a TT. Node A would only contain blocks
created by the framework's replication, as no reduce task would write
there directly. Is this correct?
Thanks in advance





Re: LZO exception decompressing (returned -8)

2012-03-01 Thread Marc Sturlese
Tried 0.4.15, but I'm still getting the error. Really lost with this.
My hadoop release is 0.20.2, from more than a year ago. Could this be related
to the problem?



Re: LZO exception decompressing (returned -8)

2012-03-01 Thread Marc Sturlese
Yes. The steps I followed were:
1. Install lzo 2.06 on a machine with the same kernel as my nodes.
2. Compile hadoop-lzo 0.4.15 there (in /lib, replaced the cdh3u3 jar with the one
from my hadoop 0.20.2 release).
3. Replace hadoop-lzo-0.4.9.jar with the newly compiled hadoop-lzo-0.4.15.jar in
the hadoop lib directory of all my nodes and the master.
4. Put the generated native files in the native lib directory of all the nodes
and the master.
5. In my job jar, replace the hadoop-lzo-0.4.9.jar library with
hadoop-lzo-0.4.15.jar.

And sometimes while a job is running I get this (4 times, so the job gets killed):

...org.apache.hadoop.mapred.ReduceTask: Shuffling 3188320 bytes (1025174 raw bytes) into RAM from attempt_201202291221_1501_m_000480_0
2012-03-02 02:32:55,496 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201202291221_1501_r_000105_0: Failed fetch #1 from attempt_201202291221_1501_m_46_0
2012-03-02 02:32:55,496 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201202291221_1501_r_000105_0 adding host hadoop-01.backend to penalty box, next contact in 4 seconds
2012-03-02 02:32:55,496 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201202291221_1501_r_000105_0: Got 1 map-outputs from previous failures
2012-03-02 02:32:55,497 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201202291221_1501_r_000105_0 : Map output copy failure : java.lang.InternalError: lzo1x_decompress returned: -8
    at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
    at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:305)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:76)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1553)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)



Re: LZO exception decompressing (returned -8)

2012-03-01 Thread Marc Sturlese
I used to have 2.05, but as I said, I've now installed 2.06.



Re: LZO exception decompressing (returned -8)

2012-03-01 Thread Marc Sturlese
Absolutely. If I don't find the root of the problem soon, I'll definitely
try it.






LZO exception decompressing (returned -8)

2012-02-28 Thread Marc Sturlese
Hey there,
I've been running a cluster for over a year and was getting an LZO
decompression exception less than once a month. Suddenly it happens almost
once per day. Any idea what could be causing it? I'm on hadoop 0.20.2.
I've thought about moving to Snappy, but I would like to know why this happens
more often now.

The exception always happens when the reducer fetches data from the map, and
looks like:

Error: java.lang.InternalError: lzo1x_decompress returned: -8
    at com.hadoop.compression.lzo.LzoDecompressor.decompressBytesDirect(Native Method)
    at com.hadoop.compression.lzo.LzoDecompressor.decompress(LzoDecompressor.java:305)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:76)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1553)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1432)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1285)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1216)

Thanks in advance.
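
For what it's worth, a minimal sanity-check sketch for exercising the codec outside MapReduce, assuming the hadoop-lzo jar is on the classpath and the LZO natives are in java.library.path (an in-memory round trip should succeed if the jar and the native library agree with each other):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.util.ReflectionUtils;

import com.hadoop.compression.lzo.LzoCodec;

// Sketch: compress and decompress a buffer in-process to check that hadoop-lzo
// and the native liblzo2 on this node work together.
public class LzoRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        CompressionCodec codec = ReflectionUtils.newInstance(LzoCodec.class, conf);

        byte[] original = new byte[1 << 20];
        new java.util.Random(42).nextBytes(original);

        // compress into memory
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        java.io.OutputStream out = codec.createOutputStream(compressed);
        out.write(original);
        out.close();

        // decompress and compare
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        java.io.InputStream in = codec.createInputStream(new ByteArrayInputStream(compressed.toByteArray()));
        IOUtils.copyBytes(in, restored, conf, true);

        System.out.println("round trip ok: " + Arrays.equals(original, restored.toByteArray()));
    }
}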



Re: LZO exception decompressing (returned -8)

2012-02-28 Thread Marc Sturlese
I'm on 0.4.9 (I think it's the latest).



multioutput dfs.datanode.max.xcievers and too many open files

2012-02-23 Thread Marc Sturlese
Hey there,
I've been running a cluster for about a year (about 20 machines). I've run
many concurrent jobs there, some of them with MultipleOutputs, and never had
any problem (those MultipleOutputs were creating just 3 or 4 different outputs).
Now I have a job whose MultipleOutputs creates 100 different outputs, and it
always ends up with errors.
Tasks start throwing these errors:

java.io.IOException: Bad connect ack with firstBadLink 10.2.0.154:50010
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2963)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)


or:
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:250)
    at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
    at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
    at org.apache.hadoop.io.Text.readString(Text.java:400)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2961)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)


Checking the datanode log, I see this error hundreds of times:
2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_336844604470452_29464903
2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_336844604470452_29464903 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
    at sun.nio.ch.Net.socket0(Native Method)
    at sun.nio.ch.Net.socket(Net.java:97)
    at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
    at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
2012-02-23 14:22:56,034 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-2698946892792040969_29464904 src: /10.2.0.156:40969 dest: /10.2.0.156:50010
2012-02-23 14:22:56,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-2698946892792040969_29464904 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,035 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
    at sun.nio.ch.Net.socket0(Native Method)
    at sun.nio.ch.Net.socket(Net.java:97)
    at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
    at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
    at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)


I've always had this configured in hdfs-site.xml:
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

But I think it's now not enough to handle that many MultipleOutputs. If I
increase max.xcievers even more, what are the side effects? Which value
should be considered the maximum (I suppose it depends on the CPU and RAM,
but approximately)?

Thanks in advance.
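
For context on why the descriptor count grows, a minimal sketch using the old 0.20-API MultipleOutputs, with hypothetical output names and types: each named output a task writes to keeps its own open HDFS writer (and its write pipeline) until close(), so 100 named outputs multiplied by the number of concurrent tasks adds up quickly on the datanodes.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

// Sketch (old 0.20 API, hypothetical names): a reduce task writing to 100 named
// outputs can end up holding up to 100 open DFS writers at once.
public class HundredOutputsReducer extends MapReduceBase
        implements Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs mos;

    // Job setup: register the named outputs (done once, on the client side).
    public static void registerOutputs(JobConf conf) {
        for (int i = 0; i < 100; i++) {
            MultipleOutputs.addNamedOutput(conf, "out" + i, TextOutputFormat.class,
                                           NullWritable.class, Text.class);
        }
    }

    @Override
    public void configure(JobConf conf) {
        mos = new MultipleOutputs(conf);
    }

    @Override
    @SuppressWarnings("unchecked")
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<NullWritable, Text> output, Reporter reporter)
            throws IOException {
        // The first write to a named output lazily opens a writer that stays open
        // until close(); with many outputs the writers accumulate.
        String name = "out" + (Math.abs(key.hashCode()) % 100);
        while (values.hasNext()) {
            mos.getCollector(name, reporter).collect(NullWritable.get(), values.next());
        }
    }

    @Override
    public void close() throws IOException {
        mos.close();  // flushes and closes all named-output writers
    }
}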



cross product of 2 data sets

2011-09-01 Thread Marc Sturlese
Hey there,
I would like to do the cross product of two data sets, neither of which fits
in memory. I've seen Pig has the CROSS operation. Can someone please explain
how it implements it?
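
A minimal sketch of the usual grid/partition trick for a cross product in MapReduce, with hypothetical class names (as far as I know Pig's CROSS is built on a similar synthetic-key idea, but this is only an illustration): each record of A goes to one random grid row and every column, each record of B goes to one random column and every row, so every (a, b) pair meets in exactly one reducer, which then crosses its two subsets locally. The reducer below still buffers its cell's records in memory; a larger GRID makes each cell a smaller fraction of each input.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical sketch of a grid cross product.
public class CrossProductSketch {
    static final int GRID = 4;

    // Mapper for data set A: pick a random row, replicate across all columns.
    public static class MapperA extends Mapper<LongWritable, Text, Text, Text> {
        private final Random rnd = new Random();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            int row = rnd.nextInt(GRID);
            for (int col = 0; col < GRID; col++) {
                ctx.write(new Text(row + "," + col), new Text("A\t" + value));
            }
        }
    }

    // Mapper for data set B: pick a random column, replicate across all rows.
    public static class MapperB extends Mapper<LongWritable, Text, Text, Text> {
        private final Random rnd = new Random();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            int col = rnd.nextInt(GRID);
            for (int row = 0; row < GRID; row++) {
                ctx.write(new Text(row + "," + col), new Text("B\t" + value));
            }
        }
    }

    // Reducer: cross the A-subset with the B-subset that landed in this grid cell.
    public static class CrossReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            List<String> as = new ArrayList<String>();
            List<String> bs = new ArrayList<String>();
            for (Text v : values) {
                String s = v.toString();
                if (s.startsWith("A\t")) as.add(s.substring(2)); else bs.add(s.substring(2));
            }
            for (String a : as) {
                for (String b : bs) {
                    ctx.write(new Text(a), new Text(b));
                }
            }
        }
    }
}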



Re: Why use Reverse Timestamp as the Row Key?

2011-07-22 Thread Marc Sturlese
This is typically useful for lots of web apps. Sorting in HBase is done at
insert time, not when scanning. Using a reversed timestamp ensures the most
recent activity of the user is returned first.
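
A minimal sketch of the pattern with the Java client (hypothetical column family and qualifier): the row key is the user id followed by Long.MAX_VALUE minus the event timestamp, so a scan starting at the user's prefix returns the newest rows first.

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: reverse-timestamp row key, so rows for a user sort newest-first.
public class ReverseTimestampKey {
    public static Put newestFirstPut(String userId, long eventTimeMillis, byte[] value) {
        byte[] rowKey = Bytes.add(Bytes.toBytes(userId),
                                  Bytes.toBytes(Long.MAX_VALUE - eventTimeMillis));
        Put put = new Put(rowKey);
        put.add(Bytes.toBytes("activity"), Bytes.toBytes("event"), value); // hypothetical family/qualifier
        return put;
    }
}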



check namenode, jobtracker, datanodes and tasktracker status

2011-07-08 Thread Marc Sturlese
Hey there,
I've written some scripts to check dfs disk space, the number of datanodes,
the number of tasktrackers, heap in use...
I'm on hadoop 0.20.2, and to do that I use the DFSClient and JobClient APIs.
I do things like:

JobClient jc = new JobClient(socketJT, conf);
ClusterStatus clusterStatus = jc.getClusterStatus(true);
clusterStatus.getTaskTrackers();
...
jc.close();
DFSClient client = new DFSClient(socketNN, conf);
DatanodeInfo[] dni = client.datanodeReport(DatanodeReportType.ALL);
...
client.close();

FileSystem fs = FileSystem.get(new URI("hdfs://" + host + "/"), conf);
fs.getStatus().getCapacity();
...
fs.close();

It is working well, but I'm worried it could be harmful for the cluster to
run the script continuously (resource consuming). Is it all right, for
example, to run it every 10 or 15 minutes? If not, what is a good practice
for monitoring the cluster?

Thanks in advance.




So many unexpected Lost task tracker errors making the job be killed

2011-05-09 Thread Marc Sturlese
Hey there, I have a small cluster running on 0.20.2. Everything is
fine, but once in a while, when a job with a lot of map tasks is
running, I start getting the error:
Lost task tracker: tracker_cluster1:localhost.localdomain/127.0.0.1:x
Before getting the error, the task attempt has been running for 7h
(when normally it takes 46 sec to complete). Sometimes another task
attempt is launched in parallel, takes 50 sec to complete, and so the
first one gets killed (the second one can even be launched on the same
task tracker and work). But in the end I get so many Lost task
tracker errors that the job gets killed.
The job ends up with some of the task trackers blacklisted.
If I kill the zombie tasks, remove the jobtracker and tasktracker pid
files, remove the userlogs and stop/start mapred, everything works
fine again, but some days later the error happens again.
Any idea why this happens? Could it somehow be related to having too
many attempt folders in the userlogs (even though there is space left
on the device)?
Thanks in advance.



check if a sequenceFile is corrupted

2011-03-17 Thread Marc Sturlese
Is there any way to check whether a SequenceFile is corrupted without
iterating over all its keys/values until it crashes?
I've seen that I can get an IOException when opening it, or an IOException
when reading the Xth key/value (depending on where it was corrupted).
Thanks in advance



Tasks seem to fail randomly with nonzero status of 1

2011-03-02 Thread Marc Sturlese
Hey there,
My cluster was working fine, but suddenly lots and lots of tasks started
failing like:

java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:472)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:459)

I restarted the whole cluster, but since it happened once, it keeps breaking
every time I run a job.
Any clue or advice?
Thanks in advance.



Re: Tasks seem to fail randomly with nonzero status of 1

2011-03-02 Thread Marc Sturlese
Well, I've been running these jobs for days. It's only been happening since
last night, and now even if I restart, the error keeps happening. I'm the
only one using the cluster.



Check lzo is working on intermediate data

2011-02-24 Thread Marc Sturlese

Hey there,
I am using hadoop 0.20.2. I've successfully installed LZO compression
following these steps:
https://github.com/kevinweil/hadoop-lzo

I have some MR jobs written with the new API, and I want to compress the
intermediate data.
I'm not sure whether my mapred-site.xml should have these properties:

  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

or:

  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

How can I check that the compression is being applied?

Thanks in advance
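
A minimal sketch of setting the same thing from job code, under the assumption (worth double-checking) that plain 0.20.2 reads the old-style property names shown in the first block:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.mapreduce.Job;

import com.hadoop.compression.lzo.LzoCodec;

// Sketch: enable LZO for intermediate (map output) data programmatically.
// Assumption: on 0.20.2 the old-style names below are the ones the framework reads.
public class LzoIntermediateConfig {
    public static Job newJob() throws Exception {
        Configuration conf = new Configuration();
        conf.setBoolean("mapred.compress.map.output", true);
        conf.setClass("mapred.map.output.compression.codec",
                      LzoCodec.class, CompressionCodec.class);
        return new Job(conf, "lzo-intermediate-test"); // hypothetical job name
    }
}

One way to verify it took effect is to open the job's job.xml from the JobTracker web UI and look for these keys.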



JobConf.setQueueName(xxx) with the new api using hadoop 0.20.2

2011-02-22 Thread Marc Sturlese

I'm trying to use the fair scheduler. I have jobs written using the new API
and hadoop 0.20.2.
I've seen that to associate a job with a queue you have to call:
JobConf.setQueueName()
The Job class of the new API doesn't have this method. How can I do that?
Thanks in advance.



Re: JobConf.setQueueName(xxx) with the new api using hadoop 0.20.2

2011-02-22 Thread Marc Sturlese

Thanks; to be exact, it's mapreduce.job.queuename:
http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html
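
A minimal sketch of wiring that up with the new API (hypothetical queue name), assuming that property name is honoured by the version in use:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: assign a new-API job to a scheduler queue via configuration.
public class QueueNameExample {
    public static Job newJob() throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.queuename", "research"); // hypothetical queue name
        return new Job(conf, "queued-job");
    }
}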


LocalDirAllocator and getLocalPathForWrite

2011-01-05 Thread Marc Sturlese

I have a doubt about how this works. The API documentation says that the
class LocalDirAllocator is "an implementation of a round-robin scheme for
disk allocation for creating files".
I am wondering: is the disk allocation done in the constructor?
Let's say I have a cluster of just 1 node with 4 disks, and inside a reducer
I do:

LocalDirAllocator localDirAlloc = new LocalDirAllocator("mapred.local.dir");
Path pathA = localDirAlloc.getLocalPathForWrite("a", conf);
Path pathB = localDirAlloc.getLocalPathForWrite("b", conf);

Will pathA and pathB for sure be on the same local disk, because the disk was
allocated by new LocalDirAllocator("mapred.local.dir")? Or is it
getLocalPathForWrite that picks the disk, so the two paths might not be on
the same disk (as I have 4 disks)?

Thanks in advance


Re: LocalDirAllocator and getLocalPathForWrite

2011-01-05 Thread Marc Sturlese

Hey Todd,

"LocalDirAllocator is an internal-facing API and you shouldn't be using it
from user code. If you write into mapred.local.dir like this, you will end
up with conflicts between different tasks running from the same node"

I know it's a bit of an odd usage, but the thing is that I need to create
files in the local file system, work with them there, and after that upload
them to hdfs (I use the OutputCommitter). To avoid the conflicts you talk
about, I create a folder which looks like mapred.local.dir/taskId/attemptId,
I work there, and apparently I am having no problems.

"and there isn't usually a good reason to write to multiple drives from
within a task"

When I said I had a cluster of one node, it was just to clarify my doubt
and explain the example. My cluster is actually bigger than that, and each
node has more than 1 physical disk. Having multiple tasks running at the
same time is what I do. I would like each task to write to just a single
local disk, but I don't know how to do it.

"The working directory of your MR task is already within one of the drives,"

Is there a way to get a working directory on the local disk from the
reducer?
Could I do something similar to:

FileSystem fs = FileSystem.get(conf);
LocalFileSystem localFs = FileSystem.getLocal(conf);
Path path = localFs.getWorkingDirectory();

I would appreciate it if you could tell me a bit more about this.
I need to deal with these files only locally, and want them copied to hdfs
only when I finish working with them.

Thanks in advance.
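
A minimal sketch of that last suggestion as I understand it (hypothetical file names and HDFS path): create the scratch file in the task's current working directory, which should already sit on one of the mapred.local.dir drives, and copy it to HDFS only when the work is done.

import java.io.File;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: scratch file in the task's cwd (one local drive), uploaded to HDFS when done.
public class LocalScratchSketch {
    public static void run(Configuration conf) throws Exception {
        File scratch = File.createTempFile("scratch-", ".dat", new File("."));
        // ... build / modify the local file here ...
        FileSystem hdfs = FileSystem.get(conf);
        hdfs.copyFromLocalFile(new Path(scratch.getAbsolutePath()),
                               new Path("/user/marc/scratch-output")); // hypothetical HDFS path
        scratch.delete();
    }
}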


Re: How to create hadoop-0.21.0-core.jar ?

2011-01-05 Thread Marc Sturlese

I was able to compile and build the commons .jar, but got errors building the
hdfs .jar (and so couldn't build the mapred .jar).
Here is the thread; no answers at the moment:
http://lucene.472066.n3.nabble.com/Building-hadoop-0-21-0-from-the-source-td1647589.html


SequenceFiles and streaming or hdfs thrift api

2011-01-04 Thread Marc Sturlese

Hey there,
I need to write a file to an HDFS cluster from PHP. I know I can do that
with the HDFS Thrift API:
http://wiki.apache.org/hadoop/HDFS-APIs

The thing is, I want this file to be a SequenceFile, where the key should be
a Text and the value a Thrift-serialized object. Is it possible to reach
that goal?
Thanks in advance
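
For comparison, a minimal Java-side sketch of the target file (hypothetical path; any Thrift-generated class implements TBase), writing Text keys and the Thrift-serialized bytes as BytesWritable values; the question is essentially whether the equivalent can be driven from PHP through the Thrift HDFS API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.thrift.TBase;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TBinaryProtocol;

// Sketch: SequenceFile with Text keys and Thrift-serialized values stored as raw bytes.
public class ThriftSequenceFileWriter {

    // 'record' would be an instance of any Thrift-generated class (they all implement TBase).
    public static void writeOne(Configuration conf, String key, TBase record) throws Exception {
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/user/marc/records.seq");   // hypothetical output path
        TSerializer serializer = new TSerializer(new TBinaryProtocol.Factory());
        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, out, Text.class, BytesWritable.class);
        try {
            byte[] bytes = serializer.serialize(record);
            writer.append(new Text(key), new BytesWritable(bytes));
        } finally {
            writer.close();
        }
    }
}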


small files and number of mappers

2010-11-29 Thread Marc Sturlese

Hey there,
I am doing some tests and wondering what the best practices are to deal
with very small files which are continuously being generated (1 MB or even
less).

I see that if I have hundreds of small files in hdfs, hadoop will
automatically create A LOT of map tasks to consume them. Each map task takes
10 seconds or less... I don't know if it's possible to change the number of
map tasks from java code using the new API (I know it can be done with the
old one). I would like to do something like NumMapTasksCalculatedByHadoop * 0.3.
This way, fewer map tasks would be instantiated and each would be working
for more time.

I have had a look at hadoop archives as well but don't think they can help me
here.

Any advice or similar experience?
Thanks in advance.




Re: How to Transmit and Append Indexes

2010-11-19 Thread Marc Sturlese

You could implement some scripts to send the index updates to the slave using
rsync. Do something similar to what Solr does:
http://wiki.apache.org/solr/CollectionDistribution

However, if what you want is a total merge of the indexes, you can do it
easily with lucene:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexWriter.html#addIndexesNoOptimize(org.apache.lucene.store.Directory...)
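
A minimal sketch of the merge route with the Lucene 3.0-style API (hypothetical local index paths):

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Sketch: merge two existing indexes into a target index with addIndexesNoOptimize.
public class MergeIndexes {
    public static void main(String[] args) throws Exception {
        Directory target = FSDirectory.open(new File("/data/index-merged"));   // hypothetical paths
        Directory shard1 = FSDirectory.open(new File("/data/index-shard1"));
        Directory shard2 = FSDirectory.open(new File("/data/index-shard2"));

        IndexWriter writer = new IndexWriter(target,
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.addIndexesNoOptimize(shard1, shard2);   // merges segments from the shards
        writer.optimize();                             // optional: collapse to fewer segments
        writer.close();
    }
}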





Re: MultipleInputs and org.apache.hadoop.mapred package in 0.20.2

2010-10-07 Thread Marc Sturlese

Thanks for the advice, I will do that. This way I will be able to use
MarkableIterator as well.


MultipleInputs and org.apache.hadoop.mapred package in 0.20.2

2010-10-06 Thread Marc Sturlese

I'm working with hadoop 0.20.2 using the new API contained in the package
org.apache.hadoop.mapreduce.

I have noticed that MultipleInputs is under org.apache.hadoop.mapred, and
when setting a path it asks for a JobConf:

addInputPath(JobConf conf, Path path, Class<? extends InputFormat> inputFormatClass, Class<? extends Mapper> mapperClass)

but JobConf is deprecated in 0.20.2 (so all my jobs are using Job instead of
JobConf).

I have noticed too that in 0.21.0 there's no such problem, as MultipleInputs
exists under org.apache.hadoop.mapreduce and when setting a path it asks for
a Job:

addInputPath(Job job, org.apache.hadoop.fs.Path path, Class<? extends InputFormat> inputFormatClass, Class<? extends Mapper> mapperClass)

I was actually using 0.21.0 but had to downgrade to 0.20.2, as 0.21.0 can't
report progress in the reduce context (and this feature is a must for me):
https://issues.apache.org/jira/browse/MAPREDUCE-1905

So, which would be the best way to use MultipleInputs? Should I change all my
code to use the org.apache.hadoop.mapred classes instead of the
org.apache.hadoop.mapreduce ones?

I am really confused here, thanks in advance
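
For reference, a minimal sketch of the old-API route (hypothetical paths; IdentityMapper is used for both inputs only to keep the sketch self-contained, normally each path would get its own old-API Mapper implementation):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.MultipleInputs;

// Sketch (old 0.20 API, hypothetical paths): MultipleInputs bound to a JobConf.
public class MultipleInputsOldApi {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MultipleInputsOldApi.class);
        conf.setJobName("multiple-inputs-sketch");
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        MultipleInputs.addInputPath(conf, new Path("/data/inputA"),
                                    TextInputFormat.class, IdentityMapper.class);
        MultipleInputs.addInputPath(conf, new Path("/data/inputB"),
                                    TextInputFormat.class, IdentityMapper.class);
        FileOutputFormat.setOutputPath(conf, new Path("/data/out"));

        JobClient.runJob(conf);
    }
}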


Re: can not report progress from reducer context with hadoop 0.21

2010-09-21 Thread Marc Sturlese

Thanks, I was going mad with this. It's working properly with 0.20.2.
Once the patch is completely done I will apply it, to be able to keep using
the MarkableIterator, as it simplifies many of my MapReduce jobs.


weird exception when running mapreduce jobs with hadoop 0.21.0

2010-09-16 Thread Marc Sturlese

Hey there,
I have set up a cluster which is supposed to work properly. I can add data
files and read them from a java app.
But when I execute a mapred job I am getting this exception:
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.util

Any idea why this is happening?
