Re: [ANN] lzo indexing

2010-08-31 Thread Torsten Curdt
Hey Todd, > The existing hadoop-lzo project doesn't use the C code in indexing, though I > think you're right that the classes will fail to initialize if the native > libraries aren't available. Well, it relies on the header reading of the codec. But frankly speaking I missed the fact that the re

Re: [ANN] lzo indexing

2010-08-31 Thread Todd Lipcon
Hi Torsten, The existing hadoop-lzo project doesn't use the C code in indexing, though I think you're right that the classes will fail to initialize if the native libraries aren't available. Could I encourage you to simply post a patch that fixes this rather than forking a new project? This code

[ANN] lzo indexing

2010-08-31 Thread Torsten Curdt
For those people using LZO compression: While I know there is http://github.com/kevinweil/hadoop-lzo, the native stuff makes it a bit of a hurdle. Especially if you are just running on Amazon Elastic MapReduce, it's way easier to just run this Java-only indexer instead. http://github.com/tcurd

Re: specify different number of mapper tasks for different machines

2010-08-31 Thread Vitaliy Semochkin
Thank you Sam, I'll give it a try. On Mon, Aug 30, 2010 at 4:39 PM, Vitaliy Semochkin wrote: > To tell the truth, I didn't understand Ted's proposal to solve the configuration wiping. > If you manage to make such a configuration work, please report :-) > > On Mon, Aug 30, 2010 at 3:59 PM, Shaojun