Re: Help in completion of academic project.

2013-05-07 Thread Harsh J
The moveFromLocal method is a simple utility that uses the FileSystem instance to upload a file onto HDFS and then deletes the source file from the local FS. Is your goal just to encrypt files uploaded via this utility method, or is the goal to encrypt all files present in HDFS globally? Adding more notes on
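For reference, a minimal sketch of such an upload-then-delete helper using the FileSystem API (the paths below are illustrative assumptions, not anyone's actual setup):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MoveFromLocalExample {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // moveFromLocalFile uploads the local file to HDFS, then deletes the local source
        fs.moveFromLocalFile(new Path("/tmp/input.txt"),       // local source (hypothetical)
                             new Path("/user/me/input.txt"));  // HDFS destination (hypothetical)
        fs.close();
      }
    }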

Re: no _SUCCESS file in MR output directory.

2013-05-07 Thread Harsh J
Good observation: Pig does seem to set a default of "false" where possible, to disable the _SUCCESS creation. I don't see Hive do that, nor any part of the stock Apache Hadoop MR jobs. Rahul - Do you use a Pig action in your WF? Also, are you definitively seeing _SUCCESS being created after you add the

DistributedCache does not seem to copy the HDFS files to local

2013-05-07 Thread YouPeng Yang
Hi All, I want to use the DistributedCache to perform a replicated join on the map side. My Java code refers to [1][2]. When I run the job, the file that I want to cache is not copied to the local dir of my DN, so a FileNotFoundException came out [3]. I also checked out the source code
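For context, the usual map-side replicated-join pattern with the old DistributedCache API looks roughly like this; the file layout and tab-separated parsing are assumptions for illustration, not the poster's actual code:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ReplicatedJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
      private final Map<String, String> lookup = new HashMap<String, String>();

      @Override
      protected void setup(Context context) throws IOException {
        // Files registered in the driver with
        //   DistributedCache.addCacheFile(new URI("/user/me/small.txt"), job.getConfiguration());
        // are localized on each node and listed here.
        Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        if (cached == null || cached.length == 0) {
          throw new IOException("Cache file was not localized"); // the symptom reported above
        }
        BufferedReader reader = new BufferedReader(new FileReader(cached[0].toString()));
        String line;
        while ((line = reader.readLine()) != null) {
          String[] parts = line.split("\t", 2);   // assumed tab-separated small table
          if (parts.length == 2) lookup.put(parts[0], parts[1]);
        }
        reader.close();
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        String[] parts = value.toString().split("\t", 2);
        String joined = lookup.get(parts[0]);     // join each input record against the cached table
        if (joined != null) context.write(new Text(parts[0]), new Text(joined));
      }
    }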

Re: no _SUCCESS file in MR output directory.

2013-05-07 Thread Rahul Bhattacharjee
No, we do not use any Pig. Yes, I am very sure that I am seeing the _SUCCESS file after enabling it manually. Thanks, Rahul On Tue, May 7, 2013 at 1:14 PM, Harsh J wrote: > Good observation: Pig does seem to set a default of "false" where possible, > to disable the _SUCCESS creation. I don't see Hive
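For anyone hitting the same thing: the marker is controlled by a single job property, mapreduce.fileoutputcommitter.marksuccessfuljobs (true by default in stock MapReduce; frameworks like Pig may set it to false on the jobs they submit). A minimal driver fragment forcing it on, assuming the new-API Job driver:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    // FileOutputCommitter writes _SUCCESS into the output dir when this is true
    conf.setBoolean("mapreduce.fileoutputcommitter.marksuccessfuljobs", true);
    Job job = new Job(conf, "my-job"); // hypothetical job name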

How to change default 1024 open files in Linux

2013-05-07 Thread Aditya exalter
How to change default 1024 open files in Linux

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Nitin Pawar
use ulimit On Tue, May 7, 2013 at 4:27 PM, Aditya exalter wrote: > How to change default 1024 open files in Linux > -- Nitin Pawar

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Jagat Singh
Quickly check http://hbase.apache.org/book/configuration.html#os Read about ulimit On Tue, May 7, 2013 at 8:57 PM, Aditya exalter wrote: > How to change default 1024 open files in Linux >
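In short, for a quick session-only check and bump, ulimit itself is enough (the 32768 below is just an example value); persistent changes need the limits.conf route discussed further down:

    # show the current per-process open-files limit
    ulimit -n
    # raise the soft limit for this shell session (cannot exceed the hard limit)
    ulimit -n 32768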

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Sandeep Nemuri
How do I increase it in Linux? On Tue, May 7, 2013 at 4:30 PM, Jagat Singh wrote: > Quickly check http://hbase.apache.org/book/configuration.html#os > Read about ulimit > On Tue, May 7, 2013 at 8:57 PM, Aditya exalter wrote: >> How to change default 1024 open files in Linux

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Aditya exalter
Just go to /etc/security/limits.conf and append: * soft nofiles 30 (your limit) * hard nofiles 827399 (your limit) Then log out and log back in, and check the changes with ulimit -a On Tue, May 7, 2013 at 4:36 PM, Sandeep Nemuri wrote: > How do I increase it > in Linux? > > > On

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Sandeep Nemuri
It's not working!! On Tue, May 7, 2013 at 4:41 PM, Aditya exalter wrote: > Just go to /etc/security/limits.conf > > and append > > * soft nofiles 30 (your limit) > * hard nofiles 827399 (your limit) > > Then log out and log back in, and check the changes with ulimit -a > > > On Tue,
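One likely culprit, worth flagging: the pam_limits item is spelled nofile, not nofiles, and each line takes domain/type/item/value columns. A corrected /etc/security/limits.conf sketch (the hdfs user and the values are assumptions; adjust to your setup):

    # /etc/security/limits.conf  --  <domain> <type> <item> <value>
    hdfs  soft  nofile  32768
    hdfs  hard  nofile  65536

The limit only applies to new login sessions, and some distros additionally require a "session required pam_limits.so" line in the relevant /etc/pam.d file before limits.conf is honored.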

Re: Hardware Selection for Hadoop

2013-05-07 Thread Michael Segel
I wouldn't go the route of multiple NICs unless you are using MapR. MapR allows you to do port bonding, or rather use both ports simultaneously. When you port bond, 1+1 != 2, and then you have some other configuration issues. (Unless they've fixed them.) If this is your first cluster... keep it s

Re: Hardware Selection for Hadoop

2013-05-07 Thread Michael Segel
I wouldn't. You end up with a 'Frankencluster', which could become problematic down the road. Ever try to debug a port failure on a switch? (It does happen, and it's a bitch.) Note that you say 'reliable'... older hardware may or may not be reliable or under warranty. (How many here build th

Possible to query the number of values as input of a reducer?

2013-05-07 Thread Han JU
Hi, Just a small question: is there a way to know how many values are associated with a key in a reducer, without using containers to store all of them? Thanks. -- JU Han, Software Engineer Intern @ KXEN Inc. UTC - Université de Technologie de Compiègne, GI06 - Fouille de Données et Déc
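There is no upfront count in the reduce API: the values arrive as a one-shot Iterable that cannot be rewound. The usual options are to count while streaming (if the count is only needed at the end), buffer everything, or run a prior counting job. A minimal streaming-count sketch, with assumed Text/IntWritable types:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class CountingReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        // Stream through once and count; anything that needs the count
        // before processing the values would have to buffer them instead.
        int count = 0;
        for (IntWritable v : values) {
          count++;
        }
        context.write(key, new IntWritable(count));
      }
    }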

RE: Bloom Filter analogy in SQL

2013-05-07 Thread Tony Burton
There’s an explanation of Bloom Filters and a MapReduce implementation in Chuck Lam’s book “Hadoop In Action”. That might guide the way a bit more. From: Sai Sai [mailto:saigr...@yahoo.in] Sent: 29 March 2013 16:37 To: user@hadoop.apache.org Subject: Re: Bloom Filter analogy in SQL Can so

Re: Bloom Filter analogy in SQL

2013-05-07 Thread Kai Voigt
http://billmill.org/bloomfilter-tutorial/ has a nice explanation, and it links to the Wiki entry for further theory. On 07.05.2013 at 16:56, Tony Burton wrote: > There’s an explanation of Bloom Filters and a MapReduce implementation in > Chuck Lam’s book “Hadoop In Action”. That might gu
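Hadoop also ships its own Bloom filter implementation, which is handy for experimenting; the sizes below are arbitrary example parameters:

    import org.apache.hadoop.util.bloom.BloomFilter;
    import org.apache.hadoop.util.bloom.Key;
    import org.apache.hadoop.util.hash.Hash;

    public class BloomFilterExample {
      public static void main(String[] args) {
        // bit vector of 10000 bits, 8 hash functions, murmur hashing
        BloomFilter filter = new BloomFilter(10000, 8, Hash.MURMUR_HASH);
        filter.add(new Key("alice".getBytes()));

        // membershipTest never gives false negatives, but may give false positives
        System.out.println(filter.membershipTest(new Key("alice".getBytes())));   // true
        System.out.println(filter.membershipTest(new Key("mallory".getBytes()))); // usually false
      }
    }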

RE: How to balance reduce job

2013-05-07 Thread Tony Burton
The typical Partitioner method for assigning reducer r from reducers R is r = hash(key) % count(R). However, if you find your partitioner is assigning your data to too few reducers (or just one), I found that changing count(R) to the next odd number or (even better) prime number above count(R) is a
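That formula is essentially what the stock HashPartitioner does; a sketch of the same logic as a custom Partitioner, with the prime-count tip applied in the driver (the Text/IntWritable types and the value 31 are illustrative assumptions):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class ModuloPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        // r = hash(key) % count(R); the mask keeps the result non-negative
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }

    // In the driver, picking a prime reducer count per the tip above:
    // job.setNumReduceTasks(31);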

Re: Hardware Selection for Hadoop

2013-05-07 Thread Ted Dunning
On Tue, May 7, 2013 at 5:53 AM, Michael Segel wrote: > While we have a rough metric on spindles to cores, you end up putting a > stress on the disk controllers. YMMV. > This is an important comment. Some controllers fold when you start pushing too much data. Testing nodes independently before i

Re: How to balance reduce job

2013-05-07 Thread shashwat shriparv
The number of reducers running depends on the data available. Thanks & Regards, ∞ Shashwat Shriparv On Tue, May 7, 2013 at 8:43 PM, Tony Burton wrote: > The typical Partitioner method for assigning reducer r from reducers R is > > r = hash(key) % count(R) >

Re: How to change default 1024 open files in Linux

2013-05-07 Thread Dhanasekaran Anbalagan
Hi Sandeep, you can try this path:
    root@dv150:~# cat /etc/security/limits.d/
    hbase.nofiles.conf  hdfs.conf  mapred.conf  mapreduce.conf  yarn.conf
    root@dv150:~# cat /etc/security/limits.d/hdfs.conf
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # con

Need help on coverting Audio files to text

2013-05-07 Thread Sanjeevv Sriram
Hi Users, Please let me know how I can convert audio files to text files using Hadoop technologies. I want to do some analysis on recorded audio files. Regards, Sanjeevv

Re: Need help on coverting Audio files to text

2013-05-07 Thread Tom Deutsch
Accuracy of conversion is the key issue here. We (IBM) use things from our Research labs to do this, but in any case getting an accurate transcription is nontrivial if accuracy matters (and it usually does). From: Sanjeevv Sriram To: user@hadoop.apache.org, Date: 05/07/2013 10:31 AM Sub

problem building lzo

2013-05-07 Thread kaveh minooie
Hi everyone, I am trying to follow this tutorial: https://wiki.apache.org/hadoop/UsingLzoCompression and I am getting an error I don't know how to solve. I have CFLAGS='-m64' set and run this: CLASSPATH=/hadoop/hadoop-core-1.1.1.jar ant compile-native and I get this: compile-java: [javac] /s

Good White Paper - Hadoop Hive & ETL

2013-05-07 Thread Raj Hadoop
Hi, This is an interesting article on Hadoop, Hive & ETL: http://dbtr.cs.aau.dk/DBPublications/DBTR-31.pdf Has anyone used this kind of framework? Please advise. Thanks, Raj

Benefits of Hadoop Distributed Cache

2013-05-07 Thread Saeed Shahrivari
Would you please tell me why we should use the Distributed Cache instead of HDFS? HDFS seems more stable, easier to use, and less error-prone. Thanks in advance.

Re: Benefits of Hadoop Distributed Cache

2013-05-07 Thread Michael Segel
Not sure what you mean... If you want to put up a small file to be used by each task in your job (mapper or reducer)... you could put it up on HDFS. Or if you're launching your job from an edge node, you could read in the small file and put it into the distributed cache. It really depends o

get recent changed files in hadoop

2013-05-07 Thread Winston Lin
Any idea how to get recently changed files in Hadoop? E.g. files created yesterday? fs -ls will only give us all the files. Thanks Winston

Re: get recent changed files in hadoop

2013-05-07 Thread Rahul Bhattacharjee
Is any such option available in other POSIX shells? On Wednesday, May 8, 2013, Winston Lin wrote: > Any idea how to get recently changed files in Hadoop? E.g. files created > yesterday? > > fs -ls will only give us all the files. > > Thanks > Winston > -- Sent from Gmail Mobile

Re: get recent changed files in hadoop

2013-05-07 Thread Mohammad Tariq
I don't think any such thing is available OOTB. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Wed, May 8, 2013 at 8:51 AM, Rahul Bhattacharjee wrote: > Is any such option available in other POSIX shells? > > > On Wednesday, May 8, 2013, Winston Lin wrote: > >> Any idea

Re: get recent changed files in hadoop

2013-05-07 Thread Winston Lin
Looks like we cannot even sort the output of ls by date with the fs command? On *ux systems, we can do ls -t ... to sort by modification time, newest first. Winston On Wed, May 8, 2013 at 1:47 PM, Mohammad Tariq wrote: > I don't think any such thing is available OOTB. > > Warm Regards, > Tariq > h

Re: get recent changed files in hadoop

2013-05-07 Thread Jean-Marc Spaggiari
You can still parse the hadoop ls output with bash and sort it (reverse, cut, sort, etc.), but that will read all the entries, not just the first x... 2013/5/7 Winston Lin : > Looks like we cannot even sort the output of ls by date with the fs command? > > On *ux systems, we can do ls -t ... to sort
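Programmatically this is straightforward, since FileStatus carries the modification time; a sketch that prints the contents of a (hypothetical) directory newest-first:

    import java.util.Arrays;
    import java.util.Comparator;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RecentFiles {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus[] files = fs.listStatus(new Path("/user/me")); // hypothetical dir
        // Sort newest first by modification time (epoch millis)
        Arrays.sort(files, new Comparator<FileStatus>() {
          public int compare(FileStatus a, FileStatus b) {
            long d = b.getModificationTime() - a.getModificationTime();
            return d < 0 ? -1 : (d > 0 ? 1 : 0);
          }
        });
        for (FileStatus f : files) {
          System.out.println(f.getModificationTime() + "\t" + f.getPath());
        }
        fs.close();
      }
    }

On the command line, piping hadoop fs -ls /path through sort -k6,7 -r (sorting on the date/time columns) gets roughly the same result, at the cost of listing everything first, as noted above.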

Kerberos -- Checksum failed -- HDFS in HA mode

2013-05-07 Thread Brahma Reddy Battula
Hi all, Please help me with the following issue. I started the cluster in HA mode after configuring Kerberos. ENV details:
    **.200.68   - KDC machine
    **.195.165  - NN1 (Active), DN1
    ***.195.178 - NN2 (Standby), DN2
Created the principals and keytabs for both nodes like