lzo testing problem in hadoop cluster

2013-07-22 Thread ch huang
Hi all, I use CDH 4.3 with YARN, so the MapReduce port is 8088, not 9001. Why does the following command still try to connect to port 9001? Can anyone help? # hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-0.4.15.jar com.hadoop.compression.lzo.DistributedLzoIndexer -jt

Re: lzo testing problem in hadoop cluster

2013-07-22 Thread 李洪忠
After HDFS 1.x, using snappy or lzw is better than using lzo. On 2013/7/22 15:58, ch huang wrote: 13/07/22 15:51:25 INFO lzo.LzoCodec: Successfully loaded initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]

test lzo problem in hadoop

2013-07-22 Thread ch huang
Can anyone help? # sudo -u hdfs hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-0.4.15.jar com.hadoop.compression.lzo.DistributedLzoIndexer /alex/test_lzo/sqoop-1.99.2-bin-hadoop200.tar.gz.lzo 13/07/22 16:33:50 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library 13/07/22 16:33:50 INFO lzo.LzoCodec:

Re: Data Import via Sqoop from Postgresql to HDFS

2013-07-22 Thread Fatih Haltas
Hi Jarek, Thanks for your help. I am using Sqoop 1.4.3, but the --schema option did not work for me. On Sun, Jul 21, 2013 at 6:54 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Hi Fatih, The list-database tool seems to be working with only one schema at the time. You can specify extra

Re: Data Import via Sqoop from Postgresql to HDFS

2013-07-22 Thread Fatih Haltas
Is there a --schema option for listing schemas other than public? I am not able to list tables under non-public schemas. On Mon, Jul 22, 2013 at 12:39 PM, Fatih Haltas fatih.hal...@nyu.edu wrote: Hi Jarek, Thanks for your help. But I am using sqoop 1.4.3 but --schema

RE

2013-07-22 Thread sri harsha
Hi all, can someone post interview questions on Hadoop? -- amiable harsha

Listing specific schema

2013-07-22 Thread Fatih Haltas
Finally, I found the emails between Jarek and Vantesh. The correct usage is: sqoop import --connect jdbc:postgresql://192.168.194.158:5432/pgsql --username pgsql --password XXX -- --schema fatih From what I read in the emails, it was a bug that Vantesh then fixed. Thanks.

Re: test lzo problem in hadoop

2013-07-22 Thread Sandeep Nemuri
Try this command: hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-cdh4-0.4.15-gplextras.jar com.hadoop.compression.lzo.LzoIndexer /user/sample.txt.lzo On Mon, Jul 22, 2013 at 2:08 PM, ch huang justlo...@gmail.com wrote: anyone can help? # sudo -u hdfs hadoop jar

Re: RE

2013-07-22 Thread shashwat shriparv
Do you really think this is the place to get interview questions? Do the following: search www.google.com for hadoop+interview+questions and you will get a lot of links. *Thanks & Regards* ∞ Shashwat Shriparv On Mon, Jul 22, 2013 at 2:54 PM, sri harsha rsharsh...@gmail.com wrote: Hi all, can some one

Re: Loading 1st 100 records

2013-07-22 Thread Shahab Yunus
Any error messages, details, or logs would be helpful in advising. Also, you say you are loading FROM Teradata. Where are you loading TO? How do HDFS (and the 100 files on it) come into the picture? Regards, Shahab On Mon, Jul 22, 2013 at 12:06 PM, suneel hadoop

Loading 1st 100 records

2013-07-22 Thread suneel hadoop
Hi All, We have a strange problem. We are trying to load data from Teradata using Sqoop, and we have 100 part files in the HDFS location. While running, it loads the first 100 records of the first part file and then fails. What could be the problem? Thanks in advance. Thanks, Suneel

Re: Data Import via Sqoop from Postgresql to HDFS

2013-07-22 Thread Jarek Jarcec Cecho
Hi Fatih, the --schema extra parameter was added in 1.4.3 via SQOOP-601. It is, however, an extra parameter and not a normal Sqoop parameter. As a result, it needs to be specified in the extra argument section of the command line. For example: sqoop list-tables --connect ... -- --schema
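
The argument placement described above can be sketched as follows. This is only an illustration: the connection string, user name, and schema name are hypothetical placeholders, and the script prints the assembled command for review instead of executing it.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own values.
CONNECT="jdbc:postgresql://db.example.com:5432/mydb"
DB_USER="dbuser"
SCHEMA="myschema"

# Normal Sqoop arguments come first. The bare "--" ends them, and
# everything after it is treated as an extra (connector-specific) argument.
# -P asks Sqoop to prompt for the password instead of taking it inline.
SQOOP_CMD="sqoop list-tables --connect $CONNECT --username $DB_USER -P -- --schema $SCHEMA"

# Print the command for review instead of running it.
echo "$SQOOP_CMD"
```

Placing --schema before the bare "--" would make Sqoop parse it as a normal argument, which is the situation the message above warns against.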

setting mapred.task.timeout programmatically from client

2013-07-22 Thread Eugene Koifman
Hello, is there a way to set mapred.task.timeout programmatically from client? Thank you

Re: Parameter 'yarn.nodemanager.resource.cpu-cores' does not work

2013-07-22 Thread Sandy Ryza
Hi Sam, LinuxResourceCalculatorPlugin and DominantResourceCalculator control separate things. The former is for a NodeManager to calculate the resource usage of a container process so that it can kill it if it gets too large. The latter is used by the Capacity Scheduler to allocate containers,

Re: Parameter 'yarn.nodemanager.resource.cpu-cores' does not work

2013-07-22 Thread sam liu
Hi Sandy, Thanks for your detailed explanation! But it is still not very clear to me. In my current cluster with Hadoop-2.0.3-alpha, how can I make the properties 'yarn.nodemanager.resource.cpu-cores' and 'yarn.nodemanager.vcores-pcores-ratio' work? Or do they only work in 2.1.0-beta?

RE: setting mapred.task.timeout programmatically from client

2013-07-22 Thread Devaraj k
'mapred.task.timeout' is a deprecated configuration property. You can use the 'mapreduce.task.timeout' property instead. You can set it when submitting the job using the org.apache.hadoop.conf.Configuration.setLong(String name, long value) API on the Configuration or JobConf. Thanks Devaraj k

Re: setting mapred.task.timeout programmatically from client

2013-07-22 Thread Harsh J
Yes, you can set it into your Job configuration object in code. If your driver uses the Tool framework, then you can also pass a -Dmapred.task.timeout=value CLI argument when invoking your program. On Tue, Jul 23, 2013 at 4:24 AM, Eugene Koifman ekoif...@hortonworks.com wrote: Hello, is there a
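
A minimal sketch of the command-line route, assuming a driver that uses the Tool framework. The jar name, driver class, and paths are hypothetical, and the 20-minute value is only an example. The property takes milliseconds, and the sketch uses the non-deprecated name 'mapreduce.task.timeout'. The command is printed rather than executed.

```shell
#!/bin/sh
# Example timeout of 20 minutes, expressed in milliseconds.
TIMEOUT_MS=$((20 * 60 * 1000))

# Hypothetical jar, driver class, and paths. Generic options such as -D
# go before the application's own arguments.
HADOOP_CMD="hadoop jar my-job.jar com.example.MyDriver -Dmapreduce.task.timeout=$TIMEOUT_MS /input /output"

# Print the command for review instead of running it.
echo "$HADOOP_CMD"
```

If the driver does not go through the Tool framework, the -D option is not parsed, and the value has to be set on the job configuration in code instead.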

Re: setting mapred.task.timeout programmatically from client

2013-07-22 Thread Eugene Koifman
Thank you both. On Mon, Jul 22, 2013 at 8:16 PM, Devaraj k devara...@huawei.com wrote: 'mapred.task.timeout' is deprecated configuration. You can use 'mapreduce.task.timeout' property to do the same. You could set this configuration while submitting the Job using

Re: setting mapred.task.timeout programmatically from client

2013-07-22 Thread Balamurali
Hi, I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0. I created a table in HBase, inserted records, and am processing the data using Hive. I have to show a graph with some points (7 for 7 days, or 12 for one year). The records for one day may number from 1000 to lakhs. I need to show the average of these 1000 to lakhs