Reading files from hdfs directory

2014-03-13 Thread Satyam Singh
Hello, I want to read files from HDFS remotely through the camel-hdfs client. I have modified the camel-hdfs component to support Hadoop 2.2.0. I checked that the file I want to read exists on HDFS: [hduser@bl460cx2425 ~]$ hadoop fs -ls /user/hduser/collector/test.txt 14/03/13 09:13:31 WARN util

RE: Reading files from hdfs directory

2014-03-13 Thread Vinayakumar B
Hi Satyam, Check whether your Camel client-side configuration points to the correct NameNode(s). What is the deployment: HA or non-HA? Also check whether the same exception appears in the (active) NameNode logs. If it does not, the request is going to some other NameNode. Regards, Vinayakumar B.
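
One way to act on the suggestion above is to print the NameNode URI that the client-side configuration actually resolves to. The following is only a sketch: it assumes the client picks up a Hadoop-style core-site.xml, and the sample file content, host name, and port are hypothetical. fs.defaultFS is the Hadoop 2 property that names the default file system.

```python
import xml.etree.ElementTree as ET

# Hypothetical client-side core-site.xml content; host and port are made up.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def get_property(xml_text, key):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None


# The URI printed here is the NameNode the client would talk to by default.
default_fs = get_property(SAMPLE_CORE_SITE, "fs.defaultFS")
print(default_fs)
```

Comparing the printed URI with the address of the active NameNode shows whether requests could be going to a different node.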

verbose output

2014-03-13 Thread Mahmood Naderan
Hi, Is there a verbosity flag for the hadoop and mahout commands? I cannot find one on the command line. Regards, Mahmood

Re: Use Cases for Structured Data

2014-03-13 Thread Dieter De Witte
The sandbox is just meant to be a learning environment, I guess, to see what's possible and how things can be connected. The real distribution will have much higher performance and is the one you need when you want to investigate performance issues. The only real drawback of the real distributions is that

Re: verbose output

2014-03-13 Thread Mahmood Naderan
The hadoop-2.3.0/log is empty when I run a mahout command that uses hadoop. Regards, Mahmood On Thursday, March 13, 2014 12:53 PM, Sebastian Schelter wrote: To my knowledge, there is no such flag for mahout. You can check hadoop's logs for further information however. On 03/13/2014 10:21

Re: Solving "heap size error"

2014-03-13 Thread Mahmood Naderan
The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops at chunk #571 (571 * 64 MB = 36.5 GB). I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout. I have tested various parameters on 16GB RAM mapred.map.child.java.opts -Xmx2048m mapred.reduce.child
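
The arithmetic in the message can be checked with a short sketch. It assumes the 64 in "571*64" is a chunk size in megabytes, which the post implies but does not state; the variable names are illustrative only.

```python
CHUNK_MB = 64      # assumed chunk size in MB (implied by "571*64" in the post)
CHUNK_INDEX = 571  # chunk number at which the process stops

total_mb = CHUNK_INDEX * CHUNK_MB
print(total_mb)         # 36544
print(total_mb / 1000)  # 36.544  (decimal GB, matches the 36.5GB in the post)
print(total_mb / 1024)  # 35.6875 (binary GiB)
```

Either way, the data processed before the stall is more than twice the 16GB of RAM mentioned in the message.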

Streaming a subset of HBase data

2014-03-13 Thread Ian Brooks
Hi, I'm trying to use hadoop-streaming-2.2.0.jar to export a subset of data (a time range) to a mapper and reducer application written in another language. However, I have been unable to get anything other than all the data from the HBase table. Looking at the code and forums, it se

Re: Use Cases for Structured Data

2014-03-13 Thread ados1...@gmail.com
okies, thank you D, i will start playing around with the Sandbox version. On Thu, Mar 13, 2014 at 5:55 AM, Dieter De Witte wrote: > Sandbox is just meant to be a learning environment i guess, to see what's > possible, how things can be connected. The real distribution will have much > higher

Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Hello Team, I am initiating a POC to see the value of having hadoop in our architecture, and after discussing my current scenario with experts here, I think it would be better for me to start with the sandbox version rather than an actual distribution from a POC point of view. My query here is how t

RE: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread Martin, Nick
Hi Andy, Generally speaking, the folks participating on this list avoid questions of distribution preference. There are, perhaps obviously, both minor and significant differences in distributions that you should research and evaluate to find the best fit for your organization's strategy. Asking

Re: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Thank you Martin. I will make sure that I do not ask vendor-specific questions on this forum. But since I am starting out with Hadoop, I wanted to learn what the key things are that we have to keep in mind while deciding which distribution to take...open source hadoop, mapr m7, hortonworks

Hbase create table error

2014-03-13 Thread Manish
Hi All, Below are the error details that I am getting when creating tables in HBase. All the services are running fine. hbase(main):001:0> create 't1', 'cf1' *ERROR: java.lang.NoClassDefFoundError: org/apache/hadoop/security/authentication/util/KerberosName* Here is some help for this comma

RE: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread Martin, Nick
Start here http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support The list of things you might consider before picking a distribution is quite likely limited only by one's imagination. So, start with the basics like hosted vs. in-house, what your use case(s) cover, etc. Basica

Re: Solving "heap size error"

2014-03-13 Thread Mahmood Naderan
I am pretty sure that there is something wrong with hadoop/mahout/java. With any configuration, it gets stuck at chunk #571. Previous chunks are created rapidly, but I see it wait for about 30 minutes on 571, and that is the reason for the heap size error. I will try to submit a bug report.   Regards

Pig with Tez

2014-03-13 Thread Viswanathan J
Hi, Will Apache Pig run with Tez?

Re: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Thanks Nick, appreciate your inputs on this. On Thu, Mar 13, 2014 at 12:51 PM, Martin, Nick wrote: > Start here > http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support > > > > The list of things you might consider before picking a distribution is > quite likely limited only

Re: Pig with Tez

2014-03-13 Thread Kim Chew
Google is your friend: http://hortonworks.com/hadoop/tez/ Kim On Thu, Mar 13, 2014 at 12:16 PM, Viswanathan J wrote: > Hi, > > Will Apache Pig run with Tez? >

ResourceManager shutting down

2014-03-13 Thread John Lilley
We have this erratic behavior where every so often the RM will shut down with an UnknownHostException. The odd thing is, the host it complains about has been in use for days at that point without problems. Any ideas? Thanks, John 2014-03-13 14:38:14,746 INFO rmapp.RMAppImpl (RMAppImpl.java:ha

Reg: Setting up Hadoop Cluster

2014-03-13 Thread ados1...@gmail.com
Hello Team, I have one question regarding putting data into HDFS and running mapreduce on data present in HDFS. 1. HDFS is a file system, so what kinds of clients are available to interact with it? Also, where do we need to install those clients? 2. regarding pig, hive and mapreduce, where

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread Geoffry Roberts
Andy, Once you have hadoop running, you can run your jobs from the CLI of the name node. When I write a map reduce job, I jar it up, place it in, say, my home directory, and run it from there. I do the same with pig scripts. I've used neither hive nor cascading, but I imagine they would work

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread ados1...@gmail.com
Thank you Geoffry, I have some fundamental questions here. 1. Once I have installed Hadoop, how can I identify which node is the master node and which are slaves? 2. My understanding is that the master node is by default the namenode and slave nodes are data nodes, correct? 3. So i installed hadoop

RE: ResourceManager shutting down

2014-03-13 Thread John Lilley
Never mind... we figured out its DNS entry was going missing. john From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Thursday, March 13, 2014 2:52 PM To: user@hadoop.apache.org Subject: ResourceManager shutting down We have this erratic behavior where every so often the RM will shutdown w
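
A missing DNS entry like the one described above can be spotted with a small resolution check run periodically against the cluster hosts. This is a minimal sketch using only the Python standard library; the function names and the host list are hypothetical and are not part of Hadoop or YARN.

```python
import socket


def resolves(hostname):
    """Return True if the local resolver can currently resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def unresolved(hostnames):
    """Return the hostnames from the given list that do not resolve right now."""
    return [host for host in hostnames if not resolves(host)]


# Hypothetical usage: replace the list with the cluster's real node names.
print(unresolved(["127.0.0.1"]))
```

Logging the result of unresolved() with a timestamp makes it easier to line up an intermittent DNS failure with a ResourceManager shutdown.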

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread Geoffry Roberts
Did you not populate the "slaves" file when you did your installation? In older versions of hadoop (< 2.0), there was a "master" file where you entered your name node. Nowadays there are multiple name nodes. I haven't worked with them as of yet. I installed pig, for example, on my name node an

Apache Tez supporting pig version

2014-03-13 Thread Viswanathan J
Hi, Which Pig version supports Apache Tez? Will Pig 0.12 support Tez, or will it be v0.14, which is yet to be released? Please help.

Re: Hadoop2.x reading data

2014-03-13 Thread Viswanathan J
Thanks Harsh. On Mar 11, 2014 11:19 PM, "Harsh J" wrote: > This is a Pig problem, not a "Hadoop 2.x" one - can you please ask it > at u...@pig.apache.org? You may have to subscribe to it first. > > On Tue, Mar 11, 2014 at 1:03 PM, Viswanathan J > wrote: > > Hi, > > > > I'm currently trying to us

RE: NodeManager health Question

2014-03-13 Thread Rohith Sharma K S
Hi, As troubleshooting steps, here are a few things you can verify: 1. Check the RM web UI at http://< yarn.resourcemanager.webapp.address>/cluster for whether there are any "Active Nodes" in the YARN cluster. Also check for "Lost Nodes", "Unhealthy Nodes", or "Rebooted Nodes". If there any active

Re: Pig with Tez

2014-03-13 Thread Hitesh Shah
Pig-on-Tez is a work in progress. You will get more info on the current status on the pig mailing lists. Additional details on the umbrella jira: https://issues.apache.org/jira/browse/PIG-3446 thanks -- Hitesh On Mar 13, 2014, at 12:16 PM, Viswanathan J wrote: > Hi, > > Is that apache pig

Re: ResourceManager shutting down

2014-03-13 Thread Hitesh Shah
Hi John, Would you mind filing a jira with more details? The RM going down just because a host was not resolvable or DNS timed out is something that should be addressed. thanks -- Hitesh On Mar 13, 2014, at 2:29 PM, John Lilley wrote: > Never mind… we figured out its DNS entry was going missin

Re: ResourceManager shutting down

2014-03-13 Thread Jian He
Which Hadoop version are you running? This should have been fixed recently. Jian On Thu, Mar 13, 2014 at 8:33 PM, Hitesh Shah wrote: > Hi John > > Would you mind filing a jira with more details. The RM going down just > because a host was not resolvable or DNS timed out is something that should > be

RE: ResourceManager shutting down

2014-03-13 Thread Rohith Sharma K S
Hi Hitesh, Yes, it is an issue. It is handled in https://issues.apache.org/jira/i#browse/YARN-713, which fixes the DNS issue. The fix is available in hadoop-2.4 (unreleased). Thanks & Regards Rohith Sharma K S -Original Message- From: Hitesh Shah [mailto:hit...@apache.org] Sent: 14 Marc

Difference between FILE_Bytes_READ vs HDFS_Bytes_Read.

2014-03-13 Thread Sai Sai
Can someone please help: 1. What is the difference between FILE_Bytes_READ and HDFS_Bytes_Read? Thanks Sai

fault tolerance question.

2014-03-13 Thread Sai Sai
Let's say the client is writing the first block to the first DataNode and the node fails. What will happen now? Will the client or the NN do something about it? Thanks Sai

Is hdinsights a C# version of hadoop or is it in java.

2014-03-13 Thread Sai Sai
Is hdinsights a C# version of hadoop or is it in java? Please let me know. Thanks Sai

Re: Is hdinsights a C# version of hadoop or is it in java.

2014-03-13 Thread Marco Shaw
It is based on Java (it uses Hortonworks); however, Microsoft provides a .NET SDK: http://hadoopsdk.codeplex.com Marco > On Mar 14, 2014, at 2:32 AM, Sai Sai wrote: > > Is hdinsights a C# version of hadoop or is it in java. > Please let me know. > Thanks > Sai