-in theory this should work-
Find the part of the Hadoop code that calculates the number of cores and patch
it to always return one. [?]
On Wed, Jan 29, 2014 at 3:41 AM, Keith Wiley wrote:
> Yeah, it isn't, not even remotely, but thanks.
>
> On Jan 28, 2014, at 14:06 , Bryan Beaudreault wrote:
>
>
Hello,
I downloaded the latest stable hadoop release from the mirrors as a tarball:
hadoop-2.2.0.tar.gz
Then extracted the files with Archive Manager (on Ubuntu 12.10)
There are no install docs in the top level and no documentation directory.
Then, the "Getting Started" links on http://hadoop.a
Hi All ,
I ran a job with 1 Map and 1 Reducer (
mapreduce.job.reduce.slowstart.completedmaps=1 ). The Map failed (because of
an error in the Mapper implementation), but the Reducers were still launched
by the ApplicationMaster. These Reducers were then killed by the ApplicationMaster
while stopping RMCommunicato
Are you calling one command per file? That's bound to be slow as it invokes
a new JVM each time.
On Jan 29, 2014 7:15 AM, "Jay Vyas" wrote:
> I'm finding that "hadoop fs -put" on a cluster is quite slow for me when I
> have large amounts of small files... much slower than native file ops.
> Note t
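If the cost really is JVM startup per invocation, one `hadoop fs -put` call can take many sources at once (`hadoop fs -put file1 file2 ... <dst>`, or put a whole directory). A runnable sketch of the batching idea, using plain `cp` as a stand-in so it works without a cluster:

```shell
# Create some small files to move (stand-in for the local sources).
src=$(mktemp -d); dst=$(mktemp -d)
for i in 1 2 3 4 5; do echo "data $i" > "$src/f$i"; done

# Slow pattern: one process launch per file
# (with hadoop this pays JVM startup each time: hadoop fs -put "$f" /dst).
for f in "$src"/*; do cp "$f" "$dst/"; done

# Faster pattern: a single invocation with many sources
# (analogous to: hadoop fs -put "$src"/* /dst).
rm -f "$dst"/*
cp "$src"/* "$dst/"

ls "$dst" | wc -l   # → 5
```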
Maybe it's inode exhaustion:
'df -i' can tell you more.
On Mon, Jan 27, 2014 at 12:00 PM, John Lilley wrote:
> I've found that the error occurs right around a threshold where 20 tasks
> attempt to open 220 files each. This is ... slightly over 4k total files
> open.
>
> But that's the t
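Inode exhaustion is easy to rule out before chasing file-handle limits; `df -i` reports inode usage per mount:

```shell
# Show inode columns (Inodes, IUsed, IFree, IUse%) for the root filesystem.
# IUse% near 100% means the fs is out of inodes even if `df -h` shows free space.
df -i /
```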
I'm finding that "hadoop fs -put" on a cluster is quite slow for me when I
have large amounts of small files... much slower than native file ops.
Note that I'm using the RawLocalFileSystem as the underlying backing
filesystem that is being written to in this case, so HDFS isn't the issue.
I see that
Yeah, it isn't, not even remotely, but thanks.
On Jan 28, 2014, at 14:06 , Bryan Beaudreault wrote:
> If this cluster is being used exclusively for this goal, you could just set
> the mapred.tasktracker.map.tasks.maximum to 1.
>
>
> On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley wrote:
> I'm ru
OK - I set up a ResourceManager node with a bunch of NodeManager slaves.
The set up is as follows:
HDFS: machine X is a Name node, it has 16 slaves (IPs: x.x.x.200-215)
Resources: machine Y is a Resource manager, it has 16 of the same slaves
(IPs: x.x.x.200-215) as Node manager slaves.
If I sta
If this cluster is being used exclusively for this goal, you could just set
the mapred.tasktracker.map.tasks.maximum to 1.
On Tue, Jan 28, 2014 at 5:00 PM, Keith Wiley wrote:
> I'm running a program which in the streaming layer automatically
> multithreads and does so by automatically detecting
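For an MRv1 tasktracker, that setting would go in mapred-site.xml on each worker node (property name as given above; the tasktracker needs a restart for it to take effect):

```xml
<!-- mapred-site.xml: cap each tasktracker at one concurrent map task -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```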
I'm running a program which in the streaming layer automatically multithreads
and does so by automatically detecting the number of cores on the machine. I
realize this model is somewhat in conflict with Hadoop, but nonetheless, that's
what I'm doing. Thus, for even resource utilization, it wou
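Rather than patching Hadoop's core detection, the streaming program itself could honor an environment override, defaulting to the machine's core count. A small sketch (the variable name `STREAM_THREADS` is made up for illustration):

```shell
# Use an explicit override if set; otherwise detect cores with nproc.
THREADS="${STREAM_THREADS:-$(nproc)}"
echo "using $THREADS worker thread(s)"

# Under Hadoop, each task could be launched with STREAM_THREADS=1 so one
# task uses one core regardless of what the machine reports.
STREAM_THREADS=1
THREADS="${STREAM_THREADS:-$(nproc)}"
echo "using $THREADS worker thread(s)"   # → using 1 worker thread(s)
```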
Hi Serge,
I'm using Apache hadoop distribution.
On Jan 29, 2014 12:54 AM, "Serge Blazhievsky" wrote:
> Which hadoop distribution are you using?
>
>
> On Tue, Jan 28, 2014 at 10:04 AM, Viswanathan J <
> jayamviswanat...@gmail.com> wrote:
>
>> Hi Guys,
>>
>> I'm running hadoop 2.2.0 version with p
Response inline...
On Tue, Jan 28, 2014 at 10:04 AM, Anfernee Xu wrote:
> Hi,
>
> Based on
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/Federation.html#Key_Benefits,
> the overall performance can be improved by federation, but I'm not sure
> federation address my userc
Thanks for sharing this, as we had the same problem and are wrestling
with similar errors. I am starting to think that there is something overly
difficult about pig/hadoop 2.x deployment, related to which version of pig
you use.
chelsoo has helped us resolve our issue by pointing us to
https://iss
Furthermore, what is the difference between a ResourceManager node and a
NodeManager node?
Ognen
On Tue, Jan 28, 2014 at 1:22 PM, Ognen Duzlevski
wrote:
> Hello,
>
> I have set up an HDFS cluster by running a name node and a bunch of data
> nodes. I ran into a problem where the files are only st
Which hadoop distribution are you using?
On Tue, Jan 28, 2014 at 10:04 AM, Viswanathan J
wrote:
> Hi Guys,
>
> I'm running hadoop 2.2.0 with pig-0.12.0; when I try to run
> any job I get the error below,
>
> *java.lang.NoSuchFieldError: IBM_JAVA*
>
> Is this because of Java ver
Hello,
I have set up an HDFS cluster by running a name node and a bunch of data
nodes. I ran into a problem where the files are only stored on the node
that uses the hdfs command and was told that this is because I do not have
a job tracker and task nodes set up.
However, the documentation for 2.
Thanks Daryn, I just want to confirm I can get a performance improvement if
I go with federation before I start the effort (I have to re-design my data
schema so that the data can have different namespaces).
On Tue, Jan 28, 2014 at 10:53 AM, Daryn Sharp wrote:
> Hi Anfernee,
>
> You will achieve imp
Anything on this?
I am pretty stuck here.
_Not_ possible to install and run Hadoop 2.2.0 with the instructions on
the website. I am sure this is not how it's supposed to be with SW from
the Apache Software Foundation: frustrating!
Where is the 'MapReduce tarball' in the binary download? It's me
Hi Anfernee,
You will achieve improved performance with federation only if you stripe files
across the multiple NNs. Federation basically shares DN storage with multiple
NNs with the expectation the namespace load will be distributed across the
multiple NNs. If everything writes to the exact
I am archiving a large amount of data out of my HDFS file system to a
separate shared storage solution (There is not much HDFS space left in my
cluster, and upgrading it is not an option right now).
I understand that HDFS internally manages checksums and won't succeed if
the data doesn't match the
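For verify-after-archive outside HDFS, the usual approach is to compare digests on both sides (HDFS also exposes `hadoop fs -checksum <path>`, though its CRC-based value is not directly comparable to a local digest). A runnable sketch of the compare step with standard tools:

```shell
# Stand-in for "copy out of HDFS, then verify": create a source file,
# copy it, and compare SHA-256 digests of the two sides.
src=$(mktemp); dst=$(mktemp)
echo "archived block data" > "$src"
cp "$src" "$dst"

a=$(sha256sum "$src" | awk '{print $1}')
b=$(sha256sum "$dst" | awk '{print $1}')

if [ "$a" = "$b" ]; then echo "checksum OK"; else echo "MISMATCH"; fi
```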
This is what a friend of mine that knows elastic search had to say about
this:
o Their tagcombinations are no different than say a category or similar
grouping for data
o A search can then be executed on the index using a mixture of search
functions
§ Search on index for the tags category
Hi Guys,
I'm running hadoop 2.2.0 with pig-0.12.0; when I try to run
any job I get the error below:
*java.lang.NoSuchFieldError: IBM_JAVA*
Is this because of the Java version, or a compatibility issue between hadoop and pig?
I'm using Java version *1.6.0_31*.
Please help me out.
--
R
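A `NoSuchFieldError` at runtime usually means two different versions of the same jar are on the classpath (here, plausibly an older hadoop-auth/hadoop-common bundled with Pig alongside the 2.2.0 jars; that diagnosis is a guess). A quick way to spot duplicate artifacts, sketched against a throwaway directory:

```shell
# Make a throwaway "lib" dir with a simulated version clash.
lib=$(mktemp -d)
touch "$lib/hadoop-auth-1.2.1.jar" "$lib/hadoop-auth-2.2.0.jar" "$lib/pig-0.12.0.jar"

# List artifact names that appear more than once after stripping the version.
ls "$lib" | sed -E 's/-[0-9][0-9.]*\.jar$//' | sort | uniq -d   # → hadoop-auth
```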
Hi,
Based on
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/Federation.html#Key_Benefits,
the overall performance can be improved by federation, but I'm not sure
federation addresses my use case; could someone elaborate?
My use case is: I have one single NM and several DN, a
I had tried Cassandra; that attempt was not convincing, but I did not use
distributed counters. I actually needed the tagcombination ids in the output,
not the number of matches, for the given set of tags.
Please illustrate your thought a little by taking my tagcombination table
design.
On Tue, Jan 28, 2
No-sql solution with real-time counters would work, e.g. Cassandra or
hbase. But I think elastic search or Solr would be simpler and can do the
counting on access. There are solutions that are the combination of both
these approaches.
On Tue, Jan 28, 2014 at 8:51 AM, Naresh Yadav wrote:
> pleas
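Before reaching for distributed counters, the count-per-tagcombination idea itself is just a group-and-count; with occurrences as lines of text it can be sketched with sort and uniq (the sample data is made up):

```shell
# One line per observed tagcombination occurrence.
occ=$(mktemp)
printf '%s\n' "U.S.A-Pen" "U.S.A-Pen" "U.S.A-Pencil" "India-Pen" > "$occ"

# Count matches per tagcombination; a counter store (HBase/Cassandra) or an
# Elasticsearch terms aggregation does the same grouping at scale.
sort "$occ" | uniq -c | sort -rn
```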
There is a lesson in this by the way, I just realized I pasted my
access/secret access key to the bucket in the public email. DOH, changed ;)
Ognen
On Tue, Jan 28, 2014 at 10:55 AM, Ognen Duzlevski
wrote:
> Ahh. No, I do not have a job tracker. OK - I guess I need to set one up :)
>
> Thanks!
>
Ahh. No, I do not have a job tracker. OK - I guess I need to set one up :)
Thanks!
Ognen
On Tue, Jan 28, 2014 at 10:51 AM, Bryan Beaudreault <
bbeaudrea...@hubspot.com> wrote:
> Do you have a jobtracker? Without a jobtracker and tasktrackers, distcp
> is running in LocalRunner mode. I.E. it i
Please give suggestions on this...
On Tue, Jan 28, 2014 at 3:18 PM, Naresh Yadav wrote:
> Hi all,
>
> I am new to big data technologies and design so looking for help from java
> world.
>
> I have concept of tags and tagcombinations.
> For example U.S.A and Pen are two tags AND if they come tog
Do you have a jobtracker? Without a jobtracker and tasktrackers, distcp is
running in LocalRunner mode, i.e. it is running a single-threaded process
on the local machine. The default behavior of the DFSClient is to write
data locally first, with replicas being placed off-rack then on-rack.
This
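In MRv1 terms, pointing clients at a real jobtracker (instead of the `local` default that triggers LocalRunner) is a mapred-site.xml setting; the host and port here are placeholders:

```xml
<!-- mapred-site.xml: without this, mapred.job.tracker defaults to "local"
     and jobs such as distcp run single-threaded on the submitting machine. -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```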
Hello,
I am new to Hadoop and HDFS so maybe I am not understanding things properly
but I have the following issue:
I have set up a name node and a bunch of data nodes for HDFS. Each node
contributes 1.6TB of space so the total space shown on the hdfs web front
end is about 25TB. I have set the re
Hi,
The central class is FSNamesystem.java downwards. I'd advise drilling
down from the NameNodeRpcServer sources at
https://github.com/apache/hadoop-common/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java,
for any client operat
I read that the namenode keeps all the metadata records in main memory for
fast access.
I want to study the code where the namenode is instructed to do so, i.e. to keep
metadata records in main memory. Where can I find the source code file for this
namenode memory management? I am using githu
Hi all,
I am new to big data technologies and design so looking for help from java
world.
I have concept of tags and tagcombinations.
For example U.S.A and Pen are two tags AND if they come together in some
definition then register a tagcombination(U.S.A-Pen) for that..
*tags *(U.S.A, Pen, Penci