Cache file conflict

2013-08-29 Thread Public Network Services
Hi... After updating the source JARs of an application that launches a second job while running a MR job, the following error keeps occurring: org.apache.hadoop.mapred.InvalidJobConfException: cache file (mapreduce.job.cache.files) scheme: hdfs, host: server, port: 9000, file:

Converting a Path to a full URI String and preserving special characters

2013-08-08 Thread Public Network Services
Is there a reliable way of converting an HDFS Path object into a String? Invoking path.toUri().toString() does not work with special characters (e.g., if there are spaces in the original path string). So, for instance, in the following example String address = ...; // Path string without the
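One stdlib-only way to sidestep the problem described above is to build the URI with `java.net.URI`'s multi-argument constructor, which percent-encodes illegal characters such as spaces (parsing a raw string with spaces fails instead). A minimal sketch, assuming an HDFS-style `hdfs://host/path` layout; the class and method names here are illustrative, not from the thread:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathUriDemo {
    // Builds a percent-encoded URI string for an HDFS-style path.
    // The multi-argument URI constructor quotes illegal characters
    // (such as spaces), unlike parsing an already-assembled string.
    static String toEncodedUriString(String host, String path) throws URISyntaxException {
        return new URI("hdfs", host, path, null).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        String encoded = toEncodedUriString("server", "/data/my file.txt");
        System.out.println(encoded); // hdfs://server/data/my%20file.txt
        // getPath() decodes the percent-encoding back to the original form
        System.out.println(new URI(encoded).getPath()); // /data/my file.txt
    }
}
```

Whether this round-trips through Hadoop's `Path` class depends on the Hadoop version, which is what the thread is probing.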

Large-scale collection of logs from multiple Hadoop nodes

2013-08-05 Thread Public Network Services
Hi... I am facing a large-scale usage scenario of log collection from a Hadoop cluster and examining ways as to how it should be implemented. More specifically, imagine a cluster that has hundreds of nodes, each of which constantly produces Syslog events that need to be gathered and analyzed at
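On the analysis side of a pipeline like the one described, the first step for each collected event is usually decoding the RFC 3164 `<PRI>` prefix, where priority = facility * 8 + severity. A minimal stdlib sketch of that decode step (the class name and sample message are illustrative):

```java
public class SyslogPri {
    // Decodes the RFC 3164 <PRI> prefix of a syslog message:
    // priority = facility * 8 + severity.
    // Returns {facility, severity}, or null if the prefix is malformed.
    static int[] decode(String msg) {
        if (!msg.startsWith("<")) return null;
        int end = msg.indexOf('>');
        if (end < 2 || end > 4) return null; // PRI is 1-3 digits
        int pri;
        try {
            pri = Integer.parseInt(msg.substring(1, end));
        } catch (NumberFormatException e) {
            return null;
        }
        return new int[] { pri / 8, pri % 8 };
    }

    public static void main(String[] args) {
        int[] fs = decode("<34>Oct 11 22:14:15 host app: an event");
        System.out.println("facility=" + fs[0] + " severity=" + fs[1]); // facility=4 severity=2
    }
}
```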

Passing values from InputFormat via the Configuration object

2013-05-17 Thread Public Network Services
Hi... I need to communicate some proprietary number (long) values from the getSplits() method of a custom InputFormat class to the Hadoop driver class (used to launch the job), but the JobContext object passed to the getSplits() method has no access to a Counters object. From the source code, it
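Since Hadoop's `Configuration` stores only strings, one common workaround for the situation above is to encode the long values into a single delimited property inside `getSplits()` and decode it on the other side. A minimal sketch of just the encode/decode step, using `java.util.Properties` to stand in for `Configuration` (the property key is hypothetical); whether a configuration mutated in `getSplits()` is visible back in the driver JVM depends on where the splits are computed, which is exactly what the thread is working out:

```java
import java.util.Properties;

public class SplitValuesDemo {
    // getSplits() side: encode the long values into one delimited property.
    // Properties stands in for Hadoop's Configuration, which also stores
    // only strings. The key name is a hypothetical example.
    static void publish(Properties conf, long[] values) {
        StringBuilder sb = new StringBuilder();
        for (long v : values) {
            if (sb.length() > 0) sb.append(',');
            sb.append(v);
        }
        conf.setProperty("custom.inputformat.split.values", sb.toString());
    }

    // Driver side: decode the property back into longs.
    static long[] read(Properties conf) {
        String[] parts = conf.getProperty("custom.inputformat.split.values", "").split(",");
        long[] out = new long[parts.length];
        for (int i = 0; i < parts.length; i++) out[i] = Long.parseLong(parts[i]);
        return out;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        publish(conf, new long[] { 42L, 7L });
        long[] back = read(conf);
        System.out.println(back[0] + " " + back[1]); // 42 7
    }
}
```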

Re: Passing values from InputFormat via the Configuration object

2013-05-17 Thread Public Network Services
counters = work.getCounters(); } Would that be correct? On Fri, May 17, 2013 at 5:33 PM, Public Network Services publicnetworkservi...@gmail.com wrote: Hi... I need to communicate some proprietary number (long) values from the getSplits() method of a custom InputFormat class to the Hadoop driver

Re: BlockMissingException

2013-05-15 Thread Public Network Services
Public Network Services publicnetworkservi...@gmail.com wrote: Hi... I am getting a BlockMissingException in a fairly simple application with a few mappers and reducers (see end of message). Looking around on the web has not helped much, including JIRA issues HDFS-767 and HDFS-1907

BlockMissingException

2013-05-14 Thread Public Network Services
Hi... I am getting a BlockMissingException in a fairly simple application with a few mappers and reducers (see end of message). Looking around on the web has not helped much, including JIRA issues HDFS-767 and HDFS-1907. The configuration variable -

Re: Getting custom input splits from files that are not byte-aligned or line-aligned

2013-02-23 Thread Public Network Services
you'll have to implement your own custom FileInputFormat, using this lib you mentioned to properly read your file records and split them through map tasks. Regards, Wellington. On 23/02/2013 14:14, Public Network Services publicnetworkservi...@gmail.com wrote: Hi... I use an application
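A custom FileInputFormat for records that are not byte- or line-aligned typically either reports the file as unsplittable or has its record reader seek to the split offset and scan forward to the next record boundary, so each map task starts on a whole record. A minimal stdlib sketch of that scan, assuming a hypothetical single-byte record delimiter (0x1E); a real format would use whatever boundary the parsing library exposes:

```java
public class BoundaryScan {
    // Hypothetical single-byte record delimiter.
    static final byte MARKER = 0x1E;

    // Scans forward from 'start' to the first byte after the next record
    // marker, so a reader beginning mid-record can skip to a whole one.
    // Returns data.length if no marker follows.
    static int nextRecordStart(byte[] data, int start) {
        for (int i = start; i < data.length; i++) {
            if (data[i] == MARKER) return i + 1;
        }
        return data.length;
    }

    public static void main(String[] args) {
        byte[] data = { 'a', 'b', MARKER, 'c', 'd', MARKER, 'e' };
        // A split beginning at offset 1 lands mid-record; skip to offset 3.
        System.out.println(nextRecordStart(data, 1)); // 3
    }
}
```

In an actual reader, the same scan runs over a seekable HDFS stream rather than a byte array, and the reader stops once it passes the split's end offset.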

Re: MapReduce processing with extra (possibly non-serializable) configuration

2013-02-21 Thread Public Network Services
It's a facility provided by the MR framework to cache files (text, archives, JARs, etc.) needed by applications. [0] http://hadoop.apache.org/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html On Fri, Feb 22, 2013 at 5:10 AM, Public Network Services publicnetworkservi

Re: MapReduce processing with extra (possibly non-serializable) configuration

2013-02-21 Thread Public Network Services
Hazelcast, etc., where this may be taken care of for you automatically in some way? :) On Fri, Feb 22, 2013 at 2:40 AM, Public Network Services publicnetworkservi...@gmail.com wrote: Hi... I am trying to put an existing file processing application into Hadoop and need to find the best way