On Sat, Aug 4, 2012 at 11:43 PM, Nitin Kesarwani wrote:
Mohit,
You can use this patch to suit your need:
https://issues.apache.org/jira/browse/PIG-2579
New fields in an Avro schema descriptor file need to have a non-null default
value. Hence, using the new schema file, you should be able to read older
data as well. Try it out; it is very straightforward.
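A minimal sketch of what such a schema change might look like (the record and field names here are hypothetical, not taken from the patch): when you add a field, give it a non-null default so a reader using the new schema can still decode records written with the old one.

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "source", "type": "string", "default": "unknown"}
  ]
}
```

Here "source" is the newly added field; its default "unknown" is what the reader substitutes when decoding older records that lack the field.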
All,
Given our recent discussion (http://s.apache.org/hv), the new
u...@hadoop.apache.org mailing list has been created and all existing users in
(common,hdfs,mapreduce)-user@ have been migrated over.
I'm in the process of changing the website to reflect this (HADOOP-8652).
Henceforth, please use the new list.
Thanks, again, Liyin.
On Sat, Aug 4, 2012 at 6:59 AM, 梁李印 wrote:
> The optimization you mentioned is reduce-task locality-aware. Unfortunately,
> the current scheduler doesn't consider the reduce task's data locality. So a
> reduce task can be scheduled to any node with free slots.
> The following jira is discussing this problem:
> https://issues.apache.org/jira/browse/MA
Nothing has confused me as much in Hadoop as FileSystem.close().
Any decent Java programmer who sees that an object implements Closeable
writes code like this:

final FileSystem fs = FileSystem.get(conf);
try {
  // do something with fs
} finally {
  fs.close();
}

So I started out using Hadoop
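For context on why that idiom backfires: FileSystem.get(conf) hands back a shared instance from an internal cache (keyed roughly by URI scheme/authority and user), so close() through one reference closes it for every other holder of the same instance. A stdlib-only sketch of that hazard, with a hypothetical Handle class standing in for FileSystem:

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

// Toy model of a FileSystem-style cache: get() hands out one shared
// instance per key, so close() through any reference closes it for all.
class CachedHandleDemo {
    static class Handle implements Closeable {
        boolean open = true;
        @Override public void close() { open = false; }
    }

    static final Map<String, Handle> CACHE = new HashMap<>();

    static Handle get(String key) {
        return CACHE.computeIfAbsent(key, k -> new Handle());
    }

    public static void main(String[] args) {
        Handle a = get("hdfs://cluster");
        Handle b = get("hdfs://cluster"); // same cached object as 'a'
        a.close();                        // ...so this closes 'b' too
        System.out.println(a == b);       // true
        System.out.println(b.open);      // false
    }
}
```

A common way around this is FileSystem.newInstance(conf), which bypasses the cache and returns a private instance that is safe to close independently; another is to never call close() yourself and let Hadoop's own shutdown hook clean up.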
Hi,
I'm currently trying to fix Maven dependencies for Crunch and ran into
trouble with the POM for hadoop-core 1.0.3. It looks like the Maven
dependencies are different from the actual dependencies at runtime.
As a result, bugs caused by dependency conflicts won't show up until
runtime, making
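One way such conflicts are usually handled on the consuming side (the artifact named in the exclusion below is illustrative, not a confirmed culprit from this thread): declare hadoop-core, explicitly exclude the clashing transitive dependency, then pin the version you actually want elsewhere in the POM.

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.0.3</version>
  <exclusions>
    <!-- hypothetical conflicting transitive dependency -->
    <exclusion>
      <groupId>commons-httpclient</groupId>
      <artifactId>commons-httpclient</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running mvn dependency:tree on the consuming project shows where the competing versions are pulled in.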
Given the size of the data, there can be several approaches here:
1. Moving the boxes
Not possible, as I suppose the data is needed for 24x7 analytics.
2. Mirroring the data
This is a good solution. However, if you have data being written/removed
continuously (as part of a live system), there