Ed,
Actually Oozie is quite different from Cascading.
* Cascading allows you to write 'queries' using a Java API and they get
translated into MR jobs.
* Oozie allows you to compose sequences of MR/Pig/Hive/Java/SSH jobs in a DAG
(workflow jobs) and has timer+data dependency triggers (coordinator
Dear all,
I am using Hadoop-0.20.2 and Hadoopdb Hive on a 5 node cluster.
I am connecting to Hive through Eclipse, but I get the error below:
Hive history
file=/tmp/hadoop/hive_job_log_hadoop_201012141618_1092196256.txt
10/12/14 16:18:37 INFO exec.HiveHistory: Hive history
Dear All,
I have a folder A and a symbolic link A' that points to A, but
when I add A' as one of the InputFormat input folders, I get this error:
Exception in thread main
org.apache.hadoop.hdfs.protocol.UnresolvedPathException:
hdfs://localhost:9000/user/songliu/W
at
Hmmm, I'll take that under advisement. So, even if I manually avoided redoing
earlier work (by keeping a log of which input key/values have been processed
and short-circuiting the map() if a key/value has already been processed),
you're saying those previously completed key/values would not be
On Dec 13, 2010, at 3:14 PM, Seth Lepzelter wrote:
Alright, a little further investigation along that line (thanks for the
hint, can't believe I didn't think of that), shows that there's actually a
carriage return character (%0D, aka \r) at the end of the filename.
This falls into
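
A minimal sketch of how a stray carriage return like the one described above could be detected and stripped. It uses only the Java standard library; the class name `FileNameCheck` and the file name `part-00000` are hypothetical and do not come from the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FileNameCheck {
    /** Decodes a percent-encoded name, so "%0D" becomes the '\r' character. */
    public static String decode(String encoded) {
        try {
            // Note: URLDecoder also turns '+' into a space.
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** True if the name ends with a carriage return. */
    public static boolean hasTrailingCarriageReturn(String name) {
        return name.endsWith("\r");
    }

    /** Returns the name without a single trailing carriage return, if present. */
    public static String stripTrailingCarriageReturn(String name) {
        return hasTrailingCarriageReturn(name)
                ? name.substring(0, name.length() - 1)
                : name;
    }

    public static void main(String[] args) {
        String shown = "part-00000%0D"; // hypothetical encoded file name
        String actual = decode(shown);
        System.out.println(hasTrailingCarriageReturn(actual));   // true
        System.out.println(stripTrailingCarriageReturn(actual)); // part-00000
    }
}
```

Such names usually come from a file list with Windows line endings, so stripping the `\r` before the name is used avoids creating the odd file in the first place.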
On Dec 13, 2010, at 17:58 , li ping wrote:
I think the *org.apache.hadoop.mapred.SkipBadRecords* is what you are looking
for.
Yes, I considered that at one point. I don't like how it insists on
iteratively retrying the records. I wish it would simply skip the failed
records and move on, just
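
The "skip the failed record and move on" behaviour wished for above can be roughly sketched in plain Java. This is not SkipBadRecords and not Hadoop code: `SkipFailedRecords`, `processRecord`, and the sample input are hypothetical, and the approach only helps when a bad record fails by throwing an exception, not when it crashes the JVM or native code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipFailedRecords {
    /** Outputs of the good records plus a count of the skipped ones. */
    public static final class Result {
        public final List<Integer> outputs = new ArrayList<>();
        public int skipped = 0;
    }

    /** Hypothetical per-record work: parses an integer, throws on bad input. */
    public static int processRecord(String record) {
        return Integer.parseInt(record.trim());
    }

    /** Processes every record once; a failing record is counted and skipped. */
    public static Result processAll(List<String> records) {
        Result result = new Result();
        for (String record : records) {
            try {
                result.outputs.add(processRecord(record));
            } catch (RuntimeException e) {
                result.skipped++; // no retry, no search for the bad record
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = processAll(Arrays.asList("1", "oops", "3"));
        System.out.println(r.outputs + " skipped=" + r.skipped); // [1, 3] skipped=1
    }
}
```

In a real map() the same try/catch would wrap the per-record work, with the skip count reported through a job counter.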
On Dec 14, 2010, at 09:30 , Harsh J wrote:
Hi,
On Tue, Dec 14, 2010 at 10:43 PM, Keith Wiley kwi...@keithwiley.com wrote:
I wish there were a less burdensome version of skipbadrecords. I don't want
it to perform a binary search trying to find the bad record while
reprocessing data over
I see it this way.
You can glue a bunch of discrete command line apps together that may or may not
have dependencies between one another in a new syntax, which is darn nice if
you already have a bunch of discrete ready to run command line apps sitting
around that need to be strung together,
Hi Ted,
Thanks for your reply.
Shen
On Tue, Dec 14, 2010 at 1:37 PM, Ted Yu yuzhih...@gmail.com wrote:
Check out the code on github
You can find
contrib/highavailability/src/java/org/apache/hadoop/hdfs/AvatarZooKeeperClient.java
On Sun, Dec 12, 2010 at 11:54 PM, ChingShen
When I load a file from HDFS into Hive, I notice that the original file
has been removed. Is there any way to prevent this? If not, how can I go
back and dump it as a file again? Thanks
Hi Mark,
You can use 'External table' in Hive.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL
A Hive external table does not move or delete files.
- Youngwoo
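
As an illustration of the external-table suggestion, here is a small Java sketch that builds a `CREATE EXTERNAL TABLE` statement, which could then be run in the Hive CLI or sent through a Hive client. The table name, columns, tab delimiter, and HDFS directory are all hypothetical; the helper does no quoting or validation and is meant only to show the shape of the DDL.

```java
public class ExternalTableDdl {
    /**
     * Builds a HiveQL statement for an external table over an existing
     * HDFS directory. Hive reads the files in place instead of moving them.
     */
    public static String createExternalTable(String table, String columns, String hdfsDir) {
        return "CREATE EXTERNAL TABLE " + table + " (" + columns + ")\n"
             + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'\n" // assumes tab-separated data
             + "LOCATION '" + hdfsDir + "'";
    }

    public static void main(String[] args) {
        // Hypothetical table and path, for illustration only.
        System.out.println(createExternalTable(
                "page_views",
                "user_id STRING, url STRING",
                "/user/mark/page_views"));
    }
}
```

Note that `LOCATION` names a directory rather than a single file, so the data file would sit inside that directory.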
2010/12/15 Mark static.void@gmail.com
When I load a file from HDFS into