On 01.10.2012 22:36, Björn-Elmar Macek wrote:
The script I now want to execute looks like this:
x = load 'tag_count_ts_pro_userpair' as
(group:tuple(),cnt:int,times:bag{t:tuple(c:chararray)});
y = foreach x generate *, moins.daysFromStart('2011-06-01 00:00:00',
times);
store y into 'test_daysFromStart';
Hi,
I am kind of unsure where to post this problem, but I think it is more
related to Hadoop than to Pig.
By successfully executing a Pig script, I created a new file in my HDFS.
Sadly though, I cannot use it for further processing except for
dumping and viewing the data: every
)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
On Mon, 1 Oct 2012 10:12:22 -0700, Robert Molina
rmol...@hortonworks.com wrote:
Hi Bjorn,
Can you post the exception you are getting during the map phase?
On Mon, Oct 1, 2012 at 9:11 AM, Björn-Elmar Macek wrote:
Hi,
I am kind of unsure where
correctly on HDFS. Can you provide the Pig script you
are trying to run? Also, for the original script that ran and
generated the file, can you verify whether that job had any failed tasks?
On Mon, Oct 1, 2012 at 10:31 AM, Björn-Elmar Macek wrote:
Hi Robert,
the exception I see in the output
Hi,
I had this problem once too. Did you properly override the reduce method with
the @Override annotation?
Does your reduce method use OutputCollector or Context for gathering outputs?
If you are using the current version, it has to be Context.
The thing is: if you do NOT override the standard
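The pitfall behind this advice can be shown without Hadoop at all: a method whose parameter types do not match the parent's is a new overload, not an override, so the parent's default keeps running silently. @Override turns that silent bug into a compile error. A minimal plain-Java sketch (class and method names are made up for illustration, not Hadoop's API):

```java
public class OverrideDemo {
    static class Base {
        // stands in for a framework's default (identity) reduce
        String reduce(Object key) { return "default"; }
    }

    static class Child extends Base {
        // Wrong parameter type: this is an OVERLOAD of reduce(Object), not an
        // override. Putting @Override here would make the compiler reject it.
        String reduce(String key) { return "custom"; }
    }

    public static void main(String[] args) {
        Base b = new Child();
        // Dynamic dispatch resolves reduce(Object) -> Base's default runs,
        // even though Child "looks like" it implemented reduce.
        System.out.println(b.reduce((Object) "k")); // prints "default"
    }
}
```

The same effect happens when an old-API `reduce(..., OutputCollector, Reporter)` signature is kept while the framework calls the new-API `reduce(..., Context)` method: the mismatch compiles, but the custom code is never invoked.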
(Text.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
On 05.09.2012 13:56, Björn-Elmar Macek wrote:
Hello again,
I just wanted to keep you updated, in case
people use a rule of thumb of x4 to get the
approx. memory requirement.
Just some ideas, not really a solution, but maybe they help you further.
On Wed, Sep 5, 2012 at 2:02 PM, Björn-Elmar Macek
ma...@cs.uni-kassel.de wrote:
Excuse me: my last code section included some old code. Here it is
again
if it is possible to say anything about whether the program
is still doing useful stuff.
On Wed, Sep 5, 2012 at 2:48 PM, Björn-Elmar Macek
ma...@cs.uni-kassel.de wrote:
Hi Vasco,
thank you for your help!
I can try to add the limit again (I currently have it turned off for all
Java processes spawned
Hi Dexter,
I think what you want is a clustering of points based on the Euclidean
distance, or density-based clustering
(http://en.wikipedia.org/wiki/Cluster_analysis). I bet there are some
implemented quite well in Mahout already: AFAIK this is the data-mining
framework based on Hadoop.
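For reference, the Euclidean distance underlying such clustering, independent of Mahout, is just the straight-line distance between points. A tiny plain-Java sketch (class name made up):

```java
public class EuclideanDemo {
    // straight-line (Euclidean) distance between two points of equal dimension
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        // classic 3-4-5 triangle: distance from (0,0) to (3,4) is 5
        System.out.println(dist(new double[]{0, 0}, new double[]{3, 4}));
    }
}
```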
)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
On 29.08.12 15:53, Björn-Elmar Macek wrote:
Hi there,
I am currently running a job where I self-join a 63-gigabyte CSV
file
Hi all,
well, since I now have all servers continuously running for my job, I still
encounter problems:
although all services seem to be up and no errors are produced, I seem to
be stuck in the map phase at a certain percentage. I am not yet sure
that just letting the cluster run may solve
OK, to give you the solution to the namespace errors on the
datanodes, the startup problem, and the communication problem between
datanodes/tasktrackers and namenode/jobtracker, I did the following:
As you can read on several sites, there are two strategies for fixing
datanode namespaces. Since I like
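The two strategies usually cited for the "Incompatible namespaceIDs" error are: (1) wipe the datanode's data directory so it re-registers with the namenode (losing its local block replicas), or (2) edit the namespaceID line in the datanode's current/VERSION file to match the namenode's. A sketch of strategy (2) operating on a mock VERSION file (paths, IDs, and the class name are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class NamespaceFix {
    // rewrite the namespaceID=... line in a datanode VERSION file
    static void alignNamespaceId(Path version, String namenodeId) throws IOException {
        List<String> fixed = Files.readAllLines(version).stream()
            .map(l -> l.startsWith("namespaceID=") ? "namespaceID=" + namenodeId : l)
            .collect(Collectors.toList());
        Files.write(version, fixed);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dn-current"); // stand-in for dfs.data.dir/current
        Path version = dir.resolve("VERSION");
        Files.write(version, List.of("storageID=DS-1", "namespaceID=123456")); // stale ID
        alignNamespaceId(version, "789012"); // ID read from the namenode's own VERSION file
        System.out.println(Files.readAllLines(version));
    }
}
```

In practice one would stop the datanode, edit the file (e.g. with sed), and restart it; the sketch only illustrates which line has to change.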
Hi James,
thank you for your reply!
I tried to, but I can only see my own processes, since I am not a root user. :(
I already sent out a request to the cluster admins to sort this out for me.
Regards,
Björn
On 14.08.2012 08:51, James Brown wrote:
Hi Bjorn,
For the two items below, it is
Hi,
I am currently trying to run my Hadoop program on a cluster. Sadly,
though, my datanodes and tasktrackers seem to have difficulties with
their communication, as their logs say:
* Some datanodes and tasktrackers seem to have port problems of some kind,
as can be seen in the logs below. I
Hi again,
this is a direct response to my previous posting with the title "Logs
cannot be created", where logs could not be created (Spill failed). I
got the hint that I should check privileges, but that was not the
problem, because I own the folders that were used for this.
I finally found
Hi,
I have lately been running into problems since I started running Hadoop
on a cluster.
The setup is the following:
1 computer is NameNode and JobTracker
1 computer is SecondaryNameNode
2 computers are TaskTracker and DataNode
I ran into problems with running the WordCount example:
On 23.05.2012 10:47, Björn-Elmar Macek wrote:
OK, I have looked at the logs some more and googled every tiny bit of
them, hoping to find an answer out there.
I fear that the following line nails my problem to a large extent:
12/05/22 01:30:21 INFO mapred.ReduceTask:
attempt_local_0001_r_00_0
* The Partitioner always returns proper values
Please, I would really need a hint as to where I have to look.
On 22.05.2012 16:57, Björn-Elmar Macek wrote:
Hi Jayaseelan,
thanks for the bump! ;)
I have continued working on the problem, but with no further success.
I emptied the log directory and started
Hi there,
I am currently trying to get rid of bugs in my Hadoop program by
debugging it. Everything went fine until some point yesterday. I don't know
what exactly happened, but my program does not stop at breakpoints
within the Reducer, nor within the RawComparator for the values
Hello all,
I am currently working with a set of data which is chronologically
ordered (every data element has a timestamp and they are monotonically
increasing). Please correct me if I am mistaken, but the data should
arrive chronologically ordered at the mapper, right? But is the order
in
I read on Hadoop never discussed
these issues.
BTW: HADOOP_HOME is defined, although the log says otherwise.
I hope you can assist me.
Best regards,
Björn-Elmar Macek
On Apr 27, 2012, at 12:01 PM, Björn-Elmar Macek wrote:
Hello,
I have recently installed Hadoop on my machine and a second one in order to test
the setup and develop little programs locally before deploying them to the
cluster. I stumbled over several difficulties, which I could fix with some
would suggest you
use the default configs:
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/
- Alex
--
Alexander Lorenz
http://mapredit.blogspot.com
On Apr 27, 2012, at 12:39 PM, Björn-Elmar Macek wrote:
Hi Alex,
as I have written, I already