I found the problem. I used some private variables in my class. I was thinking
that for every tuple I get, Pig would create a new object of my class. But
this is not the case, of course.
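To illustrate the pitfall described above, here is a minimal sketch in plain Java. It deliberately does not use the Pig API: `LeakyCounter`, `SafeCounter`, and `UdfReuseDemo` are hypothetical names, and the loop in `UdfReuseDemo` only imitates a framework that calls `exec()` repeatedly on one shared object, which is the behaviour the message above describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a UDF class; it does not extend any Pig type.
final class LeakyCounter {
    // Instance field: it survives between calls because the object is reused.
    private final List<String> seen = new ArrayList<>();

    int exec(List<String> bag) {
        seen.addAll(bag);   // values from earlier groups are still in the list
        return seen.size();
    }
}

// Same logic, but the working state is local to each call.
final class SafeCounter {
    int exec(List<String> bag) {
        List<String> seen = new ArrayList<>(bag);  // rebuilt on every call
        return seen.size();
    }
}

final class UdfReuseDemo {
    // Imitates a framework that creates ONE object and calls exec() per group.
    static int[] runLeaky(List<List<String>> groups) {
        LeakyCounter udf = new LeakyCounter();
        int[] out = new int[groups.size()];
        for (int i = 0; i < groups.size(); i++) {
            out[i] = udf.exec(groups.get(i));
        }
        return out;
    }

    static int[] runSafe(List<List<String>> groups) {
        SafeCounter udf = new SafeCounter();
        int[] out = new int[groups.size()];
        for (int i = 0; i < groups.size(); i++) {
            out[i] = udf.exec(groups.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(Arrays.toString(runLeaky(groups)));
        System.out.println(Arrays.toString(runSafe(groups)));
    }
}
```

With the two sample groups, the leaky version reports 2 and then 3 (the second group is polluted by the first), while the safe version reports 2 and then 1. In a real UDF the equivalent fix would be to keep per-call state in local variables, or to reset any fields at the start of each call.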
Sorry for the inconvenience
Anastasis
On 28 Feb 2014, at 2:07 AM, Anastasis Andronidis andronat_
Hello everyone,
I have a FOREACH statement, and inside it I use an ORDER BY. After the ORDER
BY, I call a UDF. For example:
logs = LOAD 'raw_data' USING org.apache.hcatalog.pig.HCatLoader();
logs_g = GROUP logs BY (date, site, profile) PARALLEL 2;
service_flavors = FOREACH logs_g {
PM, Pradeep Gollakota pradeep...@gmail.com wrote:
Where exactly are you getting duplicates? I'm not sure I understand your
question. Can you give an example please?
On Thu, Feb 27, 2014 at 11:15 AM, Anastasis Andronidis
andronat_...@hotmail.com wrote:
Hello everyone,
I have
BTW, is this somehow related [1]?
[1]:
http://mail-archives.apache.org/mod_mbox/pig-user/201102.mbox/%3c5528d537-d05c-47d9-8bc8-cc68e236a...@yahoo-inc.com%3E
On 27 Feb 2014, at 11:20 PM, Anastasis Andronidis andronat_...@hotmail.com
wrote:
Yes, of course, my output is like
? My uneducated guess is that there's a bug in
your UDF. To confirm, do you get the correct result if you replace your UDF
with an out-of-the-box one, e.g. COUNT?
On Thu, Feb 27, 2014 at 2:21 PM, Anastasis Andronidis
andronat_...@hotmail.com wrote:
BTW, is this some how related[1
I also just found out that the bag from the nested ORDER BY is an
org.apache.pig.data.InternalCachedBag and not an
org.apache.pig.data.SortedDataBag. Should it be like that?
On 28 Feb 2014, at 1:51 AM, Anastasis Andronidis andronat_...@hotmail.com
wrote:
Hi again,
I added this in my UDF
side.
So, you have to find the log in the job tracker or node manager (if Hadoop
2.x) on the server side.
2014-02-16 7:02 GMT+09:00 Anastasis Andronidis andronat_...@hotmail.com:
Hello, I am using Pig 0.11.0 with cdh4.5.0. I have a custom UDF and I am
trying to log stuff from inside. My
Hello, I am using Pig 0.11.0 with cdh4.5.0. I have a custom UDF and I am trying
to log stuff from inside it. My problem is that I get no logs.
I tried with:
1) this.log
2) this.warn
3) this.pigLogger
4) I created a LogFactory from apache.commons
None of these worked. What am I doing wrong?
Cheers
Hello again,
any comments on the subject?
Cheers,
Anastasis
On 4 Feb 2014, at 5:36 PM, Anastasis Andronidis andronat_...@hotmail.com
wrote:
Hello,
I am using Apache Pig version 0.11.0-cdh4.5.0 and I want to know if
there is a way to overwrite a partition in a table with HCat. Up
Hello,
I am using Apache Pig version 0.11.0-cdh4.5.0 and I want to know if there
is a way to overwrite a partition in a table with HCat. Up until now I have been
getting errors that the partition already exists. Is there an option to overwrite
it through a Pig script?
Kindly,
Anastasis
Hello again,
any comments on this?
Thanks,
Anastasis
On 27 Sep 2013, at 5:36 PM, Anastasis Andronidis andronat_...@hotmail.com
wrote:
Hello,
I am working on a very small project for my university and I have a small
cluster with 2 worker nodes and 1 master node. I'm using Pig to do
Hello,
I am working on a very small project for my university and I have a small
cluster with 2 worker nodes and 1 master node. I'm using Pig to do some
calculations and I have a question regarding small files.
I have a UDF that is reading a small input (around 200k) and correlates the
data