Hi Meghana,
Are you sure that you're using Apache Pig version 0.10.0-cdh4.1.0? Because
a change was made to PigStorageSchema in Pig 0.10 (
https://issues.apache.org/jira/browse/PIG-2143), it is not possible to get
your call stack:
at
org.apache.pig.piggybank.storage.PigStorageSchema.s
Sounds like however you wrote the data, it has some sort of a binary
delimiter. Figure out what that delimiter is, and tell PigStorage to
use it. For example:
my_data = load 'path/to/data' using PigStorage('\\u001');
D
On Thu, Oct 11, 2012 at 10:23 AM, yogesh dhari wrote:
>
> Hi All ,
>
> How t
The default partitioning algorithm is basically this:
reducer_id = key.hashCode() % num_reducers
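As a rough illustration of that formula, here is a small Python sketch. It assumes string keys hashed the way Java's String.hashCode() does (for characters in the Basic Multilingual Plane) and masks off the sign bit before the modulus so the result is never negative. The helper names java_string_hash and reducer_for are made up for this example, and the real key types in a Pig or Hadoop job may compute their hash codes differently.

```python
from collections import Counter


def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode() as a signed 32-bit integer (BMP characters only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep the running value within 32 bits
    # Reinterpret the unsigned 32-bit value as a signed int, as Java would
    return h - 0x100000000 if h >= 0x80000000 else h


def reducer_for(key: str, num_reducers: int) -> int:
    """Pick a reducer: clear the sign bit, then take the modulus."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


if __name__ == "__main__":
    # Hypothetical join keys: many distinct values should spread across reducers
    keys = [f"user_{i}" for i in range(1000)]
    print(Counter(reducer_for(k, 4) for k in keys))
    # A single repeated key always lands on one reducer
    print({reducer_for("same_key", 4) for _ in range(10)})
```

Equal keys always map to the same reducer, so a join key with very few distinct values (or one dominant value) will concentrate its rows on a few reducers no matter how many reducers are configured.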
If you are joining on values that all map to the same reducer_id using
this function, they will go to the same reducer. But if you have a
reasonable hash code distribution and a decent volume of uniqu
Martin,
Do you have the complete stack trace?
Generally, for Hive interop I recommend HCatalog; AllLoader is neat
but it's a 3rd party contrib and we don't really know it too well. I
can check out the error dump and see if there's anything obvious
though.
D
On Fri, Oct 12, 2012 at 8:48 AM, Martin
Thanks Arun you are right.
Sent from my iPhone
On Oct 12, 2012, at 11:26 AM, Arun Ahuja wrote:
> From my interpretation, Hive coalesce returns the first non-null value.
>
> So it seems you are just doing a null check on x and returning y if it is
> null and z otherwise?
>
> In Pig you could do s
I am trying to load some text files in Hive partitions on S3 using the
AllLoader function, with no success. I get an error which indicates that
AllLoader is expecting the files to be on HDFS:
a = LOAD 's3n://x/y/zzz' using
org.apache.pig.piggybank.storage.AllLoader();
grunt> 2012-10-12 14:5
Instead of
count = foreach perCust generate group, COUNT(filtered_times.movie);
use
count = foreach perCust generate FLATTEN(group), COUNT(filtered_times.movie);
FLATTEN is a special operator that replaces a tuple with the elements
inside the tuple.
On Thu, Oct 11, 2012 at 4:36 PM, jamal sasha
From my interpretation, Hive coalesce returns the first non-null value.
So it seems you are just doing a null check on x and returning y if it is
null and z otherwise?
In Pig you could do something like (x is null ? y : z). This is a
standard ternary if/else. Don't see if the 0.00 actually plays
Hi all,
How do I write a Hive coalesce statement in Pig?
Example:
Case when
Coalesce(x,0.00)=0.00 then y else z
How do I write this in Pig?
Regards
Abhi