Hi, all
Please visit my post about the problem:
http://stackoverflow.com/questions/25285784/union-null-with-enum-in-pig.
Thanks
Martin Repka
/AvroTupleWrapper.java#L132
Thanks,
Cheolsoo
On Tue, Mar 25, 2014 at 9:17 AM, Liliang Li lll.trip...@gmail.com wrote:
Hi:
I have a record of union type of
union {TypeA, TypeB, TypeC, TypeD, TypeE} mydata;
I have the serialized data in avro format, however when I am trying to use
piggybank.jar's
Hi Cheolsoo,
Thanks for your reply! (Liang and I work together.) The restriction to
simple union types is still there in the latest code; see lines 83-95,
here:
https://github.com/apache/pig/blob/trunk/src/org/apache/pig/impl/util/avro/AvroStorageSchemaConversionUtilities.java
I know
Hello Keren,
There is nothing wrong with this. A dataset in Hadoop is usually a directory
rather than a single file. Pig is doing what it is supposed to do: it performs a
union of both files. You would have seen the contents of both files
together when running dump C.
Since this is a map only job
You could try something like this :
A = load '/1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '/1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = union A, B;
D = group C by 1;
E = foreach D generate flatten(C);
store E into '/dir';
Warm
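A rough way to picture why the extra group step in the script above changes the output layout: in a map-only job each map task writes its own part file, while grouping on a constant key routes every record to a single reducer. The sketch below is a toy Python model of that idea, not Pig's actual execution code; the function names, the part-file naming, the sample rows, and the hash partitioning are all assumptions made for illustration.

```python
from collections import defaultdict


def map_only_union(inputs):
    """Toy model: each input is handled by its own map task,
    and each map task writes its own part file."""
    return {f"part-m-{i:05d}": list(rows) for i, rows in enumerate(inputs)}


def union_then_group_by_constant(inputs, num_reducers=1):
    """Toy model: grouping on a constant key sends every record to the
    same reducer, so only that reducer's part file receives data."""
    parts = defaultdict(list)
    for rows in inputs:
        for row in rows:
            key = 1  # constant group key, as in `group C by 1`
            reducer = hash(key) % num_reducers  # assumed hash partitioning
            parts[f"part-r-{reducer:05d}"].append(row)
    return dict(parts)


# Hypothetical contents standing in for /1.txt and /1_ext.txt.
a = [(1, "a", "b"), (2, "c", "d")]
b = [(3, "e", "f")]

print(map_only_union([a, b]))                # two part files, one per input
print(union_then_group_by_constant([a, b]))  # one part file holding all rows
```

In this model the number of non-empty output files after the group step does not depend on how many inputs were unioned, which is the effect the script relies on.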
hello,
I am doing this
DEFINE AVRO_LOAD org.apache.pig.piggybank.storage.avro.AvroStorage();
A = load '/user/abhi/a.txt' using AVRO_LOAD;
B = load '/user/abhi/b.txt' using AVRO_LOAD;
C = UNION A , B;
here script is failing with the following error
*ERROR org.apache.pig.tools.grunt.Grunt
The following gist illustrates my question:
https://gist.github.com/jcoveney/5320422
It seems pretty surprising to me that all of these cases return 1.0, at
least in Python (I will now do this in Java, it's just more verbose). Is
this an issue with Python? Is this an issue, period? Is this
woops, wrong listserv :)
2013/4/5 Jonathan Coveney jcove...@gmail.com
The following gist illustrates my question:
https://gist.github.com/jcoveney/5320422
It seems pretty surprising to me that all of these cases return 1.0,
at least in Python (I will now do this in Java, it's just
This sounds like a bug.
What do you have after the union?
Can you try to reproduce this with a script/data that you can share?
If you can open a jira with details, that would be even better.
Thanks,
Thejas
On 9/4/12 8:52 AM, Xavier Stevens wrote:
I'm trying to do a UNION on two datasets
Hey Thejas,
After the union I just try to store using Elephant Bird's sequence file
storage with a BytesWritable key and Text value.
I'll open up a JIRA ticket with the details.
Cheers,
-Xavier
On 9/11/12 9:37 AM, Thejas Nair wrote:
This sounds like a bug.
What do you have after
I'm trying to do a UNION on two datasets with identical schemas
(k:bytearray, v:chararray). When using the UNION operator like so:
combined_data = UNION dataset1, dataset2;
I get the following error:
java.lang.RuntimeException: Unexpected data type java.util.ArrayList found in
stream. Note
exception when attempting to union two
relations.
Schemas
a: {timestamp: chararray,date_time: chararray,conversion_type:
chararray,channel_type: chararray,campaign_id: int,adgroup_id:
int,order_id: chararray,order_sales: double,delta: long,uuid:
chararray,ctid: long,advertiser_id: int
I am facing an issue similar to the one described in
https://issues.apache.org/jira/browse/PIG-2493
I am running the latest trunk, where the fix for PIG-2493 was committed, yet I
am still getting the following exception when attempting to union two
relations.
Schemas
a: {timestamp: chararray
Hi there,
We hit a possible issue with Pig (version 0.9.1) and HBaseStorage where we try
to LOAD multiple sets of data and UNION them. Here's a simple example that
shows the problem:
HBase Data (use hbase shell to create table and add rows):
create 'test', {NAME => 'data', VERSIONS => 1}
put
of data and UNION them. Here's a simple example
that shows the problem:
HBase Data (use hbase shell to create table and add rows):
create 'test', {NAME => 'data', VERSIONS => 1}
put 'test', '1', 'data:value', '1'
put 'test', '2', 'data:value', '2'
put 'test', '3', 'data:value', '3
Ferreira eafon...@yahoo.com
Sent: Tuesday, September 6, 2011 12:56 PM
Subject: Re: Union of multiple loads using HBaseStorage not working as expected.
Hi Eduardo, there is no 0.9.1.. do you mean you built it from the 0.9 branch?
Could you try trunk?
On Tue, Sep 6, 2011 at 9:50 AM, Eduardo Afonso
How is UNION implemented?
Does it read from two source files or does it create a temporary file by
reading the N source files/relations and then writing a new temp file which
is then read from?
I could probably spend an hour looking through the source to figure this out
but I figured I would
Hi dear pigs,
I got a problem:
When I use the UNION command to combine several results into one relation at the end
of a Pig script,
some results are sometimes missing from the output.
For example:
union_all_res = union res1, res2, res3, res4;
dump union_all_res; -- Or store union_all_res
What version of pig are you using? Do you have a join in your script?
2011/8/17 唐亮 leont...@gmail.com
Hi dear pigs,
I got a problem:
When I use the UNION command to combine several results into one relation at the end
of a Pig script,
some results are sometimes missing from the output.
For example
? Do relations res1 and res2 actually contain records
in each case?
Thanks,
Thejas
On 8/17/11 12:14 AM, 唐亮 wrote:
Hi dear pigs,
I got a problem:
When I use the UNION command to combine several results into one relation at the end
of a Pig script,
some results are sometimes missing from the output.
For example
2010-05-15,123
2010-05-15,23
2010-05-15,456
2010-05-15,notjames
So I want to join a set of users on either the login or user id.
Here's my users:
123,james,11
234,notjames,11
456,someoneelse,11
So I thought I would be clever and load the user list, union it with itself
to generate
users:
123,james,11
234,notjames,11
456,someoneelse,11
So I thought I would be clever and load the user list, union it with
itself
to generate a relation where each user is represented twice, once by
login,
once by id:
logins = FOREACH users GENERATE LOWER(login
-15,23
2010-05-15,456
2010-05-15,notjames
So I want to join a set of users on either the login or user id.
Here's my users:
123,james,11
234,notjames,11
456,someoneelse,11
So I thought I would be clever and load the user list, union it with
itself
to generate a relation
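
The idea described in this thread (represent each user twice, once keyed by login and once keyed by id, then join on that single key) can be sketched in plain Python. This is only an illustration of the approach under stated assumptions, not the poster's Pig script: the helper names are hypothetical, only the rows visible in the excerpt are used, and the meaning of the third user column is not given in the source, so it is ignored.

```python
# Rows copied from the excerpt. The third user column is carried along but
# unused, since the excerpt does not say what it means.
events = [
    ("2010-05-15", "123"),
    ("2010-05-15", "23"),
    ("2010-05-15", "456"),
    ("2010-05-15", "notjames"),
]
users = [
    ("123", "james", "11"),
    ("234", "notjames", "11"),
    ("456", "someoneelse", "11"),
]


def users_keyed_twice(users):
    """Build the 'union of users with itself': one row keyed by the
    lower-cased login and one row keyed by the id, for every user."""
    by_login = [(login.lower(), (uid, login)) for uid, login, _ in users]
    by_id = [(uid, (uid, login)) for uid, login, _ in users]
    return by_login + by_id


def inner_join(events, keyed_users):
    """Join each event to every user whose key equals the event's reference."""
    lookup = {}
    for key, user in keyed_users:
        lookup.setdefault(key, []).append(user)
    return [
        (date, ref, user)
        for date, ref in events
        for user in lookup.get(ref.lower(), [])
    ]


joined = inner_join(events, users_keyed_twice(users))
for row in joined:
    print(row)
```

With these rows, the event referencing `23` finds no user and drops out of the inner join. One caveat of keying the same relation two ways: if some login happened to equal another user's id, a single event would match both users.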
add new partition to Howl table.
5) Run your pig script either on newly created partition (by using
filters) or on full dataset.
6) Repeat.
Note that all the path munging, schema management, load-union-store is
gone. You are relieved of all that. Howl presents you with a table-like
abstraction