after the union?
Can you try to reproduce this with a script and data that you can share?
If you can open a JIRA with details, that would be even better.
Thanks,
Thejas
On 9/4/12 8:52 AM, Xavier Stevens wrote:
> I'm trying to do a UNION on two datasets with identical schemas
> (k:bytearray, v:chararr
I'm trying to do a UNION on two datasets with identical schemas
(k:bytearray, v:chararray). When using the UNION operator like so:
combined_data = UNION dataset1, dataset2;
I get the following error:
java.lang.RuntimeException: Unexpected data type java.util.ArrayList found in
stream. Note on
Does anyone else think it would make sense to have all operators and
functions listed on a single page somewhere as a reference? Right now
they are split up over the "Pig Latin Basics" and "Built In Functions"
pages.
-Xavier
We're currently running
Pig 0.9.1 with HBase 0.90.4-cdh3u2. I use it every day, so yes, you can :)
Dmitriy Ryaboy, November 17, 2011 8:41 AM
Unless they made backwards incompatible
changes in a bug fix release (highly unlikely), yes.
lulynn_2008, November 17, 2011 6:46 AM
I am plan
Awesome! I was trying to FLATTEN(*) without the TOBAG.
Thanks Thejas.
On 6/2/11 11:52 AM, Thejas M Nair wrote:
> one_word_per_line = FOREACH words GENERATE FLATTEN(TOBAG(*));
>
> -Thejas
>
>
> On 6/2/11 11:38 AM, "Xavier Stevens" wrote:
>
> I'm cu
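For readers following the FLATTEN(TOBAG(*)) suggestion above, here is a minimal plain-Python sketch of what that expression appears to do to each row. It is a model of the semantics only, not Pig code; the helper names (tobag, flatten_bag, one_word_per_line) and the sample rows are made up for illustration.

```python
def tobag(row):
    """Model TOBAG(*): wrap every field of the row in its own one-field tuple."""
    return [(field,) for field in row]


def flatten_bag(bag):
    """Model FLATTEN on a bag: emit one output row per tuple in the bag."""
    for tup in bag:
        yield tup


def one_word_per_line(rows):
    """Apply FLATTEN(TOBAG(*)) to every input row and collect the output rows."""
    out = []
    for row in rows:
        out.extend(flatten_bag(tobag(row)))
    return out


# Hypothetical input: tuples of unknown, varying length.
words = [("pig", "latin"), ("hadoop",)]
result = one_word_per_line(words)
print(result)
```

Calling FLATTEN directly on the tuple would only spread its fields across columns of the same row, which is why the bag wrapper is needed to get one row per item.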
I'm currently trying to write a pig script to output a feature index. Is
there a built-in function for converting an unknown length tuple to
output once for each item in the tuple?
Example code:
raw = LOAD 'hbase://mytable' USING HBaseStorage('data:json') AS
json:chararray;
genmap = FOREACH raw G
Hey John,
If you take a look at mine it looks explicitly for Lists and converts
them to DataBags. I ran into that issue with our data. That said I won't
make any claims that it'll work for all data.
Cheers,
-Xavier
On 4/19/11 12:02 PM, John Hui wrote:
> I'll post my solution in a few hours =)
>
For what it's worth I have one as well. This one uses Jackson to parse
everything.
https://github.com/xstevens/akela/blob/master/src/java/com/mozilla/pig/eval/json/JsonMap.java
On 4/19/11 11:55 AM, Dmitriy Ryaboy wrote:
> YES :)
>
> On Tue, Apr 19, 2011 at 11:49 AM, John Hui wrote:
>
>> I have
Thanks again Alan.
-Xavier
On 3/29/11 3:39 PM, Alan Gates wrote:
> If you turn your ArrayList into a bag it should be happy.
>
> Alan.
>
> On Mar 29, 2011, at 3:34 PM, Xavier Stevens wrote:
>
>> It probably wasn't the null values actually. It didn't like h
It probably wasn't the null values actually. It didn't like having
ArrayList values in a map.
-Xavier
On 3/29/11 2:08 PM, Xavier Stevens wrote:
> So it looks like the problem might be that one of the keys in the map
> has a null value. Is that just not supported in Pig
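Alan's suggestion above (turn the ArrayList into a bag) can be sketched in plain Python as a recursive walk over a parsed map. This is only an illustration under assumptions: a bag is modeled here as a list of one-field tuples, and the names to_bag, normalize, and the sample data are hypothetical. A real UDF would build Pig's own bag and tuple objects instead.

```python
def to_bag(values):
    """Model a bag as a list of one-field tuples, one tuple per list item."""
    return [(normalize(v),) for v in values]


def normalize(value):
    """Recursively replace plain lists with bag-like structures."""
    if isinstance(value, list):
        return to_bag(value)
    if isinstance(value, dict):
        # Maps are kept as maps, but their values are normalized too.
        return {k: normalize(v) for k, v in value.items()}
    # Scalars and None pass through unchanged.
    return value


# Hypothetical parsed JSON record with a list value and a null value.
parsed = {"id": "abc", "tags": ["x", "y"], "extra": None}
converted = normalize(parsed)
print(converted)
```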
So it looks like the problem might be that one of the keys in the map
has a null value. Is that just not supported in Pig or is that a bug?
-Xavier
On 3/29/11 2:02 PM, Xavier Stevens wrote:
> The value is a mixture of types. I'll go through and spit out what types
> it has and get
rstand, and when
> it tries to write it out to the screen it doesn't know how to.
>
> Alan.
>
> On Mar 29, 2011, at 1:34 PM, Xavier Stevens wrote:
>
>> I'm currently getting a really weird error coming from one of my eval
>> functions. It expects a tuple where
I'm currently getting a really weird error coming from one of my eval
functions. It expects a tuple where the first element is a string and
then outputs a Map as a result. I put in some debug code
and I can see the value I get is what I expect and that the resulting
map size is 35 elements. Anyone
I've written a regular expression EvalFunc similar to ExtractAll except
this is called FindAll. It returns a tuple of all strings found that
match the given pattern. The syntax looks like this:
A = FOREACH raw_data GENERATE FindAll(field_str, '[^/]+') AS a_tuple;
I dumped some return tuples whi
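As a rough illustration of what a call like FindAll(field_str, '[^/]+') would be expected to return, here is the same pattern applied with Python's re module. This only models the described behavior (a tuple of every match); it is not the author's Java implementation, and the sample input string is made up.

```python
import re


def find_all(text, pattern):
    """Return a tuple of all non-overlapping matches of pattern in text."""
    return tuple(re.findall(pattern, text))


# Hypothetical input: a slash-delimited path.
segments = find_all("/products/books/fiction", r"[^/]+")
print(segments)
```

The pattern `[^/]+` matches each maximal run of characters that are not a slash, so the result has one element per path segment.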
I'm currently running into an issue where I have a bag of tuples like so:
>DUMP foo;
( {(a,b,c,d,e), (1,2,3,4,5)}, ... , {(f,g,h,i,j), (6,7,8,9,10)} )
Each one of the tuples has the same number of fields. So I try to
flatten the structure so I can get just the 1st, 3rd and 4th elements of
each
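The result being asked for above (the 1st, 3rd and 4th fields of every tuple in every bag) can be sketched in plain Python as follows. This models the data shape from the DUMP output and is not Pig code; the project helper is hypothetical, and treating the numeric fields as integers is an assumption.

```python
def project(bag, positions):
    """Keep only the fields at the given zero-based positions of each tuple."""
    return [tuple(tup[i] for i in positions) for tup in bag]


# Two of the bags shown in the DUMP output above.
foo = [
    [("a", "b", "c", "d", "e"), (1, 2, 3, 4, 5)],
    [("f", "g", "h", "i", "j"), (6, 7, 8, 9, 10)],
]

# Zero-based positions 0, 2, 3 correspond to the 1st, 3rd and 4th fields.
picked = [project(bag, (0, 2, 3)) for bag in foo]
print(picked)
```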