I guess you mean to load a bag. Your input file should be:
{(1,2,3),(2,4,5)}
{(2,3,4),(2,3,5)}
And the load statement should be:
z = load 'tmp.txt' as (b:{(a0:int,a1:int,a2:int)});
Daniel
On Thu, Feb 2, 2012 at 2:43 AM, praveenesh kumar wrote:
> Okay, so it's weird.
>
> I was able to run a pig query
Hi, all
Our data format for maps is Key:Value|Key:Value. How can I load this
data into a map type? Can Pig define the map delimiter like Hive does?
thanks.
Okay, so how can I make use of the -schema option with PigStorage?
Suppose my JSON schema is:
{
"name":"Student_Data",
"properties":
{
"id":
{
"type":"INTEGER",
"description":"Student id"
Thank you for your quick response.
At 2012-02-06 14:46:46,"Dmitriy Ryaboy" wrote:
>It should work with hadoop-1.0 and hbase 0.90.*. Zookeeper is only shipped
>as part of hbase. Pig does not use it directly, it's
>a transitive dependency via hbase.
>
>2012/2/5 lulynn_2008
>
>> Hello,
>> I ha
There is no asynchronous API for Pig. However, Pig does have a
notification mechanism (See PigRunner.run), you can create a separate
thread to simulate the asynchronous call.
Daniel
On Fri, Feb 3, 2012 at 12:34 AM, Michael Lok wrote:
> Hi folks,
>
> I was wondering if it's possible to submit reg
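The notification-based workaround above can be sketched in Python: a blocking run call (a stand-in here, not the actual PigRunner.run signature) is executed on a worker thread so the caller gets control back immediately and collects the result later.

```python
import threading

def run_blocking_job(script):
    # Stand-in for a blocking call such as PigRunner.run;
    # here it just returns a fake status string.
    return "JOB_FINISHED: " + script

class AsyncJob:
    """Simulate an asynchronous submit by running the blocking call in a thread."""
    def __init__(self, script):
        self.result = None
        self._thread = threading.Thread(target=self._run, args=(script,))
        self._thread.start()           # returns immediately

    def _run(self, script):
        self.result = run_blocking_job(script)

    def wait(self):
        self._thread.join()            # block only when the result is needed
        return self.result

job = AsyncJob("wordcount.pig")
status = job.wait()
```

In a real setup the thread body would call PigRunner.run with a progress notification listener, and the listener callbacks would replace polling on wait().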
It's a json serialization of the Pig schema object, and isn't really meant
to be created by hand.
Patches to make it more human-friendly would be quite welcome.
D
On Sun, Feb 5, 2012 at 10:35 PM, praveenesh kumar wrote:
> Thanks,
> I was also looking for -schema option in PigStorage.
> But Can a
It should work with hadoop-1.0 and hbase 0.90.*. Zookeeper is only shipped
as part of hbase. Pig does not use it directly, it's
a transitive dependency via hbase.
2012/2/5 lulynn_2008
> Hello,
> I have a question about pig-0.9.1:
> Could pig-0.9.1 work with hadoop-1.0.0 and hbase-0.90.5? I plan
Hello,
I have a question about pig-0.9.1:
Could pig-0.9.1 work with hadoop-1.0.0 and hbase-0.90.5? I planned to verify
this by running the unit tests. Please give your suggestions.
Besides, I found zookeeper in pig, my questions are:
--what is zookeeper used for in pig?
--is zookeeper used for pig mainline funct
Thanks,
I was also looking for -schema option in PigStorage.
But can anyone explain how we can define that JSON schema file?
Some tutorial/small example would be very helpful.
Praveenesh
On Mon, Feb 6, 2012 at 11:55 AM, Dmitriy Ryaboy wrote:
> It's pretty straightforward, that's why the LoadMet
Check
https://cwiki.apache.org/confluence/display/PIG/FAQ#FAQ-Q%3AIloaddatafromadirectorywhichcontainsdifferentfile.HowdoIfindoutwherethedatacomesfrom%3F
On Thu, Feb 2, 2012 at 5:11 PM, Ranjan Bagchi wrote:
> Hi,
>
> I've a bunch of [for example] apache logfiles that I'm searching through. I
>
It's pretty straightforward, that's why the LoadMetadata interface exists.
You just have to implement it and translate however you store the schema to
a Pig Schema object.
PigStorageSchema will read a json file that describes the schema, you can
look at how that's done there (actually, PigStorage
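As a rough illustration of the translation step (the file layout below is the hand-written description from earlier in this thread, not the real .pig_schema serialization, and to_pig_fields is a hypothetical helper), a JSON description can be turned into the (name, type) pairs a LoadMetadata implementation would hand to Pig:

```python
import json

# Hand-written schema description, mirroring the Student_Data example
# in this thread. This is NOT the real .pig_schema on-disk format.
schema_json = """
{
  "name": "Student_Data",
  "properties": {
    "id":   {"type": "INTEGER",   "description": "Student id"},
    "name": {"type": "CHARARRAY", "description": "Student name"}
  }
}
"""

def to_pig_fields(text):
    """Translate the JSON description into (name, type) pairs,
    the kind of mapping a LoadMetadata implementation would build."""
    doc = json.loads(text)
    return [(field, spec["type"].lower())
            for field, spec in doc["properties"].items()]

fields = to_pig_fields(schema_json)
```

The real .pig_schema file is Pig's serialized Schema object, so in practice you would construct a Schema programmatically rather than writing the JSON by hand.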
I think the intent is to behave the same way as the Pig "matches" operator
(which, unsurprisingly, uses the Java matches method).
RegexExtractAll becomes quite confusing if it means "extract all matched
subexpressions of the first match of the expression" (one might expect
"all" to refer to all ma
No, there is no ONERROR handle right now.
Daniel
On Sat, Feb 4, 2012 at 7:11 PM, Russell Jurney wrote:
> Did ONERROR ever get built? I have a few bad datetimes out of many failing
> to parse, and I don't want my entire pig script dying because I lost a few
> rows.
>
> http://wiki.apache.org/pig
Seems like a bug in jython:
>>> import time
>>> tuple_time = time.strptime('2006-10-16T08:19:39', "%Y-%m-%dT%H:%M:%S")
>>> tuple_time.tm_hour
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'tuple' object has no attribute 'tm_hour'
>>> tuple_time[3]
8
Change return str(tu
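A workaround for the Jython issue above: struct_time is a 9-element tuple in both implementations, so indexing positionally (index 3 is the hour) works even where the named attributes are missing.

```python
import time

tuple_time = time.strptime('2006-10-16T08:19:39', "%Y-%m-%dT%H:%M:%S")

# Positional access works in both CPython and Jython:
hour = tuple_time[3]      # tm_hour
minute = tuple_time[4]    # tm_min
second = tuple_time[5]    # tm_sec
```
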
To answer my own question, this is because the schemas differ. The schema
in the working case has a named tuple via AvroStorage. Storing to Mongo
works when I name the tuple:
...
sent_topics = FOREACH froms GENERATE FLATTEN(group) AS (from, to),
pairs.subject AS pairs:bag {column:tuple (subject:
sent_topics = LOAD '/tmp/pair_titles.avro' USING AvroStorage();
STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING
MongoStorage();
That works. Why is it the case that MongoStorage only works if the
intermediate processing doesn't happen? Strangeness.
On Sun, Feb 5, 2012 at 12:31 AM,
Looks like this is a Jython bug.
Btw, afaik, the return type of this function will be a bytearray if the
decorator is not specified.
Thanks,
Aniket
On Sat, Feb 4, 2012 at 9:39 PM, Russell Jurney wrote:
> Why am I getting tuple objects in my Python UDFs? This isn't how the
> examples work.
>
> Error:
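A minimal sketch of the decorator point above. The stub below stands in for Pig's real outputSchema, which is only available inside a Jython UDF script run by Pig; here it just records the declared schema so the idea is visible.

```python
# Stand-in for Pig's outputSchema decorator: it attaches the declared
# schema string to the function. Inside Pig, the real decorator tells
# Pig the return type; without it, the result is treated as bytearray.
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap

@outputSchema("hour:int")
def extract_hour(timestamp):
    import time
    # Positional indexing, per the Jython workaround earlier in the thread.
    return time.strptime(timestamp, "%Y-%m-%dT%H:%M:%S")[3]
```
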