Hi Eyal,
1. I have created a ticket: PIG-4943 <https://issues.apache.org/jira/browse/PIG-4943>
Thanks and Regards,
Sarath
From: Eyal Allweil
Reply-To: Eyal Allweil
Date: Monday 4 July 2016 at 16:05
To: "user@pig.apache.org" , Sarath Sasidharan
Subject: Re: Schema iss
I can replicate these results on Pig 0.14.
Did anyone open a Jira issue for this?
On Thursday, March 10, 2016 12:24 PM, Sarath Sasidharan
wrote:
Hi All,
I have a script which stores 2 relations with different schema using
CSVExcelStorage.
The issue I see is that the script picks up the last STORE function and
takes the schema from it and applies it to all STORE functions, overriding the
previous stores' schemas. Is this a known issue?
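A minimal sketch of the reported scenario (the paths and schemas below are hypothetical, not from the original report):

```pig
-- Two relations with different schemas, both stored via CSVExcelStorage:
a = LOAD 'in_a' AS (id:int, name:chararray);
b = LOAD 'in_b' AS (id:int, city:chararray, zip:chararray);
STORE a INTO 'out_a' USING org.apache.pig.piggybank.storage.CSVExcelStorage(
    ',', 'NO_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
STORE b INTO 'out_b' USING org.apache.pig.piggybank.storage.CSVExcelStorage(
    ',', 'NO_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
-- per the report, the schema of the last STORE leaks into both outputs
```

This is the shape of script tracked in PIG-4943.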
I'm trying to define an output schema which should be a tuple that contains
another two tuples, i.e. `stats:tuple(c:tuple(),d:tuple())`.
The code below doesn't work as intended. It somehow produces the
structure:
stats:tuple(b:tuple(c:tuple(),d:tuple()))
Below is output produced by describe
internal logic
inside UDF.
My problem is that I can't pass any temporary variable from the exec()
method into the outputSchema(Schema input) method, which is part of the UDF
class. The temporary variable contains information needed to generate a valid
output schema inside outputSchema(), e.g. the size of the tuples.
; generate COUNT_STAR($1) as TARGET;
>> };
>> d = limit c 10;
>> e = foreach d generate TARGET;
>> dump e;
>>
>> end output ...
>> (1)
>>
>>
>> *Cheers !!*
>> Arvind
>>
> On Sat, Nov 14, 2015 at 12:18 AM, Christopher Maier <
> christopher.ma...@gm.com> wrote:
>
Hi,
I haven't received a response on this, has anyone had a chance to reproduce the
error?
Thanks,
Kit
From: Christopher Maier
Sent: Tuesday, October 20, 2015 4:02 PM
To: 'user@pig.apache.org'
Subject: Schema changes based on subquery
Hi,
I am getting the wrong counts from Pig for a certain query. I have simplified
the query to what's below, which shows as a failure instead of a wrong count.
Why does the first line of the subquery cause the output schema to revert to
the same as the input schema? This line should
o collection or from MongoDB, the only exception
I get is that Pig doesn't know what the schema is.
I've attached the pig latin script for reference, but it's a pretty simple
count of times one person emails another. I can run the equivalent
map-reduce directly in MongoDB, but the goal he
er 13, 2014 12:51 PM
To: user@pig.apache.org
Subject: Group operator and variable schema (reformatted email)
Hi All,
I have the following question:
Snippet of my sample.txt. The first column is an id; however, each row can have
a variable number of columns.
id1 100 200 300 400 500
id2 10 20 30
id1 800 900
sample = LOAD 'sample.txt' [how should I specify schema here];
sample_grpd = GROUP sample BY $0;
sample_result = FOREACH sample_grpd GENERATE group, FLATTEN(TOBAG([what should go here]))
group by id so that the result is:
id1 100 200 300 400 500 800 900 600 1 2 3 4 5 6 7 8 9
id2 10 20 30 40 50 60 70 80 90
id
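One untested sketch for the question above (alias names are my own): read each row as a single chararray, split off the id with STRSPLIT's limit argument, then group; the tail of each row stays together per id.

```pig
sample     = LOAD 'sample.txt' AS (line:chararray);
split_rows = FOREACH sample GENERATE STRSPLIT(line, '\\s+', 2) AS t;
keyed      = FOREACH split_rows GENERATE (chararray)t.$0 AS id, (chararray)t.$1 AS rest;
grpd       = GROUP keyed BY id;
result     = FOREACH grpd GENERATE group, keyed.rest;  -- bag of per-row tails per id
```

This sidesteps declaring a schema for a variable column count, at the cost of keeping the tail as one string.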
Hi,I would like to create a StoreFunc like MultiStorage but instead of
referencing fields to be added to the output path by index, it references them
by name (it would construct a map between names and indexes based on the schema
of the data to be output). Is there a mechanism for a
I think you could specify a comma as the delimiter in your load statement:
x = load 'file.txt' using PigStorage(',');
You could specify the schema if needed on the way in after the PigStorage call
with "as (a:chararray, b:chararray, ..., n:chararray)".
But if you
Is there any way I can specify the schema of y to be a tuple of a variable
number of chararrays? Something along the lines of y = FOREACH x GENERATE
STRSPLIT(content, ',') as tuple(chararray(*))
2) If I try to do the above in a UDF, how do I create an output schema which
depends on the input? From
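As far as I know there is no `tuple(chararray(*))` schema syntax; an untested workaround is to leave the STRSPLIT result without a declared schema and address its fields by position:

```pig
x = LOAD 'file.txt' AS (content:chararray);
y = FOREACH x GENERATE STRSPLIT(content, ',') AS parts;
z = FOREACH y GENERATE parts.$0, parts.$1;  -- positional access, no declared arity
```

Positions that don't exist in a given row come back as null, so downstream logic needs to tolerate that.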
Hi Lorand,
Thanks for the reply. My use case has around 100 columns and growing,
and I didn't want to make the script look ugly and error-prone with
the definition of a schema for all 100 columns.
My idea was that the UDF would return a tuple for each record with a
self-explanatory schema returned by outputSchema(), which I can use to
write directly into a Hive table with HCatStorer(). The HCatStorer
expects the same name for
Hi,
If you flatten a tuple/bag, Pig will prefix the field with a
disambiguation string ([prefix]::). (See:
http://pig.apache.org/docs/r0.12.0/basic.html#disambiguate).
In your example getSchemaName() returns a generated unique name built
from the classname + first input schema field + a
Hi
I am writing a Pig UDF that returns a Tuple as per
http://wiki.apache.org/pig/UDFManual . I want the output tuple to have
a particular schema, Say {name:chararray, age:int} after I FLATTEN it
out after using the UDF.
As per the UDFManual, the method below
public Schema outputSchema(Schema
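A hedged usage sketch (`MyUDF` and `com.example` are placeholder names, not from the thread): if the UDF's outputSchema() declares `{name: chararray, age: int}`, the fields are usable by name after FLATTEN, and an AS clause can also impose the names explicitly.

```pig
DEFINE MyUDF com.example.MyUDF();
B = FOREACH A GENERATE FLATTEN(MyUDF($0)) AS (name:chararray, age:int);
C = FOREACH B GENERATE name, age;  -- plain field names, no disambiguation needed
```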
I have input data in the format below to represent the outgoing links from a source
URL, i.e. source URL 1 has 2 and 3 as outgoing URLs:
1 2 3
2 3 4
3 4
4 1
And I would like to load it into Pig as below, so the output would be:
(1,(2,3))
(2,(3,4))
(3,(4))
(4,(1))
Can I do this using default AS schema or Do I have to write a custom loader
function.
Thanks,
Akoppula
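An untested sketch of one answer: the default AS schema cannot express a tuple whose arity varies per row, so either write a custom LoadFunc or split each line manually (alias names here are my own):

```pig
raw   = LOAD 'links.txt' AS (line:chararray);
split = FOREACH raw GENERATE STRSPLIT(line, '\\s+', 2) AS t;
links = FOREACH split GENERATE (chararray)t.$0 AS src,
                               STRSPLIT((chararray)t.$1, '\\s+') AS targets;
-- intended to produce rows like (1,(2,3)); rows with no targets need null handling
```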
Hi Jamin,
>> Out of bound access. Trying to access non-existent column: 8. Schema
activityID:chararray,reqHost:chararray,rspPylByt:long
pylByt:long,reqTime:double,reqDur:double,rspTime:double,rspDur:double has 8
column(s).
Did you try disabling the ColumnMapKeyPrune optimization? You can do so by
starting Pig with "-t ColumnMapKeyPrune".
I found that PIG gets confused about the schema after a complicated but correct
nested FOREACH operation.
My script is attached with no modification and it gives error messages below:
Picked up _JAVA_OPTIONS: -Xmx1G
2014-03-24 13:05:18,662 [main] INFO org.apache.pig.Main - Apache Pig version
ed by a limit) yield the
following exception:
java.io.IOException: Could not find schema in UDF context
    at org.apache.pig.builtin.JsonStorage.prepareToWrite(JsonStorage.java:125)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat
ut.seq' USING $SEQFILE_LOADER ( '-c
> $TEXT_CONVERTER', '-c $TEXT_CONVERTER') AS (key: chararray, value:
> chararray);
> UserItemAssoc = FOREACH A GENERATE myparser.myUDF(key, value) AS {(userid:
> chararray, itemtid: How to specify this???)};
> If I want to specify the schem
myparser.myUDF(key, value) AS {(userid:
chararray, itemtid: How to specify this???)};
If I want to specify the schema in the AS clause, how do I do it since the
number of fields will differ in each row? Is it possible to somehow do this
dynamically?
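A hedged sketch of one workaround: when the field count differs per row, drop the AS clause entirely and let the UDF's outputSchema() (or positional references) carry the typing downstream.

```pig
UserItemAssoc = FOREACH A GENERATE FLATTEN(myparser.myUDF(key, value));
-- fields are then addressed as $0, $1, ... rather than by a fixed declared schema
```

Pig cannot merge a UDF schema with an AS clause of a different, varying arity, so omitting the clause avoids the mismatch.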
Raised https://issues.apache.org/jira/browse/PIG-3646
Until it gets fixed though, are there some Pig internal APIs that I can use
to get a hold of the schema? As I've mentioned in my initial email, I can't
seem to find a way to get access to the full declaration - even the POStore
contai
Like Alan said in the thread that you're referring to, user-defined schema
in the as-clause is not available within a LoadFunc. HBaseStorage is
different since its schema is passed via a constructor parameter. As far as
I know, most popular Pig storages do not require users to define schema
Thanks for the pointers regarding 1).
Any ideas on 2) - namely why only the deferenced schema is available and how to
get a hold of the actual user declaration?
Cheers and Merry Christmas!
On 24/12/2013 1:05 AM, Cheolsoo Park wrote:
As for #1, pushdownProject() is called only if it
Hi,
I'm trying to get a hold of the schema specified for a loader through 'AS'
using Apache Pig 0.12 :
A = LOAD 'pig/tupleartists' USING MyStorage() AS (name: chararray, links:
(url:chararray, picture:chararray));
B = FOREACH A GENERATE name, links.url;
DUMP B;
Forgot to specify the aforementioned thread [1]
[1] http://www.mail-archive.com/user@pig.apache.org/msg06285.html
We had to do that as well.
Meg
On Nov 27, 2013 7:19 AM, "Ruslan Al-Fakikh" wrote:
> In my company we had to write our own Loader/Storer UDFs for this.
Hi,
Is there a way to load a csv file with header as schema? (the header's fields
are the properties of the schema and the other list in the csv file will be in
the schema format)
For example:
Name    last name    age
Noam    lavie        26
Map r
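An untested sketch of one common answer, using piggybank's CSVExcelStorage, which can skip a header row on load; the schema itself is still declared in the AS clause (path and field names are hypothetical):

```pig
data = LOAD 'people.csv'
       USING org.apache.pig.piggybank.storage.CSVExcelStorage(
           ',', 'NO_MULTILINE', 'UNIX', 'SKIP_INPUT_HEADER')
       AS (name:chararray, last_name:chararray, age:int);
```

This skips the header rather than reading it as a schema; deriving the schema from the header itself needs a LoadFunc that implements getSchema(), as discussed later in this thread.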
Hey Johannes!
Have you solved the problem? I also see it.
But I don't see it when I use the schema as a string to AvroStorage
parameter. I see it only when I try to use an external schema file. And if
I specify a non-existent external schema file, the error is the same.
Ruslan
On Tue, O
Hi Pradeep,
Yes, I implemented the outputSchema method and it fixed that issue.
We are also planning to evaluate to store intermediate and final results in
Cassandra.
> Date: Mon, 4 Nov 2013 17:08:56 -0800
> Subject: Re: Java UDF and incompatible schema
> From: pradeep...@gmail.com
This is most likely because you haven't defined the outputSchema method of
the UDF. The AS keyword merges the schema generated by the UDF with the
user specified schema. If the UDF does not override the method and specify
the output schema, it is considered null and you will not be able to u
following script, I get the following error. Any help with
this would be great!
ERROR 1031: Incompatable field schema: declared is
"bag_0:bag{:tuple(id:int,class:chararray,name:chararray,begin:int,end:int,probone:chararray,probtwo:chararray)}",
infered is ":Unknown"
J
Thanks for your answer!
Actually the Avro schema is valid and I can load data with it. The error
message states, that pig has a problem with the Pig schema, which has no
duplicate names.
Johannes
Am 21.10.2013 19:29, schrieb j.barrett Strausser:
> I'd imagine it is having an issue
2013-10-21 18:50:15,554 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2116:
Output Location Validation Failed for:
'hdfs://path/to/output More info to follow:
Pig Schema contains a name that is not allowed in Avro
Details at logfile: pig_1382374188771.log
Logfile contains:
Pig Schem
Thanks Prashant, Will try this out.
-Original Message-
From: Prashant Kommireddi [mailto:prash1...@gmail.com]
Sent: Thursday, September 26, 2013 1:47 PM
To: user@pig.apache.org
Subject: Re: Loading a custom schema
Hi Siddhi,
PigStorage by default looks for ".pig_schema" under the input dir. If you
would like to use a different filename, you would have to override
PigStorage.getSchema(String location, Job job) and define a custom
JsonMetadata object. You might want to start here.
Using a s
Hi,
I am trying to load a tsv file using PigStorage
input_data = load 'input.tsv' using PigStorage('\t','-schema');
This loads the tsv file as per the .pig_schema file present in the input folder.
Is there any way to load the schema from a custom path? For ex, say
alSchema. Here is how you can verify it.
1) Debug Pig main in eclipse.
2) Set a breakpoint in the LogicalFieldSchema constructor.
3) Run "a = load '/dev/null' as (i:int, t:tuple(j:int));" on grunt.
Thanks,
Cheolsoo
On Thu, Aug 8, 2013 at 2:42 PM, Keren Ouaknine wrote:
>
Hi,
A schema in Pig (LogicalSchema.java) is defined as an array list of
LogicalFieldSchema whose class members are:
- String alias
- byte type
- long uid
- LogicalSchema schema
I am wondering why is LogicalFieldShema containing a LogicalSchema member?
My guess so far is that perhaps there
point_type;
STORE projPivotsWithEndPoints INTO '$validPivots' USING
org.apache.pig.piggybank.storage.avro.AvroStorage('index', '3', 'schema',
'{"name": "valid_pivots", "doc": "version 0.0.1", "type": "
int,                 --10
is_active: boolean,  --11
avg_speed: double,   --12
distance: int,       --13
not_valid: int);     --14
}
You can see that relation *routePivots* has
Have you considered using a Pig schema?
On Jul 9, 2013, at 12:32 PM, "Kimmel, Chad" wrote:
> Hi, what I am trying to do is read the headers from the first line as the
> field names into the schema. For instance, given the following tab
> deliminated file
>
> --sa
Hi, what I am trying to do is read the headers from the first line as the field
names into the schema. For instance, given the following tab deliminated file
-- samplefile.txt --
Name    Job         Age
Chad    Engineer    23
Mike    Stats       34
Chris   IT          25
Instead of deleting the first
Cool thanks!
On 4/30/13 9:10 PM, "Cheolsoo Park" wrote:
Hi Steven,
The new AvroStorage will let you specify the input schema:
https://issues.apache.org/jira/browse/PIG-3015
In fact, somebody made the same request in a comment of the jira that I am
copying and pasting below:
Furthermore, we occasionally have issues with pig jobs picking the old
Resending now that I am subscribed :)
On 4/25/13 4:01 PM, "Enns, Steven" wrote:
Hi everyone,
I would like to override the input schema in AvroStorage to make a pig
script robust to schema evolution. For example, suppose a new field is
added to an avro schema with a default value of null. If the input to a
pig script using this field includes both old and new data
import org.apache.pig.impl.util.CastUtils;
import org.apache.pig.impl.util.Utils;
import org.apache.pig.newplan.logical.relational.LogicalSchema;

import java.io.IOException;

public class CSPigUtils {
    public static Object getPigRepresentation(String schema, String data)
            throws IOException {
        Utf8StorageConv
m elements "{(})#," which
isn't the case (ie, a serialized json chararray for a field). So I was
hoping for a more OTS solution using existing classes and methods given the
String and its Schema a priori.
Thank you for your help, and I'll keep this post updated on my progress
toward
against a functional requirement in the UDF.
The UDFs I am testing are part of a larger ETL testing initiative I have
been undertaking. To ensure that the various states of legacy data are
correctly extracted and transformed into a Pig context, I am creating
specific JUnit tests per each UDF containing specific use cases as testing
methods.
Motivation to use String inputs for the Data Objects and Schema Objects is
the improvement on the conventional approach - creating Java Objects and
adding and appending nested Objects to
esting of my UDFs.
>
> -Dan
>
> On Tue, Mar 19, 2013 at 11:27 AM, Jonathan Coveney wrote:
>
> > how was string_databag generated?
> >
> >
> > 2013/3/19 Dan DeCapria, CivicScience
> >
> > > Expanding upon this, the follow
Such that this string_input matches the Schema:
String string_databag = "{(apples,(banana,1024),2048)}";
String string_schema =
"b1:bag{t1:tuple(a:chararray,t2:tuple(b:chararray,d:long),f:long)}";
Schema schema = Utils.getSchemaFromString(string_schema);
how was string_databag generated?
2013/3/19 Dan DeCapria, CivicScience
Expanding upon this, the following use case's Schema Object can be resolved
from inputs:
String string_databag = "{(a,(b,d),f)}";
String string_schema =
"b1:bag{t1:tuple(a:chararray,t2:tuple(b:chararray,d:long),f:long)}";
Schema schema = Utils.getSchemaFromString(string_schema);
Thank you for your reply.
The problem is I cannot find a methodology to go from a String
representation of a complex data type to a nested Object of pig DataTypes.
I looked over the pig 0.10.1 docs, but cannot find a way to go from String
and Schema to pig DataType Object.
For context, I am
In Java, I am trying to convert a DataBag from it's String representation
with its schema String to a valid DataBag Object:
String databag_string = "{(apples,1024)}";
String schema_string = "b1:bag{t1:tuple(a:chararray,b:long)}";
I've tried implementing something
astAlias());
...
Thanks,
Jeff
Hi, Jeff:
Reply inline.
On Tue, Mar 5, 2013 at 11:18 AM, Jeff Yuan wrote:
> I have a couple of questions regarding job result and schema. The
> context is that I'm trying to create a custom entry point for Pig that
> takes a script, executes it, and always stores the last de
I have a couple of questions regarding job result and schema. The
context is that I'm trying to create a custom entry point for Pig that
takes a script, executes it, and always stores the last declared
alias/variable in a file. Would appreciate any insights to the 2
questions I have below o
Each field needs to be dereferenced individually:
A::name AS name, A::age AS age...
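A sketch of that suggestion (relations A, B and their fields are hypothetical): dereference and rename each joined field once, right after the join, so the rest of the script can use plain names.

```pig
C = JOIN A BY id, B BY id;
D = FOREACH C GENERATE A::id AS id, A::name AS name, A::age AS age;
-- downstream operators reference id/name/age with no A:: prefix
```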
On Saturday, March 2, 2013, Michael West wrote:
I would like to set the schema after joining so that I do not have to always
dereference. However, I receive an error when I try this. How can I resolve
this error?
pig version 0.11
Error message:
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable field
schema
Take a look at the org.apache.pig.builtin.PigStorage.getSchema(..) method.
You can subclass PigStorage and implement that method to read the schema
from the first line of the file. Or you can just implement the LoadMetaData
in the loader you're using.
On Tue, Jan 15, 2013 at 2:27 PM,
Actually, I'll probably just end up computing positions to use, rather
than pasting in a schema, but the general point is that I'd love to do
it some other way, because little hacks like these make my data
pipeline feel fragile.
I'm willing to write some Java if anyone could point
plug that schema string into the AS portion of
my LOAD statement. Then I'll project columns I want and manually
typecast them.
Is there a better, simple way?
-Mason
Tim, can you open a github issue with EB about compiling against 0.10?
I think this is an easy fix.
On Tue, Jan 8, 2013 at 9:38 AM, Alan Gates wrote:
> I would open a new JIRA, since 1914 is focussed on building an alternative
> that discovers schema, while you are wanting to impro
...' using org.apache.pig.piggybank.storage.avro.AvroStorage();
dump employee;

Schemas :

{
  "type" : "record",
  "name" : "employee",
  "fields" : [
    {"name" : "name",   "type" : "string", "default" : "NU"},
    {"name" : "age",    "type" : "int",    "default" : 0},
    {"name" : "dept",   "type" : "string", "default" : "DU"},
    {"name" : "office", "type" : "string", "default" : "OU"},
    {"name" : "salary", "type" : "float",  "default" : 0.0}
  ]
}
{
  "type" : "record",
  "name" : "employee",
  "fields" : [
    {"name" : "name", "type"
I would open a new JIRA, since 1914 is focussed on building an alternative that
discovers schema, while you are wanting to improve the existing one.
Alan.
On Jan 7, 2013, at 5:02 PM, Tim Sell wrote:
> This seems like a bug to me. It makes it risky to work with JSON data
> genera
Sorry, Looks like my suggestion won't help unless you were able to specify
the schema with the original load statement. If the number of fields is ONLY
available at runtime but each row has the same number of fields and you know
the position of the join key, then I have an ugly approach. First, sample
$4 AS sale_month_3,
$5 AS sale_month_4,
$6 AS sale_month_5,
$7 AS sale_month_6,
$8 ..;
I still get the same error when I try to join on this relation.
On Mon, Jan 7, 2013 at 2:27 PM, Jinyuan Zhou wrote:
> If you can load it but join operation need the complete schema, then you
> ca
>> And I use
>>
>> a = LOAD 'input.json' USING JsonLoader('id:int,date:chararray');
>> DUMP a;
>>
>> I get errors when it tries to force the date fields into an integer.
>>
>> Shouldn't this work independent of the or
https://issues.apache.org/jira/browse/PIG-1914
~T
On 7 January 2013 20:24, Alan Gates wrote:
> Currently the JsonLoader does assume ordering of the fields. It does not do
> any name matching against the given schema to find the right field.
>
> Alan.
>
> On Jan 7, 2013, at 11:56 AM, Ti
If you can load it but join operation need the complete schema, then you
can try do a generate statement to project your original relation to
produce the one you can define schema for all fields.
On Mon, Jan 7, 2013 at 2:19 PM, Chan, Tim wrote:
Is it possible to declare a schema when doing a LOAD for data in which you
do not know the total number of columns?
For instance. I know the data contains 6 or more columns. These columns are
of the same data type.
I basically want to join this data with another data set, but I was getting
the
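An untested sketch of the projection workaround discussed in this thread: load without a schema, then name and type only the leading columns needed for the join (`other` is a hypothetical second relation, and the column names are my own):

```pig
raw    = LOAD 'data.txt';                                   -- schema left undeclared
named  = FOREACH raw GENERATE (chararray)$0 AS key, $1 AS c1, $2 AS c2;
joined = JOIN named BY key, other BY key;
```

The trailing, unknown columns are simply not projected, so the join only ever sees the fixed prefix.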
input.json' USING JsonLoader('id:int,date:chararray');
> DUMP a;
>
> I get errors when it tries to force the date fields into an integer.
>
> Shouldn't this work independent of the ordering of the schema fields?
> Json writers generally don't make guarante
Currently the JsonLoader does assume ordering of the fields. It does not do
any name matching against the given schema to find the right field.
Alan.
On Jan 7, 2013, at 11:56 AM, Tim Sell wrote:
> When using JsonLoader with Pig 0.10.0
>
> if I have an input.json file that looks