Re: Passing schema inside Load function

2012-02-06 Thread Dmitriy Ryaboy
The integer values for types come from org.apache.pig.data.DataType On Mon, Feb 6, 2012 at 1:17 AM, praveenesh kumar wrote: > Yeah I tried that - > Here's what I get for a small sample data : > > { > "fields": > [ >{"name":"name","type":55,"description":"autogenerated from > Pig
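
For anyone decoding the integers in a .pig_schema file by hand, here is a minimal sketch of the lookup. It copies a handful of constants from org.apache.pig.data.DataType into a local table so it runs without Pig on the classpath. The class name PigTypeCodes and the method nameOf are made up for illustration and are not part of Pig. With Pig available, DataType.findTypeName(byte) should give the same answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Pig: a local copy of a few type codes
// from org.apache.pig.data.DataType, for reading .pig_schema files by eye.
final class PigTypeCodes {
    private static final Map<Integer, String> NAMES = new LinkedHashMap<>();
    static {
        NAMES.put(5, "boolean");
        NAMES.put(10, "int");
        NAMES.put(15, "long");
        NAMES.put(20, "float");
        NAMES.put(25, "double");
        NAMES.put(50, "bytearray");
        NAMES.put(55, "chararray");
        NAMES.put(100, "map");
        NAMES.put(110, "tuple");
        NAMES.put(120, "bag");
    }

    // Returns the Pig Latin type word for a numeric code, or a marker
    // string for codes this local table does not cover.
    static String nameOf(int code) {
        return NAMES.getOrDefault(code, "unknown(" + code + ")");
    }

    public static void main(String[] args) {
        // The two codes that appear in the sample from this thread.
        System.out.println("name: " + nameOf(55));
        System.out.println("age: " + nameOf(10));
    }
}
```

So "type":55 in the sample is a chararray field and "type":10 is an int field.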

Re: Passing schema inside Load function

2012-02-06 Thread praveenesh kumar
Yeah I tried that - Here's what I get for a small sample data : { "fields": [ {"name":"name","type":55,"description":"autogenerated from Pig Field Schema","schema":null}, {"name":"age","type":10,"description":"autogenerated from Pig Field Schema","schema":null},

Re: Passing schema inside Load function

2012-02-06 Thread Dmitriy Ryaboy
It reads the schema file *it creates*. So, you process some data, store it, then read it back later, and the schema is back. Like I said, the json is not very human-readable -- the types are integers rather than words like "chararray", etc. Try saving something and check out the .pig_schema file t

Re: Passing schema inside Load function

2012-02-05 Thread praveenesh kumar
Okay, so how can I make use of the -schema option with PigStorage? Suppose my JSON schema is - { "name":"Student_Data", "properties": { "id": { "type":"INTEGER", "description":"Student id"

Re: Passing schema inside Load function

2012-02-05 Thread Dmitriy Ryaboy
It's a json serialization of the Pig schema object, and isn't really meant to be created by hand. Patches to make it more human-friendly would be quite welcome. D On Sun, Feb 5, 2012 at 10:35 PM, praveenesh kumar wrote: > Thanks, > I was also looking for -schema option in PigStorage. > But Can a

Re: Passing schema inside Load function

2012-02-05 Thread praveenesh kumar
Thanks, I was also looking for the -schema option in PigStorage. But can anyone explain how we can define that JSON schema file? A tutorial or small example would be very helpful. Praveenesh On Mon, Feb 6, 2012 at 11:55 AM, Dmitriy Ryaboy wrote: > It's pretty straightforward, that's why the LoadMet

Re: Passing schema inside Load function

2012-02-05 Thread Dmitriy Ryaboy
It's pretty straightforward, that's why the LoadMetadata interface exists. You just have to implement it and translate however you store the schema to a Pig Schema object. PigStorageSchema will read a json file that describes the schema, you can look at how that's done there (actually, PigStorage

Re: Passing schema inside Load function

2012-02-03 Thread praveenesh kumar
Thanks Stan, This would be a great help.. !! I'll try to implement it. :-) Praveenesh On Sat, Feb 4, 2012 at 8:10 AM, Stan Rosenberg < srosenb...@proclivitysystems.com> wrote: > Hi Praveenesh, > > Maybe this will get you started. > > Suppose we have the desired schema parsed and stored in 'map'

Re: Passing schema inside Load function

2012-02-03 Thread Stan Rosenberg
Hi Praveenesh, Maybe this will get you started. Suppose we have the desired schema parsed and stored in 'map' of type LinkedHashMap. The key is your field name, and the value denotes the data type, e.g., 'string', 'int', etc. Now, let's derive pig's schema from this map: Schema schema = new Sc
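
Stan's snippet is cut off in the archive, so the following is not his code. It is a minimal sketch of the same idea with no Pig dependency: walk a LinkedHashMap of field name to type word and emit a Pig schema string of the form "name:chararray, age:int". The class SchemaStringBuilder, the method toPigSchema and the table of accepted type words are assumptions for illustration. With Pig on the classpath, a string like this could presumably be parsed into a Schema object (for example with org.apache.pig.impl.util.Utils.getSchemaFromString) rather than assembling field schemas by hand.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Pig: turns an ordered map of
// field name -> type word into a Pig schema string.
final class SchemaStringBuilder {

    // Maps the type words used in the metadata file to Pig Latin type names.
    // The accepted words here are an assumption; extend as needed.
    private static String pigType(String word) {
        switch (word.toLowerCase(Locale.ROOT)) {
            case "string": return "chararray";
            case "int":    return "int";
            case "long":   return "long";
            case "float":  return "float";
            case "double": return "double";
            default:
                throw new IllegalArgumentException("Unsupported type: " + word);
        }
    }

    // LinkedHashMap iteration follows insertion order, so the fields come
    // out in the same order as the columns were declared.
    static String toPigSchema(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joiner.add(field.getKey() + ":" + pigType(field.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("name", "string");
        map.put("age", "int");
        System.out.println(toPigSchema(map));
    }
}
```

Keeping the map ordered matters because a Pig schema is positional: the first field describes the first column, and so on.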

Re: Passing schema inside Load function

2012-02-03 Thread praveenesh kumar
Thanks Stan, I was going through these only. I was wondering whether there is an easy way to do it, or whether I am reading something wrong. Now I will focus on what you have suggested, but I hope there is some easy solution to my problem. Praveenesh On Sat, Feb 4, 2012 at 4:12 AM, Stan Rosenberg < srosenb...@procl

Re: Passing schema inside Load function

2012-02-03 Thread Stan Rosenberg
Hi Praveenesh, Assuming you have already read these: http://ofps.oreilly.com/titles/9781449302641/load_and_store_funcs.html http://pig.apache.org/docs/r0.9.2/udf.html#load-store-functions my next step would be to peruse the source code of some existing loaders, e.g., PigStorage. Best, stan O

Re: Passing schema inside Load function

2012-02-03 Thread praveenesh kumar
Thanks Stan, If you were facing this kind of scenario, how would you have proceeded? Can you give me some pointers on how to write a custom loader, or some good tutorials on it? What is the current practice for solving the above scenario in Pig? Praveenesh On Sat, Feb 4, 2012 at 4:02 AM, Stan

Re: Passing schema inside Load function

2012-02-03 Thread Stan Rosenberg
My hunch is you'll have to write a custom loader, but I'll let the experts chime in. E.g., AvroStorage loader can parse the schema from a json file passed to it via the constructor. I don't think PigStorage has the same option. stan On Fri, Feb 3, 2012 at 7:35 AM, praveenesh kumar wrote: > Hey

Passing schema inside Load function

2012-02-03 Thread praveenesh kumar
Hey guys, I am new to Pig. I was wondering whether it is possible to pass a schema in the Pig load statement when loading data for the first time. Suppose I have a huge dataset containing around 100 columns. Is there a way to pass a schema defined in some other file (some kind of meta file) into p