The "LoadStoreMigrationGuide" page has been changed by PradeepKamath.
http://wiki.apache.org/pig/LoadStoreMigrationGuide?action=diff&rev1=35&rev2=36

--------------------------------------------------

  ||No equivalent method ||setUDFContextSignature() ||!LoadFunc ||This method 
will be called by Pig both in the front end and back end to pass a unique 
signature to the Loader. The signature can be used to store into the 
!UDFContext any information which the Loader needs to store between various 
method invocations in the front end and back end. A use case is to store 
!RequiredFieldList passed to it in 
!LoadPushDown.pushProjection(!RequiredFieldList) for use in the back end before 
returning tuples in getNext(). The default implementation in !LoadFunc has an 
empty body. This method will be called before other methods.||
  ||No equivalent method ||relativeToAbsolutePath() ||!LoadFunc ||Pig runtime 
will call this method to allow the Loader to convert a relative load location 
to an absolute location. The default implementation provided in !LoadFunc 
handles this for !FileSystem locations. If the load source is something else, 
the loader implementation may choose to override this method. ||
  ||determineSchema() ||getSchema() ||!LoadMetadata ||determineSchema() was 
used by old code to ask the loader to provide a schema for the data returned by 
it - the same semantics are now achieved through getSchema() of the 
!LoadMetadata interface. !LoadMetadata is an optional interface for loaders to 
implement - if a loader does not implement it, this will indicate to the pig 
runtime that the loader cannot return a schema for the data ||
- ||fieldsToRead() ||pushProject() ||!LoadPushDown ||fieldsToRead() was used by 
old code to convey to the loader the exact fields required by the pig script 
-the same semantics are now achieved through pushProject() of the !LoadPushDown 
interface. !LoadPushDown is an optional interface for loaders to implement - if 
a loader does not implement it, this will indicate to the pig runtime that the 
loader is not capable of returning just the required fields and will return all 
fields in the data. If a loader implementation is able to efficiently return 
only required fields, it should implement !LoadPushDown to improve query 
performance ||
+ ||fieldsToRead() ||pushProjection() ||!LoadPushDown ||fieldsToRead() was used 
by old code to convey to the loader the exact fields required by the pig script - 
the same semantics are now achieved through pushProjection() of the !LoadPushDown 
interface. !LoadPushDown is an optional interface for loaders to implement - if 
a loader does not implement it, this will indicate to the pig runtime that the 
loader is not capable of returning just the required fields and will return all 
fields in the data. If a loader implementation is able to efficiently return 
only required fields, it should implement !LoadPushDown to improve query 
performance ||
  ||No equivalent method ||getInputFormat() ||!LoadFunc ||This method will be 
called by Pig to get the !InputFormat used by the loader. The methods in the 
!InputFormat (and underlying !RecordReader) will be called by pig in the same 
manner (and in the same context) as by Hadoop in a map-reduce java program. 
'''If the !InputFormat is a hadoop packaged one, the implementation should use 
the new API based one under org.apache.hadoop.mapreduce. If it is a custom 
!InputFormat, it should be implemented using the new API in 
org.apache.hadoop.mapreduce'''||
  ||No equivalent method ||setLocation() ||!LoadFunc ||This method is called by 
Pig to communicate the load location to the loader. The loader should use this 
method to communicate the same information to the underlying !InputFormat. This 
method is called multiple times by pig, so implementations should ensure that 
repeated calls cause no inconsistent side effects. ||
  ||bindTo() ||prepareToRead() ||!LoadFunc ||bindTo() was the old method which 
would provide an !InputStream among other things to the !LoadFunc. The 
!LoadFunc implementation would then read from the !InputStream in getNext(). In 
the new API, reading of the data is through the !InputFormat provided by the 
!LoadFunc. So the equivalent call is prepareToRead() wherein the !RecordReader 
associated with the !InputFormat provided by the !LoadFunc is passed to the 
!LoadFunc. The !RecordReader can then be used by the implementation in 
getNext() to return a tuple representing a record of data back to pig. ||
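
The call sequence described in the rows above can be illustrated with a small, self-contained sketch. It does not use Pig or Hadoop classes: SketchLoader, CONTEXT, and the comma-separated record format are hypothetical stand-ins, and only the method names mirror the table. In real Pig the front end and back end run separately and the stored information travels through the !UDFContext; here a static map plays that role purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class LoaderLifecycleSketch {

    // Hypothetical stand-in for the UDFContext: required fields keyed by signature.
    static final Map<String, List<Integer>> CONTEXT = new HashMap<>();

    // Hypothetical loader; it only mirrors the method names from the table.
    static class SketchLoader {
        private String signature;
        private String location;
        private Iterator<String> reader; // stand-in for a RecordReader

        // Called before other methods: remember the key used for CONTEXT.
        void setUDFContextSignature(String signature) {
            this.signature = signature;
        }

        // Front end: store the required column indexes under the signature.
        void pushProjection(List<Integer> requiredFields) {
            CONTEXT.put(signature, new ArrayList<>(requiredFields));
        }

        // May be called several times: a plain assignment keeps repeats harmless.
        void setLocation(String location) {
            this.location = location;
        }

        String getLocation() {
            return location;
        }

        // Back end: receive the reader that getNext() will pull records from.
        void prepareToRead(Iterator<String> reader) {
            this.reader = reader;
        }

        // Return the next record reduced to the required fields, or null at the end.
        List<String> getNext() {
            if (!reader.hasNext()) {
                return null;
            }
            String[] fields = reader.next().split(",");
            List<Integer> required = CONTEXT.get(signature);
            if (required == null) {
                return Arrays.asList(fields); // no projection was pushed
            }
            List<String> tuple = new ArrayList<>();
            for (int index : required) {
                tuple.add(fields[index]);
            }
            return tuple;
        }
    }

    public static void main(String[] args) {
        SketchLoader loader = new SketchLoader();
        loader.setUDFContextSignature("load-1");
        loader.pushProjection(Arrays.asList(0, 2));
        loader.setLocation("/data/in");
        loader.setLocation("/data/in"); // repeated call, same resulting state
        loader.prepareToRead(Arrays.asList("a,b,c", "d,e,f").iterator());
        System.out.println(loader.getNext()); // [a, c]
        System.out.println(loader.getNext()); // [d, f]
        System.out.println(loader.getNext()); // null
    }
}
```

The point of the sketch is the ordering: the signature arrives first, the projection is stored under it, and getNext() looks it up again before returning a tuple.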
