svn commit: r896134 - in /hadoop/pig/trunk: CHANGES.txt src/docs/src/documentation/content/xdocs/piglatin_reference.xml src/docs/src/documentation/content/xdocs/piglatin_users.xml

2010-01-05 Thread olga
Author: olga
Date: Tue Jan  5 17:18:51 2010
New Revision: 896134

URL: http://svn.apache.org/viewvc?rev=896134&view=rev
Log:
PIG-1175: Pig 0.6 Docs - Store v. Dump (chandec via olgan)

Modified:
hadoop/pig/trunk/CHANGES.txt

hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml
hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_users.xml

Modified: hadoop/pig/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/hadoop/pig/trunk/CHANGES.txt?rev=896134&r1=896133&r2=896134&view=diff
==
--- hadoop/pig/trunk/CHANGES.txt (original)
+++ hadoop/pig/trunk/CHANGES.txt Tue Jan  5 17:18:51 2010
@@ -24,6 +24,8 @@
 
 IMPROVEMENTS
 
+PIG-1175: Pig 0.6 Docs - Store v. Dump (chandec via olgan)
+
 PIG-1102: Collect number of spills per job (sriranjan via olgan)
 
 PIG-1149: Allow instantiation of SampleLoaders with parametrized LoadFuncs

Modified: 
hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml
URL: 
http://svn.apache.org/viewvc/hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml?rev=896134&r1=896133&r2=896134&view=diff
==
--- hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml (original)
+++ hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml Tue Jan  5 17:18:51 2010
@@ -4919,58 +4919,7 @@
 
 </section></section>
 
-   <section>
-   <title>DUMP</title>
-   <para>Displays the contents of a relation.</para>
-   
-   <section>
-   <title>Syntax</title>
-   <informaltable frame="all">
-      <tgroup cols="1"><tbody><row>
-            <entry>
-               <para>DUMP alias;        </para>
-            </entry>
-         </row></tbody></tgroup>
-   </informaltable></section>
-   
-   <section>
-   <title>Terms</title>
-   <informaltable frame="all">
-      <tgroup cols="2"><tbody><row>
-            <entry>
-               <para>alias</para>
-            </entry>
-            <entry>
-               <para>The name of a relation.</para>
-            </entry>
-         </row></tbody></tgroup>
-   </informaltable></section>
-   
-   <section>
-   <title>Usage</title>
-   <para>Use the DUMP operator to run (execute) a Pig Latin statement and to display the contents of an alias. You can use DUMP as a debugging device to make sure the results you are expecting are being generated.</para></section>
-   
-   <section>
-   <title>Example</title>
-   <para>In this example a dump is performed after each statement.</para>
-<programlisting>
-A = LOAD 'student' AS (name:chararray, age:int, gpa:float);
-
-DUMP A;
-(John,18,4.0F)
-(Mary,19,3.7F)
-(Bill,20,3.9F)
-(Joe,22,3.8F)
-(Jill,20,4.0F)
-
-B = FILTER A BY name matches 'J.+';
-
-DUMP B;
-(John,18,4.0F)
-(Joe,22,3.8F)
-(Jill,20,4.0F)
-</programlisting>
-</section></section>
+  

 <section>
 <title>FILTER </title>
@@ -6521,7 +6470,7 @@
 
 <section>
 <title>STORE </title>
-   <para>Stores data to the file system.</para>
+   <para>Stores or saves results to the file system.</para>
 
 <section>
 <title>Syntax</title>
@@ -6591,7 +6540,10 @@
 
 <section>
 <title>Usage</title>
-   <para>Use the STORE operator to run (execute) Pig Latin statements and to store data on the file system. </para></section>
+   <para>Use the STORE operator to run (execute) Pig Latin statements and save (persist) results to the file system. Use STORE for production scripts and batch mode processing.</para>
+   
+   <para>Note: To debug scripts during development, you can use <ulink url="piglatin_reference.html#DUMP">DUMP</ulink> to check intermediate results.</para>
+</section>
 
 <section>
 <title>Examples</title>
@@ -6962,6 +6914,68 @@
 
 </section></section>

+   
+ <section>
+   <title>DUMP</title>
+   <para>Dumps or displays results to screen.</para>
+   
+   <section>
+   <title>Syntax</title>
+   <informaltable frame="all">
+      <tgroup cols="1"><tbody><row>
+            <entry>
+               <para>DUMP alias;        </para>
+            </entry>
+         </row></tbody></tgroup>
+   </informaltable></section>
+   
+   <section>
+   <title>Terms</title>
+   <informaltable frame="all">
+      <tgroup cols="2"><tbody><row>
+            <entry>
+               <para>alias</para>
+            </entry>
+            <entry>
+               <para>The name of a relation.</para>
+            </entry>
+         </row></tbody></tgroup>
+   </informaltable></section>
+   
+   <section>
+   <title>Usage</title>
+   <para>Use the DUMP operator to run (execute) Pig Latin statements and display the results to your screen. DUMP is meant for interactive mode; statements are executed immediately and the results are not saved (persisted). You can use DUMP as a debugging device to make sure that the results you are expecting are actually generated. </para>
+   
+   <para>
+   Note that production scripts <emphasis>should not</emphasis> use DUMP as it will disable multi-query optimizations and is likely to slow down execution
+   (see <ulink url="piglatin_users.html#Store+vs.+Dump">Store vs. Dump</ulink>).
+   </para>
+   
+   
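
A minimal Pig Latin sketch of the distinction drawn above, reusing the 'student' data from the example removed earlier in this diff; the output location 'myoutput' is illustrative:

A = LOAD 'student' AS (name:chararray, age:int, gpa:float);
B = FILTER A BY name matches 'J.+';

-- Interactive debugging: execute the statements and print B to the screen; nothing is persisted.
DUMP B;

-- Production/batch processing: execute the statements and persist B to the file system.
STORE B INTO 'myoutput';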

[Pig Wiki] Update of PigMix by AlanGates

2010-01-05 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Pig Wiki for change 
notification.

The PigMix page has been changed by AlanGates.
http://wiki.apache.org/pig/PigMix?action=diff&rev1=13&rev2=14

--

  || PigMix_12 || 55.33|| 95.33 || 0.58   ||
  || Total || 1352.33  || 1357  || 1.00   ||
  || Weighted avg ||  ||   || 1.04   ||
+ 
+ Run date:  January 4, 2010, run against 0.6 branch as of that day
+ || Test  || Pig run time || Java run time || Multiplier ||
+ || PigMix_1  || 138.33   || 112.67|| 1.23   ||
+ || PigMix_2  || 66.33|| 39.33 || 1.69   ||
+ || PigMix_3  || 199  || 83.33 || 2.39   ||
+ || PigMix_4  || 59   || 60.67 || 0.97   ||
+ || PigMix_5  || 80.33|| 113.67|| 0.71   ||
+ || PigMix_6  || 65   || 77.67 || 0.84   ||
+ || PigMix_7  || 63.33|| 61|| 1.04   ||
+ || PigMix_8  || 40   || 47.67 || 0.84   ||
+ || PigMix_9  || 214  || 215.67|| 0.99   ||
+ || PigMix_10 || 284.67   || 284.33|| 1.00   ||
+ || PigMix_11 || 141.33   || 151.33|| 0.93   ||
+ || PigMix_12 || 55.67|| 115   || 0.48   ||
+ || Total || 1407 || 1362.33   || 1.03   ||
+ || Weighted Avg ||   ||   || 1.09   ||
+ 
  
  
  == Features Tested ==


svn commit: r896212 - in /hadoop/pig/branches/branch-0.6: CHANGES.txt src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml src/docs/src/documentation/content/xdocs/zebra_pig.xml src/docs/src/d

2010-01-05 Thread olga
Author: olga
Date: Tue Jan  5 20:37:14 2010
New Revision: 896212

URL: http://svn.apache.org/viewvc?rev=896212&view=rev
Log:
PIG-1177: Pig 0.6 Docs - Zebra docs (chandec via olgan)

Modified:
hadoop/pig/branches/branch-0.6/CHANGES.txt

hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml

hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_pig.xml

hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_reference.xml

Modified: hadoop/pig/branches/branch-0.6/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/hadoop/pig/branches/branch-0.6/CHANGES.txt?rev=896212&r1=896211&r2=896212&view=diff
==
--- hadoop/pig/branches/branch-0.6/CHANGES.txt (original)
+++ hadoop/pig/branches/branch-0.6/CHANGES.txt Tue Jan  5 20:37:14 2010
@@ -26,6 +26,8 @@
 
 IMPROVEMENTS
 
+PIG-1177: Pig 0.6 Docs - Zebra docs (chandec via olgan)
+
 PIG-1175: Pig 0.6 Docs - Store v. Dump (chandec via olgan)
 
 PIG-1162: Pig 0.6.0 - UDF doc (chandec via olgan)

Modified: 
hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml
URL: 
http://svn.apache.org/viewvc/hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml?rev=896212&r1=896211&r2=896212&view=diff
==
--- hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml (original)
+++ hadoop/pig/branches/branch-0.6/src/docs/src/documentation/content/xdocs/zebra_mapreduce.xml Tue Jan  5 20:37:14 2010
@@ -45,14 +45,215 @@
 </section>
 <!-- END HADOOP M/R API--> 
 
+ <!-- ZEBRA API-->
+   <section>
+   <title>Zebra MapReduce APIs</title>
+<p>Zebra includes several classes for use in MapReduce programs. The main entry points into Zebra are the two classes for reading and writing tables, namely TableInputFormat and BasicTableOutputFormat. </p>
+
+   <section>
+ <title>BasicTableOutputFormat</title>
+   <table>
+   <tr><th>Static</th><th>Method</th><th>Description</th></tr>
+   <tr>
+   <td>yes</td>
+   <td>void setOutputPath(JobConf, Path)</td>
+   <td>Set the output path of the BasicTable in JobConf</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>Path[] getOutputPaths(JobConf)</td>
+   <td>Get the output paths of the BasicTable from JobConf</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>void setStorageInfo(JobConf, ZebraSchema, ZebraStorageHint, ZebraSortInfo)</td>
+   <td>Set the table storage information (schema, storage hint, sort info) in JobConf</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>Schema getSchema(JobConf)</td>
+   <td>Get the table schema in JobConf</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>BytesWritable generateSortKey(JobConf, Tuple)</td>
+   <td>Generates a BytesWritable key for the input key</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>String getStorageHint(JobConf)</td>
+   <td>Get the table storage hint in JobConf</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>SortInfo getSortInfo(JobConf)</td>
+   <td>Get the SortInfo object</td>
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>void close(JobConf)</td>
+   <td>Close the output BasicTable; no more rows can be added to the table</td>
+   </tr>
+  <tr>
+   <td>yes</td>
+   <td>void setMultipleOutputs(JobConf, String commaSeparatedLocs, Class &lt; extends ZebraOutputPartition&gt; theClass)</td>
+   <td>Enables data to be written to multiple Zebra tables based on the ZebraOutputPartition class.
+   See <a href="zebra_mapreduce.html#Multiple+Table+Outputs">Multiple Table Outputs.</a></td>
+   </tr>
+   </table> 
+</section>
+
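
A rough Java sketch of the write-side flow these methods support. Only the static BasicTableOutputFormat methods come from the table above; the package names, the ZebraSchema/ZebraStorageHint/ZebraSortInfo factory calls, the schema string, and the output path are assumptions for illustration:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.zebra.mapred.BasicTableOutputFormat;  // assumed package
import org.apache.hadoop.zebra.mapred.ZebraSchema;             // assumed package and factory
import org.apache.hadoop.zebra.mapred.ZebraStorageHint;        // assumed package and factory
import org.apache.hadoop.zebra.mapred.ZebraSortInfo;           // assumed package and factory

public class ZebraWriteSketch {
    public static void configureOutput(JobConf jobConf) throws Exception {
        // Route job output through Zebra's BasicTableOutputFormat.
        jobConf.setOutputFormat(BasicTableOutputFormat.class);

        // Output path of the BasicTable (path is illustrative).
        BasicTableOutputFormat.setOutputPath(jobConf, new Path("/user/demo/word_count_table"));

        // Schema, storage hint, and sort info for the output table. The create* factory
        // calls are assumptions; the setStorageInfo signature comes from the table above.
        ZebraSchema schema = ZebraSchema.createZebraSchema("word:string, count:int");
        ZebraStorageHint hint = ZebraStorageHint.createZebraStorageHint("[word, count]");
        ZebraSortInfo sortInfo = ZebraSortInfo.createZebraSortInfo("word", null);
        BasicTableOutputFormat.setStorageInfo(jobConf, schema, hint, sortInfo);
    }

    public static void finish(JobConf jobConf) throws Exception {
        // After the job completes, close the output BasicTable; no more rows can be added.
        BasicTableOutputFormat.close(jobConf);
    }
}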
+   <section>
+ <title>TableInputFormat</title>
+   <table>
+   <tr><th>Static</th><th>Method</th><th>Description</th></tr>
+   <tr>
+   <td>yes</td>
+   <td>void setInputPaths(JobConf, Path... paths)</td>
+   <td>Set the paths to the input table</td>
+
+   </tr>
+   <tr>
+   <td>yes</td>
+   <td>Path[] getInputPaths(JobConf)</td>
+   <td>Get the comma-separated paths to the input table or table union</td>
+