Update pig readme
Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2dc27a17
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2dc27a17
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2dc27a17

Branch: refs/heads/trunk
Commit: 2dc27a17567fa448aae335e74cc46ab94339eba4
Parents: db68e03
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Sat May 26 10:50:00 2012 -0500
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Sat May 26 10:50:00 2012 -0500

----------------------------------------------------------------------
 examples/pig/README.txt | 19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2dc27a17/examples/pig/README.txt
----------------------------------------------------------------------
diff --git a/examples/pig/README.txt b/examples/pig/README.txt
index 3bdbf10..57b8f57 100644
--- a/examples/pig/README.txt
+++ b/examples/pig/README.txt
@@ -1,7 +1,8 @@
 A Pig storage class that reads all columns from a given ColumnFamily, or
 writes properly formatted results into a ColumnFamily.
 
-Setup:
+Getting Started
+===============
 
 First build and start a Cassandra server with the default
 configuration and set the PIG_HOME and JAVA_HOME environment
@@ -31,7 +32,6 @@ for input and output:
 * PIG_OUTPUT_RPC_PORT : the port thrift is listening on for writing
 * PIG_OUTPUT_PARTITIONER : cluster partitioner for writing
 
-
 Then you can run it like this:
 
 examples/pig$ bin/pig_cassandra -x local example-script.pig
@@ -70,3 +70,18 @@ Which will copy the ColumnFamily.  Note that the destination ColumnFamily must
 already exist for this to work.
 
 See the example in test/ to see how schema is inferred.
+
+Advanced Options
+================
+
+The following environment variables default to false but can be set to true to enable them:
+
+PIG_WIDEROW_INPUT: this enables loading of rows with many columns without
+  incurring memory pressure.  All columns will be in a bag and indexes are not
+  supported.
+
+PIG_USE_SECONDARY: this allows easy use of secondary indexes within your
+  script, by appending every index to the schema as 'index_$name', allowing
+  filtering of loaded rows with a statement like "FILTER rows BY index_color eq
+  'blue'" if you have an index called 'color' defined.
+
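For illustration, the PIG_USE_SECONDARY option added above might be exercised by a script like the following sketch. The keyspace 'MyKeyspace', the column family 'Users', and the index named 'color' are hypothetical; the LOAD URI form follows the style of the existing examples in examples/pig/, and the FILTER syntax is taken verbatim from the new README text.

```pig
-- Hypothetical script: assumes PIG_USE_SECONDARY=true was exported before
-- running bin/pig_cassandra, and that the Users column family has a
-- secondary index named 'color'.
rows = LOAD 'cassandra://MyKeyspace/Users' USING CassandraStorage();

-- With PIG_USE_SECONDARY enabled, the index is appended to the schema as
-- 'index_color', so loaded rows can be filtered on it directly:
blue = FILTER rows BY index_color eq 'blue';

DUMP blue;
```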