OK, I found the solution.
Replace
Schema tupleSchema = new Schema(input.getFields());
With
Schema tupleSchema = new Schema(input.getField(0).schema.getField(0).schema.getFields());
That will do the trick.
Thanks.
Dan
-Original Message-
From: Danfeng Li [mailto:d...@operasolutions.com]
Sen
HBase CF names are case sensitive, so your query might be off since
you're using lowercase.
If the problem still persists with the same case, would it be possible
to see if you can reproduce against a Pig build from the trunk?
On Mon, Aug 13, 2012 at 8:28 AM, Mohit Anchlia wrote:
>
>
> O
Thanks, Robert.
However, I'm still not clear on how to get the original fields for the tuple
inside the bag. Following is the code to generate the schema.
public Schema outputSchema(Schema input) {
try{
Schema.FieldSchema counter = new Schema.FieldSchema("counter",
DataType.INTEGER);
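For context, the fix quoted at the top of this thread can be folded into a completed outputSchema. This is an untested sketch assuming the standard Pig UDF schema classes; the two-level input.getField(0).schema.getField(0).schema access follows the first message (bag schema wrapping a tuple schema), and the surrounding bag construction is illustrative, not taken from the original post.

```java
// Sketch only: assumes org.apache.pig.impl.logicalLayer.schema.Schema,
// Schema.FieldSchema, and org.apache.pig.data.DataType from the Pig UDF API.
public Schema outputSchema(Schema input) {
    try {
        Schema.FieldSchema counter =
            new Schema.FieldSchema("counter", DataType.INTEGER);
        // input is {(...)}: a bag wrapping a tuple, so descend two levels
        // to reach the tuple's own fields.
        Schema tupleSchema = new Schema(
            input.getField(0).schema.getField(0).schema.getFields());
        tupleSchema.add(counter);
        // Re-wrap the widened tuple in a bag for the output schema.
        return new Schema(
            new Schema.FieldSchema(null, tupleSchema, DataType.BAG));
    } catch (FrontendException e) {
        return null;
    }
}
```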
That would be quite handy I think.
D
On Thu, Aug 9, 2012 at 12:24 PM, Xavier Stevens wrote:
> Does anyone else think it would make sense to have all operators and
> functions listed on a single page somewhere as a reference? Right now they
> are split up over the "Pig Latin Basics" and "Built In
For CSV excel, check out
http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
D
>> Also, is PigStorage compatible with the quoting expected by excel
>> tab-delimited files? AIUI that would require quoting the values with
>> "value\tvalue" and escaping doub
You are talking about changing the way hadoop works; something like
this would be transparent to Pig.
Note that Hadoop Distributed Cache != "distributed memory cache".
I suppose you could replace the value of fs.file.impl from
org.apache.hadoop.fs.LocalFileSystem to something else.. might be
qui
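For what it's worth, the override suggested above would be a core-site.xml change rather than anything in Pig. A minimal sketch, where the replacement class name is entirely hypothetical:

```xml
<!-- Sketch: com.example.CachingLocalFileSystem is a hypothetical
     FileSystem subclass, not a real Hadoop implementation. -->
<property>
  <name>fs.file.impl</name>
  <value>com.example.CachingLocalFileSystem</value>
</property>
```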
Julien removed a dozen or so loader/storer instantiations.
That can do it if you do work in constructors.
D
On Fri, Aug 10, 2012 at 1:15 PM, Prashant Kommireddi
wrote:
> Thanks Chun.
>
> Jon, any idea what on 0.11 might have fixed it?
>
> On Thu, Aug 9, 2012 at 3:32 PM, Chun Yang
> wrote:
>
>> I
This seems like a bug in PigStorage. Would you mind opening a JIRA with the
steps to reproduce that you've included here?
thanks,
Bill
On Mon, Aug 13, 2012 at 3:44 PM, jeremiah rounds
wrote:
> Greetings pig users,
>
> This is regarding my previous post (in quotes below)
>
>
> I was able to remove
Chapter 10 in Alan Gates' excellent book "Programming Pig" discusses this
issue.
Robert Yerex
Data Scientist
Civitas Learning
On Mon, Aug 13, 2012 at 3:43 PM, Danfeng Li wrote:
> I have a bag, e.g. A: {(name: chararray,age: int)}, I wrote a udf which
> adds 1 more field in the tuple inside the ba
Greetings pig users,
This is regarding my previous post (in quotes below)
I was able to remove this column error by using the start up:
pig -x local -M -t ColumnMapKeyPrune
I have no more insight than that I only tried it because someone else
reported their column oriented error went away wit
I have a bag, e.g. A: {(name: chararray, age: int)}. I wrote a UDF which adds
one more field to the tuple inside the bag, e.g. B: {(name: chararray, age: int,
rank: int)}. Because the number of fields in the original bag is not fixed
(e.g. I could have one more field such as gender: int).
In my udf, in o
Greetings,
I am new to pig. I am trying to get to know it on a laptop with
hadoop 20.2 installed in local mode. I have prior experience with
hadoop, but I figure my error is so weird I blew the pig install or
something.
Here is my problem, distilled down:
$ pig -x local -M
gru
Hello
Can we use the Distributed Cache to store intermediate results after the map
phase, so that they can be used in the reduce phase from the cache,
to improve the performance of a Map-Reduce job?
I found a paper regarding the usage of a cache in Map-Reduce,
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5
On Aug 13, 2012, at 9:05 AM, Benjamin Smedberg wrote:
> I'm a new-ish pig user querying data on an hbase cluster. I have a question
> about accumulator-style functions.
>
> When writing an accumulator-style UDF, is all of the data shipped to a single
> machine before it is reduced/accumulated?
I'm a new-ish pig user querying data on an hbase cluster. I have a
question about accumulator-style functions.
When writing an accumulator-style UDF, is all of the data shipped to a
single machine before it is reduced/accumulated? For example, if I were
to re-implement SUM as a UDF
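For reference, the accumulator contract boils down to three calls: accumulate a batch, read the final value, reset for the next group. Below is a simplified, Pig-free sketch of a SUM-style accumulator in plain Java; it is not the actual org.apache.pig.Accumulator interface (which receives Tuples of bag contents), but it illustrates why only one batch plus the running state needs to be in memory at a time.

```java
// Simplified sketch of the accumulator pattern (not the real Pig
// Accumulator interface). The framework would feed batches of values,
// so the full bag never has to be materialized at once.
public class SumAccumulator {
    private long sum = 0;

    // Called once per batch of values.
    public void accumulate(long[] batch) {
        for (long v : batch) {
            sum += v;
        }
    }

    // Called after the last batch to obtain the final result.
    public long getValue() {
        return sum;
    }

    // Called so the instance can be reused for the next group.
    public void cleanup() {
        sum = 0;
    }

    public static void main(String[] args) {
        SumAccumulator acc = new SumAccumulator();
        acc.accumulate(new long[] {1, 2, 3});
        acc.accumulate(new long[] {4, 5});
        System.out.println(acc.getValue()); // 15
        acc.cleanup();
    }
}
```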
On Sun, Aug 12, 2012 at 11:26 PM, Bill Graham wrote:
> This seems to be the problem:
>
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hbase.zookeeper.ZKConfig.parseZooCfg(ZKConfig.java:167)
>
> Which seems like the Conf is null, which is really odd.
>
> http://svn.ap