Re: Changing the schema before Storing

2012-12-11 Thread Bill Graham
Thanks Younos for catching that and sorry that you got bit by it. That is in fact a javadoc bug. I've just opened a JIRA for it: https://issues.apache.org/jira/browse/PIG-3092 http://pig.apache.org/docs/r0.10.0/basic.html#store Regarding the casting, what does describe look like of the relation y

Re: Question regarding a custom LoadFunc implementation

2012-12-11 Thread Bill Graham
We had a yml file that mapped physical datasources to the loader that the generic one serves as a facade to. Now we're moving to an HCatalog based solution that handles that as well as the logical to physical resolution. Basically the mappings are stored in a DB. On Tue, Dec 11, 2012 at 8:20 AM,

Re: ERROR 2999: Unexpected internal error. null

2012-12-11 Thread William Oberman
For what it's worth, the error is on the cassandra side, so I'd post to that mailing list. On Tue, Dec 11, 2012 at 2:13 PM, James Schappet wrote: > This is what pig_cassandra with debug enabled outputs: > > > > > tsunami:pig schappetj$ bin/pig_cassandra -x local rowcount.pig > Using /Library/pi

Re: ERROR 2999: Unexpected internal error. null

2012-12-11 Thread James Schappet
This is what pig_cassandra with debug enabled outputs: tsunami:pig schappetj$ bin/pig_cassandra -x local rowcount.pig Using /Library/pig-0.10.0/pig-0.10.0.jar. Find hadoop at /Library/hadoop-1.0.2/bin/hadoop dry run: HADOOP_CLASSPATH: /Library/pig-0.10.0/bin/../conf:/Library/Java/JavaVirtualMa

Re: ERROR 2999: Unexpected internal error. null

2012-12-11 Thread Jonathan Coveney
Sounds like there could be a wrong version on the classpath then 2012/12/11 William Oberman > Your line numbers aren't matching up to the 1.1.7 release, which is weird. > Based on the "stock" 1.1.7 source, there was a null check on str > before predicateFromString(str), > making your code pat

Re: ERROR 2999: Unexpected internal error. null

2012-12-11 Thread William Oberman
Your line numbers aren't matching up to the 1.1.7 release, which is weird. Based on the "stock" 1.1.7 source, there was a null check on str before predicateFromString(str), making your code path impossible... will On Tue, Dec 11, 2012 at 1:00 PM, Jonathan Coveney wrote: > If I were debugging t

Re: ERROR 2999: Unexpected internal error. null

2012-12-11 Thread Jonathan Coveney
If I were debugging this (note: I know nothing about Cassandra), I would put a flag in my IDE on Cassandra storage and see what is going on in there, and why it is erroring out. Then I would follow that backwards into whatever in Pig was generating that issue. That's pretty vague but can't really s

Re: Piggybank date time functions

2012-12-11 Thread Jonathan Coveney
I did not implement those UDFs... I imagine the reason for rigorously using UTC instead of system time is that system time can introduce subtle bugs when your servers have a different time than your client, which can be hard to debug, etc. It would be pretty easy to add support for timezone to those

Re: Changing the schema before Storing

2012-12-11 Thread yaboulna
Hi Bill, Thanks for your reply. Since this is the case, the Javadoc of the class needs to be fixed (see http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html). Also, I faced a bug that I worked around by explicit casting. For some reason all the obj

Re: Running pig script as different user

2012-12-11 Thread Miki Tebeka
Thanks, I'll have a look and see if this is applicable to CDH4 as well. On Tue, Dec 11, 2012 at 4:17 AM, UMBC wrote: > Miki, Prashant, > > > Curiously I've got the same need as you do. I've started to work > on this problem few weeks ago by patching Hadoop client code. Here is > wha

Re: Question regarding a custom LoadFunc implementation

2012-12-11 Thread Prashant Kommireddi
Thanks Bill. Any ideas on how to hide the location of HDFS files from the end user? On Tue, Dec 11, 2012 at 9:42 PM, Bill Graham wrote: > I think the latter would be better. Since the LoadFunc would be decoupled > from the data exporter you could schedule the exporting independent of the > loadi

Re: Question regarding a custom LoadFunc implementation

2012-12-11 Thread Bill Graham
I think the latter would be better. Since the LoadFunc would be decoupled from the data exporter you could schedule the exporting independent of the loading. We do something similar, without the $query part. On Tue, Dec 11, 2012 at 1:10 AM, Prashant Kommireddi wrote: > I was working on a LoadFun

Re: Running pig script as different user

2012-12-11 Thread UMBC
Miki, Prashant, Curiously, I have the same need as you do. I started working on this problem a few weeks ago by patching the Hadoop client code. Here is what I have so far: https://github.com/vladistan/hadoop-common/commits/cloudera-cdh3u5 I only tried it on with the Clouder