Re: Pig-cassandra Scripts and Oozie

2013-11-28 Thread Jeremy Hanna
rt#Oozie > > I am using Cassandra 1.2.10, Oozie 4.0.0 and pig 0.11.1. > > I will test these options and see if they work. > > Thanks in advance > > 2013/11/28 Jeremy Hanna > >> If I rememb

Re: Pig-cassandra Scripts and Oozie

2013-11-28 Thread Jeremy Hanna
If I remember correctly when I configured pig, cassandra, and oozie to work together, I just used vanilla pig but gave it the jars it needed. What is the problem you’re experiencing that you are unable to do this? Jeremy On 28 Nov 2013, at 12:56, Miguel Angel Martin junquera wrote: > hi all;

Re: Notes of interest from Apache Pig Hackday, Austin edition

2012-05-12 Thread Jeremy Hanna
it's just sort of languishing. > > 4. ONERROR would be a real coup for pig...there's a spec, someone just > needs to do the work! > > And then there are various and sundry things that I would like to > do...finish up SchemaTuple, move on to SchemaBag, and so on. >

Notes of interest from Apache Pig Hackday, Austin edition

2012-05-12 Thread Jeremy Hanna
Thanks again to Twitter for doing their event and inspiring ours. I just wanted to report on some things we did in Austin for any interested. We had a good turnout of about 30 people. Kevin Safford presented an introduction to Pig, or Pig 101. The slides are available here: http://www.slide

Re: Slides from Apache Pig Hackday, Austin edition

2012-05-11 Thread Jeremy Hanna
, > > Dan > On May 11, 2012 2:00 PM, "Jeremy Hanna" wrote: > >> Here in Austin, we've been having a hack day for beginning to intermediate >> developers. Just wanted to post some slides that were from presentations >> here: >> Pig 101 - >

Re: Hackday Skype: apachepig

2012-05-11 Thread Jeremy Hanna
We've also started to use the #hadoop-pig channel on freenode (IRC). On May 11, 2012, at 12:04 PM, Russell Jurney wrote: > Up to 10 people can skype in to the Pig hackday. Call apachepig :) > > -- > Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com

Slides from Apache Pig Hackday, Austin edition

2012-05-11 Thread Jeremy Hanna
Here in Austin, we've been having a hack day for beginning to intermediate developers. Just wanted to post some slides that were from presentations here: Pig 101 - http://www.slideshare.net/ktsafford/dachis-group-pigout101-12895911 Pig 202 - http://www.slideshare.net/thelabdude/dachis-group-pig-h

Re: Apache Pig hackday @ Twitter (SF)

2012-05-04 Thread Jeremy Hanna
Here is the Austin event for those interested: http://pig-hackday-austin.eventbrite.com/ On Apr 19, 2012, at 6:12 PM, Jeremy Hanna wrote: > Cool - tx Russell et al. I'm talking with the higher ups here to see if we > want to make it a general pig training and hacking day - we have

Building a databag in a UDF requiring tons of memory?

2012-05-03 Thread Jeremy Hanna
We're admittedly on an older version of pig (0.8.0-cdh3u0) but are trying to build a databag in our UDF and are getting OOM exceptions even with 6 GB of heap. Specifically, we're marshaling data prior to writing it to Cassandra using our ToCassandraBag UDF and have a databag as one of the input

Re: Apache Pig hackday @ Twitter (SF)

2012-04-19 Thread Jeremy Hanna
Cool - tx Russell et al. I'm talking with the higher ups here to see if we want to make it a general pig training and hacking day - we have lunch time training things here where we go over 101, 202, etc. Maybe we'll organize something like that for this area and hack alongside people there.

Re: Apache Pig hackday @ Twitter (SF)

2012-04-18 Thread Jeremy Hanna
Just curious - is there some way to do a remote connection? we have a few people here in Austin and one in Colorado at the Dachis Group that may want to participate. On Apr 18, 2012, at 4:18 PM, Dmitriy Ryaboy wrote: > Hi folks, > The Analytics Infra team at Twitter will be hosting a Pig hackd

Re: mongo-hadoop Pig users must turn off speculative execution to avoid duplicate inserts

2012-03-02 Thread Jeremy Hanna
Not sure what mongo's doing (generate ID or triggers or something) but it should only be a problem of efficiency if the writes are idempotent. On Mar 2, 2012, at 3:39 AM, Jonathan Coveney wrote: > I agree with Bill. Speculative execution is a feature of Hadoop that > doesn't jive nicely with sto
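
The point about idempotent writes can be illustrated with a minimal sketch. It is not mongo-hadoop code: the in-memory dict stores, the `run_task` helper, and the sample records are all hypothetical stand-ins. It only shows why a task that runs twice (as can happen with speculative execution) is harmless when rows are written under a deterministic key, and produces duplicates when each write gets a freshly generated ID.

```python
import uuid


def run_task(store, records, keyed):
    """Write records into store (a dict standing in for a data store).

    keyed=True  -> reuse the record's own key, so a rerun overwrites.
    keyed=False -> generate a fresh ID per write, so a rerun duplicates.
    """
    for key, value in records:
        row_id = key if keyed else uuid.uuid4().hex
        store[row_id] = value


# Hypothetical sample data.
records = [("user:1", "alice"), ("user:2", "bob")]

idempotent_store = {}
duplicating_store = {}

# Speculative execution may run the same task attempt more than once.
for _attempt in range(2):
    run_task(idempotent_store, records, keyed=True)
    run_task(duplicating_store, records, keyed=False)

print(len(idempotent_store))   # 2 rows: the second attempt only overwrote
print(len(duplicating_store))  # 4 rows: every attempt inserted new rows
```

With keyed writes the extra attempt costs only wasted work, which matches the "problem of efficiency" wording above.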

Re: pig 0.8 releases

2012-02-17 Thread Jeremy Hanna
rs from apache's repo and get something that wasn't a release. On Feb 17, 2012, at 11:44 PM, Dmitriy Ryaboy wrote: > Do you mean the snapshot of current 0.8 branch? Once 8.2 is released, the > version in the branch is bumped up. There has been no 8.3 release. > > On

pig 0.8 releases

2012-02-17 Thread Jeremy Hanna
So the current releases of pig are 0.8.1 and 0.9.2. However, in the apache mvn repo (and mirrored repos) there is a pig 0.8.3. I find no release on it, no svn tag for it, and no user mailing list announcement for it. Where does 0.8.3 come from? it's in https://repository.apache.org/index.ht

Re: Performance problem and profiling

2011-11-30 Thread Jeremy Hanna
actually - he just put it on github :) https://github.com/edwardcapriolo/filecrush On Nov 30, 2011, at 9:03 AM, Jeremy Hanna wrote: > We went through some grief with small files and inefficiencies there. First > we went the route of CombinedInputFormat. That worked for us for a whi

Re: Performance problem and profiling

2011-11-30 Thread Jeremy Hanna
We went through some grief with small files and inefficiencies there. First we went the route of CombinedInputFormat. That worked for us for a while but then we started getting errors relating to the number of open files. So we used a utility that Ed Capriolo in the Hadoop/Hive/Cassandra comm

Re: status of hbase storage

2011-11-11 Thread Jeremy Hanna
re of significant issues with HBaseStorage in pig trunk; > some features are outstanding, but other than that, I think most > complaints we get are about jar management (which is mostly solved in > trunk and pig 9, iirc). Do file tickets if you run into problems! > > D > >

status of hbase storage

2011-11-11 Thread Jeremy Hanna
I just wondered about the status of hbase storage, specifically the store part of it. Is it something people are using in production - ready for prime time? I seemed to remember a couple of people having problems with the store side of it and I didn't know if that was rumor or not. Thanks! J

Re: inner and outer are deprecated for group/cogroup?

2011-10-26 Thread Jeremy Hanna
Sorry - for cogroup only…? same question though. On Oct 26, 2011, at 5:28 PM, Jeremy Hanna wrote: > Does this ticket mean that inner and outer are deprecated for group/cogroup? > It sounds that way, but I just wanted to make sure. (We may need to refactor > some things if so.

inner and outer are deprecated for group/cogroup?

2011-10-26 Thread Jeremy Hanna
Does this ticket mean that inner and outer are deprecated for group/cogroup? It sounds that way, but I just wanted to make sure. (We may need to refactor some things if so.) https://issues.apache.org/jira/browse/PIG-1584

Re: Converting an inner bag to an outer bag/relation?

2011-10-15 Thread Jeremy Hanna
One of the reasons why we did pygmalion here was to facilitate working with tabular data - extracting out values (with FromCassandraBag) using specified column names. Not sure if it works with your use case, but just to mention it - it doesn't work as easily with dynamic column names. https://g

Re: Pig & Cassandra integration

2011-09-28 Thread Jeremy Hanna
It's been mentioned in this thread, but if you're using tabular (static column names) data, you might consider using Pygmalion. It will extract the values from Cassandra to simplify grouping by values and other operations. https://github.com/jeromatron/pygmalion What you'll want to look at is th

Re: pigunit and auto-registering additional jars

2011-09-22 Thread Jeremy Hanna
> > into task in build.xml, though I am not sure it is acceptable > for your case. > > Daniel > > On Thu, Sep 22, 2011 at 11:43 AM, Jeremy Hanna > wrote: >> Is there a way to use -Dpig.additional.jars with pigunit to auto-register >> jars for unit test scripts?

pigunit and auto-registering additional jars

2011-09-22 Thread Jeremy Hanna
Is there a way to use -Dpig.additional.jars with pigunit to auto-register jars for unit test scripts? Maybe we're just missing something because this seems like a basic thing that people would like to use. I see in test/org/apache/pig/test/pigunit/TestPigTest.java that there is a commented out

any reason why pigunit isn't pushed to maven central?

2011-09-19 Thread Jeremy Hanna
Seems like pigunit would be one of those jars that would be handy to just depend on with maven/ivy. Is there any reason why pigunit isn't pushed to maven central along with pig itself? Thanks! Jeremy

Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Jeremy Hanna
/repos/asf/cassandra/trunk/contrib/pig. Are there any > other resource that you can point me to? There seems to be a lack of samples > on this subject. > > On Tue, Aug 30, 2011 at 10:56 PM, Jeremy Hanna > wrote: > FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to

Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS

2011-08-30 Thread Jeremy Hanna
FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to potentially move to Brisk because of the simplicity of operations there. Not sure what you mean about the true power of Hadoop. In my mind the true power of Hadoop is the ability to parallelize jobs and send each task to wher

Re: How to access subcolumns in cassandra

2011-08-17 Thread Jeremy Hanna
ks for the response > > Fabio > > On 17/08/2011, at 22:14, Jeremy Hanna wrote: > >> Hi Fabio, >> >> I'm not sure if super columns are fully supported right now in >> CassandraStorage. Brandon (who I CCed) would know for sure. That and I >>

Re: How to access subcolumns in cassandra

2011-08-17 Thread Jeremy Hanna
Hi Fabio, I'm not sure if super columns are fully supported right now in CassandraStorage. Brandon (who I CCed) would know for sure. That and I thought the pig bug that made it impossible to get to nested data structures has been resolved - the ticket you commented on today I think was a dupl

Re: Pig & Cassandra integration

2011-08-02 Thread Jeremy Hanna
72) >... 9 more > Caused by: java.io.EOFException >at java.io.DataInputStream.readInt(DataInputStream.java:375) >at > org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:812) >at org.apache.hadoop.ipc.Client$Connection.run(Client.java

Re: Pig & Cassandra integration

2011-08-01 Thread Jeremy Hanna
a:303) >>at >> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165) >>at >> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141) >>at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76) >>at org.apa

Re: Pig & Cassandra integration

2011-08-01 Thread Jeremy Hanna
ig.Main.run(Main.java:465) >at org.apache.pig.Main.main(Main.java:107) > > does anyone else have this problem? > > > On Sun, Jul 31, 2011 at 2:04 PM, Jeremy Hanna > wrote: > >> Try following this and see if it helps getting started: >> https://github.com/

Re: Pig & Cassandra integration

2011-07-31 Thread Jeremy Hanna
Try following this and see if it helps getting started: https://github.com/jeromatron/pygmalion/wiki/Getting-Started I haven't tried it with 0.9 yet but I plan to this week. We use the CassandraStorage jar in production. If you can, validate your data with Cassandra's schema validators. Cassa

Re: Pig 0.9.0 has been released!

2011-07-29 Thread Jeremy Hanna
Nice work Daniel and all on the release and the blog posts! Looking forward to the other two. We'll be testing out on our stuff because of all the great features added. On Jul 29, 2011, at 4:02 PM, Daniel Dai wrote: > We wrote a series of blog posts to describe the new features of Pig 0.9.0 on > ht

Re: Hadoop Production Issue

2011-07-16 Thread Jeremy Hanna
One thing that we use is filecrush to merge small files below a threshold. It works pretty well. http://www.jointhegrid.com/hadoop_filecrush/index.jsp On Jul 16, 2011, at 1:17 AM, jagaran das wrote: > > > Hi, > > Due to requirements in our current production CDH3 cluster we need to copy > a

Re: UDF property passing

2011-07-08 Thread Jeremy Hanna
DRA-2869 for another > case where this has reared its head in an improper implementation. > > -Grant > > On Jul 7, 2011, at 3:24 AM, Jeremy Hanna wrote: > >> >> On Jul 6, 2011, at 11:10 PM, Raghu Angadi wrote: >> >>> On Wed, Jul 6, 2011 at

Re: UDF property passing

2011-07-07 Thread Jeremy Hanna
On Jul 6, 2011, at 11:10 PM, Raghu Angadi wrote: > On Wed, Jul 6, 2011 at 7:20 PM, Jeremy Hanna > wrote: > >> >> On Jul 6, 2011, at 12:47 PM, Dmitriy Ryaboy wrote: >> >>> I think this is the same problem we were having earlier: >>> http:

Re: UDF property passing

2011-07-06 Thread Jeremy Hanna
in this case we'll just have to require the field names be entered into the UDF and it won't introspect them. Ah well. Would be nice to be able to use it but I don't really see another way around this bug with the shared UDF context. > > D > > On Wed, Jul 6, 2011 at

UDF property passing

2011-07-06 Thread Jeremy Hanna
We have a UDF that introspects the output schema, gets the field names there, and uses them in the exec method. The UDF is found here: https://github.com/jeromatron/pygmalion/blob/master/udf/src/main/java/org/pygmalion/udf/ToCassandraBag.java A simple example is found here: https://github.com

set default_parallel or let pig set it

2011-07-04 Thread Jeremy Hanna
According to http://pig.apache.org/docs/r0.8.1/cookbook.html#Use+the+Parallel+Features there are two ways for pig to determine the number of reducers to use: 1- set default_parallel and/or PARALLEL 2- let pig calculate it What do people generally use right now? Is there a preferred option?
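
For the second option, a rough sketch of how the automatic estimate is understood to work in Pig 0.8: the input size is divided by a bytes-per-reducer setting and capped at a maximum. The defaults below (1,000,000,000 bytes per reducer for `pig.exec.reducers.bytes.per.reducer`, 999 for `pig.exec.reducers.max`) are quoted from memory, and the rounding and the minimum of one reducer are assumptions, so treat this as an illustration rather than Pig's actual code. The function name `estimate_reducers` is hypothetical.

```python
import math


def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,  # assumed default
                      max_reducers=999):                # assumed default
    """Approximate reducer count when neither default_parallel nor
    PARALLEL is set: input size / bytes per reducer, capped at a maximum.
    Rounding up and the floor of 1 are assumptions of this sketch."""
    estimate = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, estimate))


print(estimate_reducers(2_500_000_000))  # 3
print(estimate_reducers(10**13))         # 999 (capped)
```

Option 1 overrides this estimate, either script-wide with `set default_parallel` or per operator with a `PARALLEL` clause.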

Re: Debugging pig scripts

2011-06-29 Thread Jeremy Hanna
Answering my own question. Penny with 0.9 does this. Wahoo :) Thanks for telling me Ashutosh. On Jun 25, 2011, at 9:56 AM, Jeremy Hanna wrote: > I was just wondering if the following was a common scenario for others and > whether things could be done in a more debug friendly way und

Debugging pig scripts

2011-06-25 Thread Jeremy Hanna
I was just wondering if the following was a common scenario for others and whether things could be done in a more debug friendly way under the covers. Currently we've found that developing with pig is enormously helpful because it's a scripting language that does a lot of the heavy lifting for u

Re: PIG Cassandra - Performance

2011-06-17 Thread Jeremy Hanna
, that's here: https://github.com/jeromatron/pygmalion/ Jeremy On Jun 17, 2011, at 9:05 PM, Badrinarayanan S wrote: > Hi Jeremy, > > Thanks. Till we get 1.0 we will also adopt separate CF for analysis > purposes. > > Regards, > badri > > -----Original M

Re: PIG Cassandra - Performance

2011-06-17 Thread Jeremy Hanna
The way cassandra currently does mapreduce is that it iterates over all the rows of the column family. So yes, performance would be related to the growing number of rows. You can use the pig FILTER operator to filter them down, but you are still iterating over all of the rows in that column f

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna
(the script). > > On Wed, Jun 15, 2011 at 3:04 PM, Jeremy Hanna > wrote: > >> Hi Will, >> >> That's partly why I like to use FromCassandraBag and ToCassandraBag from >> pygmalion - it does the work for you to get it back into a form that >> cassandr

Re: prep for cassandra storage from pig

2011-06-15 Thread Jeremy Hanna
Hi Will, That's partly why I like to use FromCassandraBag and ToCassandraBag from pygmalion - it does the work for you to get it back into a form that cassandra understands. Others may know better how to massage the data into that form using just pig, but if all else fails, you could write a u

Re: useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna
ng keys even if you sampled in a way that didn't actually > produce any, etc. > > D > > On Wed, Jun 15, 2011 at 10:35 AM, Jeremy Hanna > wrote: >> We started doing this recently and thought it might be useful to others. >> >> Pig (and Hive) have a sample

useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna
We started doing this recently and thought it might be useful to others. Pig (and Hive) have a sample function that allows you to sample data from your data store. In pig it looks something like this: mysample = SAMPLE myrelation 0.01; One possible use for this, with pig and cassandra, is to sol
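
As a rough illustration of what `SAMPLE myrelation 0.01` does, here is a minimal sketch in Python, assuming each row is kept independently with the given probability. The `sample` helper and the integer rows are hypothetical; the returned size is approximate rather than an exact 1% of the input.

```python
import random


def sample(relation, fraction, rng):
    """Keep each row independently with probability `fraction`."""
    return [row for row in relation if rng.random() < fraction]


rows = list(range(10_000))              # stand-in for a large relation
subset = sample(rows, 0.01, random.Random(42))

# The subset is small enough to iterate on locally, and every row in it
# comes from the original relation.
print(len(subset) < len(rows))          # True
```

Because rows are chosen at random, a sampled relation can miss keys that later joins or lookups expect, which is worth keeping in mind when testing locally against a sample.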

Re: PIG Cassandra Consistency Level

2011-06-13 Thread Jeremy Hanna
You need to set the property in your hadoop configuration: cassandra.consistencylevel.read to LOCAL_QUORUM. All of the properties you can set are in the org.apache.cassandra.hadoop.ConfigHelper class. You can call that directly with Java/MapReduce or use the properties defined at the top in you

viewing current relationships loaded on the grunt shell

2011-06-11 Thread Jeremy Hanna
I looked through the help and the docs pages but couldn't find anything that did this. Is there any way to show a list of current relations loaded while on the grunt shell? It would seem that the information is available, just not exposed via a command. Thanks! Jeremy

Re: PIG Cassandra - IPs of nodes in a ring

2011-05-10 Thread Jeremy Hanna
ething to do with different address for rpc_address >> and listen_address but not sure what it is... >> >> >> >> -Original Message- >> From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com] >> Sent: Friday, May 06, 2011 11:10 PM >> To: user@

Re: PIG Cassandra - IPs of nodes in a ring

2011-05-06 Thread Jeremy Hanna
he nodes in the cluster. > > I too believe it is something to do with different address for rpc_address > and listen_address but not sure what it is... > > > > -Original Message- > From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com] > Sent: Frida

Re: PIG Cassandra - IPs of nodes in a ring

2011-05-06 Thread Jeremy Hanna
Where are you running the pig script from - your local machine or one of the nodes in the cluster or ? I would think it wouldn't matter which address you use, but what interface it's using. So if the internal and public address are both using the same interface, then you should be able to conn

Re: cannot cast issue: Pig filter against flatten column

2011-04-29 Thread Jeremy Hanna
A few questions: What are you trying to do? What is the pig script that you're trying to run? What version of Cassandra? What version of Pig? Did you add any column_metadata to your column family, like a validation_class? On Apr 28, 2011, at 7:58 PM, Himanshu wrote: > java.lang.ClassCastExceptio

Re: Pygmalion - a github project for pig + cassandra

2011-04-27 Thread Jeremy Hanna
a/browse/PIG-1420 Oh cool - gtk, thanks Bill! > > > On Wed, Apr 27, 2011 at 12:31 PM, Jonathan Ellis wrote: >> Nice! >> >> On Wed, Apr 27, 2011 at 1:57 PM, Jeremy Hanna >> wrote: >>> Hi all, >>> >>> A little while back, I started a pr

Pygmalion - a github project for pig + cassandra

2011-04-27 Thread Jeremy Hanna
tuple (name, value)}) - the column names are extracted from the variable names in the Pig script. Both contributed by Jacob Perkins with slight revisions by Jeremy Hanna StringConcat: probably something everyone implements but instead of CONCAT that only does two strings, it does any number of st

Re: SUM

2011-04-25 Thread Jeremy Hanna
TE key, FLATTEN(org.pygmalion.udf.FromCassandraBag('first_name, last_name, birth_place, num_heads', columns)) AS ( first_name:chararray, last_name:chararray, birth_place:chararray, num_heads:long ); b = group rows by key; x = foreach b generate group, SUM(rows.num_heads);
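
The `group rows by key` followed by `SUM(rows.num_heads)` step can be pictured with a small Python sketch. The input tuples are hypothetical sample data, not output from Cassandra, and the sketch only mirrors the grouping and summing, not the `FromCassandraBag` extraction.

```python
from collections import defaultdict

# Hypothetical (key, num_heads) pairs, as they might look after FLATTEN.
rows = [("k1", 1), ("k1", 2), ("k2", 1)]

# Equivalent of: b = group rows by key;
#                x = foreach b generate group, SUM(rows.num_heads);
totals = defaultdict(int)
for key, num_heads in rows:
    totals[key] += num_heads

print(dict(totals))  # {'k1': 3, 'k2': 1}
```

Declaring `num_heads:long` in the `AS` clause matters for the same reason the integers matter here: the sum needs a numeric type rather than raw bytes.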

Re: SUM

2011-04-25 Thread Jeremy Hanna
6) >>>> at >>>> >>>> >>> >> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231) >>>> at >>>> >>>> >>> >> org.apache.pig.backend.hadoop.ex

Re: pig query on Cassandra

2011-04-21 Thread Jeremy Hanna
On Apr 21, 2011, at 9:25 AM, Mridul Muralidharan wrote: > On Thursday 21 April 2011 06:41 PM, Jeremy Hanna wrote: >> >> On Apr 21, 2011, at 3:19 AM, Mridul Muralidharan wrote: >> >>> >>> In general (on hadoop based systems), if the input is not immu

Re: pig query on Cassandra

2011-04-21 Thread Jeremy Hanna
On Apr 21, 2011, at 3:19 AM, Mridul Muralidharan wrote: > > In general (on hadoop based systems), if the input is not immutable - you can > end up with issues during task re-execution, etc. > This happens not just for cassandra but for hbase, others too - where you > modify data in-place. >

Re: pig query on Cassandra

2011-04-20 Thread Jeremy Hanna
The answer is that it depends on which consistency level you are reading and writing at. You can make sure you are always reading consistent data by using quorum for reads and quorum for writes. For more information on consistency level, see: http://www.datastax.com/docs/0.7/consistency/index
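
The quorum rule can be checked with a little arithmetic: a read is guaranteed to overlap the latest successful write when the number of replicas read plus the number of replicas written exceeds the replication factor. A minimal sketch, with hypothetical helper names:

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)                               # 2 of 3 replicas
print(reads_see_latest_write(q, q, rf))      # True: quorum reads + quorum writes
print(reads_see_latest_write(1, 1, rf))      # False: ONE/ONE may read stale data
```

Other combinations that satisfy the inequality, such as writing to all replicas and reading from one, give the same guarantee with a different latency and availability trade-off.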

Re: COUNT sometimes returning a float value?

2011-04-15 Thread Jeremy Hanna
example data/query that can be used to reproduce this ? > Can you paste the entire stacktrace of the ClassCastException ? > Do you have something like a bincond which might be returning different > results for different rows ? > > -Thejas > > > > > On 4/15/11 2:44

COUNT sometimes returning a float value?

2011-04-15 Thread Jeremy Hanna
I have been getting strange errors in my pig script and narrowed it down a bit and found that when I do a COUNT, sometimes it returns a float, but most of the time it returns a long. Some example output of the result column that came from a COUNT is below. Any reason why this would happen? Th

Re: Getting errors with BinSedesTuple in my storefunc

2011-04-08 Thread Jeremy Hanna
ejas > > > > On 4/8/11 9:30 AM, "Jeremy Hanna" wrote: > > I am going through a lot of processing with my data and then I reformat it to > go back into my data store using the storefunc. I store it out to hdfs and > it visually looks just fine. However when I tr

Re: Pig filter against flatten column

2011-04-08 Thread Jeremy Hanna
The 0.7.4 version is here: http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.7.4/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java The latest from 0.7 branch contains a way to get the cassandra schema for the column family it is querying against though: http://sv

Getting errors with BinSedesTuple in my storefunc

2011-04-08 Thread Jeremy Hanna
I am going through a lot of processing with my data and then I reformat it to go back into my data store using the storefunc. I store it out to hdfs and it visually looks just fine. However when I try to persist it, I'm getting an exception that it can't cast one of the values from org.apache

Re: help flattening data from cassandra loader

2011-04-06 Thread Jeremy Hanna
oing because it just makes it easier to deal with tabular-like data - we don't have to munge through it quite as much. I'm still pretty low on my pig-fu but others on the list might have better answers on how to deal with that data structure. > > On Apr 6, 2011, at 3:51 PM, Jeremy

Re: help flattening data from cassandra loader

2011-04-06 Thread Jeremy Hanna
I'm going to put a UDF up on the pygmalion project hopefully today that will convert that into something more usable. Props to Jacob from infochimps - he and I have been creating UDFs like that lately for use with Cassandra. There's an associated UDF for getting it back into the key, cols form

Re: Error reading data from Cassandra

2011-04-06 Thread Jeremy Hanna
the next couple of days. Feel free to add to it as well :). https://github.com/jeromatron/pygmalion Jeremy On Apr 6, 2011, at 4:15 AM, Fabio Souto wrote: > It works. Thank you for your help Jeremy!! > > Cheers > Fabio > > On 05/04/2011, at 20:08, Jeremy Hanna wrote: >

Re: Error reading data from Cassandra

2011-04-05 Thread Jeremy Hanna
ra.dht.RandomPartitioner > > > BTW I'm using the pig version that comes with Cassandra, the one in > cassandra/contrib/pig > > Thanks for your time Jeremy! :) > Fabio > > On 05/04/2011, at 17:04, Jeremy Hanna wrote: > >> Fabio, >> >> It look

Re: Error reading data from Cassandra

2011-04-05 Thread Jeremy Hanna
> BTW I'm using the pig version that comes with Cassandra, the one in > cassandra/contrib/pig > > Thanks for your time Jeremy! :) > Fabio > > On 05/04/2011, at 17:04, Jeremy Hanna wrote: > >> Fabio, >> >> It looks like you need to set your environ

Re: Error reading data from Cassandra

2011-04-05 Thread Jeremy Hanna
g.PigServer.executeCompiledLogicalPlan(PigServer.java:1198) > at org.apache.pig.PigServer.storeEx(PigServer.java:874) > at org.apache.pig.PigServer.store(PigServer.java:816) > at org.apache.pig.PigServer.openIterator(PigServer.java:728) > ... 7 more >

Re: Error reading data from Cassandra

2011-04-05 Thread Jeremy Hanna
Fabio, Could you post the full stack trace that's found in the pig_.log that's in the directory that you ran pig? Thanks, Jeremy On Apr 5, 2011, at 8:42 AM, Fabio Souto wrote: > Hello, > > I have installed Pig 0.8.0 and Cassandra 0.7.4 and I'm not able to read data > from cassandra. I write

Re: jline and commons-lang - building 0.8.0 download

2011-04-01 Thread Jeremy Hanna
> On Thu, Mar 31, 2011 at 4:46 PM, Alan Gates wrote: > >> Isn't ivy picking it up for you? That's what is supposed to happen. >> >> Alan. >> >> >> On Mar 28, 2011, at 11:32 AM, Jeremy Hanna wrote: >> >> Is there a standard way t

Re: Pig 0.8.0 cassandra output

2011-03-29 Thread Jeremy Hanna
True. It is mentioned in the readme, but maybe it should be more explicit in the readme or in the HadoopSupport page. I haven't had problems with localhost, but how you defined it is the way I set things for running against my cassandra/hadoop hybrid cluster. On Mar 29, 2011, at 12:36 PM, Mar

jline and commons-lang - building 0.8.0 download

2011-03-28 Thread Jeremy Hanna
Is there a standard way to get jline and commons-lang into pig? I work around it by copying them into my build/ivy/lib/Pig directory but didn't know if there was a simpler way I was just overlooking. Otherwise I get UNRESOLVED DEPENDENCIES errors for those two libs when I try to build pig 0.8.

Re: LoadCaster, LoadStoreCaster usage and encoded output

2011-03-24 Thread Jeremy Hanna
n the right track. > > We may have to go in and explicitly check the types of each column and > cast manually. > > --jacob > > On Thu, 2011-03-24 at 13:11 -0500, Jeremy Hanna wrote: >> I see that there are a few LoadCaster implementations in pig 0.8. There

LoadCaster, LoadStoreCaster usage and encoded output

2011-03-24 Thread Jeremy Hanna
I see that there are a few LoadCaster implementations in pig 0.8. There's the Utf8StorageConverter, the HBaseBinaryConverter, and a couple of others. The HBaseStorage class uses the Utf8StorageConverter by default but can be configured to use the HBaseBinaryConverter. Also it's just used as a

Re: Pig output to Cassandra

2011-03-11 Thread Jeremy Hanna
Replying on here too since I noticed it was sent to the pig user list as well... The pig output to cassandra was part of recently resolved CASSANDRA-1828. It's usable and separate so you should be able to download 0.7-branch and build the jar and use it against a 0.7.3 cluster. I've been using

Jar dependencies

2010-11-10 Thread Jeremy Hanna
What is the standard way to copy up jar dependencies to the cluster with Pig (so that the nodes in the cluster don't get runtime errors with class not found exceptions)?

Re: Cassandra 0.7 bootstrap exception on windows

2010-11-10 Thread Jeremy Hanna
moving this to the cassandra user list. On Nov 10, 2010, at 11:05 AM, Aditya Muralidharan wrote: > Hi, > > I'm building (on windows) a release tar from the HEAD of the Cassandra 0.7 > branch. Running a new single node instance of Cassandra gives me the > following bootstrap exception: > INFO 1