Re: useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna
Cool - thanks Dmitriy!

On Jun 15, 2011, at 12:54 PM, Dmitriy Ryaboy wrote:
> Another tip:
> If you parametrize your load statements, it becomes easy to switch
> between loading from something like Cassandra, and reading from HDFS
> or local fs directly.
>
> Also:
> Try using Pig's "illustrate" c

Re: useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Dmitriy Ryaboy
Another tip:
If you parametrize your load statements, it becomes easy to switch between loading from something like Cassandra, and reading from HDFS or local fs directly.

Also:
Try using Pig's "illustrate" command when working through your flows -- it does some clever things that go far beyond sim

useful little way to run locally with (pig|hive) && cassandra

2011-06-15 Thread Jeremy Hanna
We started doing this recently and thought it might be useful to others.

Pig (and Hive) have a sample function that allows you to sample data from your data store. In Pig it looks something like this:

    mysample = SAMPLE myrelation 0.01;

One possible use for this, with Pig and Cassandra, is to sol