The big question is how the log file needs to be parsed and formatted. I'd
be inclined to write a UDF that takes a line of text and returns a
tuple of the values you'd be storing in HBase.
Then you could do other operations on the bag of tuples that gets passed
back.
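As a rough illustration, the parsing step that such a UDF's exec() would wrap might look like the sketch below. The log format (Common Log Format), class name, and field choices here are assumptions for illustration only; the actual pattern depends on what your logs look like.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Core parsing logic a Pig UDF's exec(Tuple) could wrap: take one log
// line and return the field values you'd store in HBase. The Common
// Log Format assumed here is only an example.
public class LogLineParser {
    // Captures: host, timestamp, request, status, bytes
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)");

    public static String[] parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.find()) {
            return null; // a real UDF would typically drop unparseable lines
        }
        return new String[] { m.group(1), m.group(2), m.group(3),
                              m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] fields = parse(
            "127.0.0.1 - - [26/Feb/2014:06:25:00 +0000] "
            + "\"GET /index.html HTTP/1.1\" 200 1024");
        System.out.println(String.join("|", fields));
    }
}
```

In a real UDF this array would be packed into a Tuple via Pig's TupleFactory before being returned.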
Alternatively, you
Could you please let us know how exactly you want to parse your logs?
Warm Regards,
Tariq
cloudfront.blogspot.com
On Wed, Feb 26, 2014 at 6:25 PM, David McNelis dmcne...@gmail.com wrote:
If you want to load logs into HBase, why not directly write MapReduce
jobs? In Pig, you would need to write your own customized load function. However, if
you write a MapReduce job,
you can use the HBase API directly.
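A rough, untested sketch of what that map-only job could look like, using the 2014-era HBase client API. The table name ("logs"), column family ("cf"), and tab-delimited input layout are all assumptions, not anything from the thread:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-only job: each input line becomes one Put into an HBase table.
// Assumes tab-delimited lines where field 0 serves as the row key.
public class LogToHBaseMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");
        if (fields.length < 2) return;               // skip malformed lines
        Put put = new Put(Bytes.toBytes(fields[0])); // field 0 as row key
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("msg"),
                Bytes.toBytes(fields[1]));
        context.write(new ImmutableBytesWritable(put.getRow()), put);
    }
}
```

On the driver side, TableMapReduceUtil.initTableReducerJob("logs", null, job) would wire the job's output to the table, so no custom load/store function is needed.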
On Wed, Feb 26, 2014 at 2:15 PM, Mohammad Tariq donta...@gmail.com wrote:
Hi Cheolsoo,
I tried to apply the patch and rebuild pig but it's still not working.
When I try to use ASSERT just like in the Pig docs, I get an exception if I add
a message:
a = LOAD '/data' USING PigStorage() AS (source_id:int,
source_name:chararray);
ASSERT a BY source_id > 9 'source_id should be greater than 9';
Looks like you missed a comma:
ASSERT a BY source_id > 9, 'source_id should be greater than 9';
That's another bug in the 0.12 documentation, which is fixed now.
On Wed, Feb 26, 2014 at 8:27 AM, Ronald Green green.ron...@gmail.com wrote:
Hi all,
When running Pig scripts on my Hadoop cluster (deployed with CDH-4.3.1), it
always shows this exception:
2014-02-27 00:04:45,656 [main] WARN
org.apache.pig.backend.hadoop23.PigJobControl - falling back to default
JobControl (not using hadoop 0.23 ?)
java.lang.NoSuchFieldException: