By default, Pig uses plain text files as input/output, unless you write a
custom LoadFunc/StoreFunc. There is no specific Pig storage format.
You can copy the file to the local filesystem using copyToLocal. If you
want to export directly to an SQL table, you need to write a StoreFunc.
Pig works on tuples rather than K,V
Hey Vincent,
Would it be easy for you to isolate this in a test case? That would help
to debug the issue and also to fix it.
Ashutosh
On Fri, Aug 26, 2011 at 05:30, Vincent Barat wrote:
> Hi,
>
> I run PIG jobs from a Java process (using PigServer), most of which use
> HBaseStorage to load data fr
>
> Should I report this as an issue?
>
Yes, please. I've found other resource leaks when using PigServer this way,
so this seems like a likely bug. Also, seeing that HTables are never closed
by HBaseStorage is not a good sign.
On Fri, Aug 26, 2011 at 5:30 AM, Vincent Barat wrote:
> Hi,
>
> I r
Thanks, Vincent, for confirming that the issue is resolved.
Ashutosh
On Fri, Aug 26, 2011 at 07:54, Vincent Barat wrote:
> FYI, this was fixed by PIG-2193.
>
> On 26/07/11 19:40, Vincent Barat wrote:
>
> Hi,
>>
>> I'm using PIG 0.8.1 with HBase 0.90 and the following script sometimes
>> returns an e
Vincent,
Glad that you were able to solve the issue. Ideally, one should be able to
configure log4j externally through a log4j.properties config file rather
than by setting properties explicitly in code. Did you try that?
Ashutosh
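To illustrate the external-configuration suggestion above, here is a
minimal sketch that writes the two logger levels quoted in this thread to a
properties file instead of setting them in application code. The class and
method names (ExternalLog4jConfig, quietLoggers, write, read) are
hypothetical, and it uses only java.util.Properties. It assumes log4j 1.x,
which looks for log4j.properties on the classpath or at the location given
by the log4j.configuration system property. A complete file would normally
also define a root logger and an appender, which are not shown in this
thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper: keeps logger levels in a file rather than in code.
class ExternalLog4jConfig {

    /** Builds the two logger levels quoted in this thread. */
    static Properties quietLoggers() {
        Properties props = new Properties();
        props.setProperty("log4j.logger.org.apache.hadoop", "ERROR");
        props.setProperty("log4j.logger.org.apache.zookeeper", "ERROR");
        return props;
    }

    /** Writes the settings in standard .properties format. */
    static void write(Properties props, Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "Quiet Hadoop and ZooKeeper logging (keep ERROR)");
        }
    }

    /** Reads the file back, e.g. to check what was written. */
    static Properties read(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the real log4j.properties location.
        Path file = Files.createTempFile("log4j", ".properties");
        write(quietLoggers(), file);
        System.out.println(read(file).getProperty("log4j.logger.org.apache.hadoop"));
    }
}
```

With the settings in a file, log levels can be changed without recompiling
the Java process that drives PigServer.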
On Fri, Aug 26, 2011 at 07:56, Vincent Barat wrote:
> Here is how I solved th
Here is how I solved this issue (it was only related to log4j
configuration):
/* Deactivate most traces from Hadoop and PIG (keep ERROR) */
props.setProperty("log4j.logger.org.apache.hadoop", "ERROR");
props.setProperty("log4j.logger.org.apache.zookeeper", "ERROR");
props
FYI, this was fixed by PIG-2193.
On 26/07/11 19:40, Vincent Barat wrote:
Hi,
I'm using PIG 0.8.1 with HBase 0.90 and the following script
sometimes returns an empty set, and sometimes works!
start_sessions = LOAD 'startSession' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:
Hi,
I run PIG jobs from a Java process (using PigServer), most of which
use HBaseStorage to load data from HBase.
Each job is run using a new PigServer object, and I correctly call
pigServer.shutdown() when my pig server is no longer used.
Nevertheless, after a few hours of running, I notice that
On 23/08/11 20:28, Dmitriy Ryaboy wrote:
We should add merge join support to HBaseStorage, it should be able to do
that for joins on the table key.
That would be great!
Are your locids skewed? Have you tried using 'skewed' join for the last job?
Actually, if locations are small, you can ev