Re: Pig optimization getting in the way?

2011-02-22 Thread Dmitriy Ryaboy
I was suggesting just a connection per record writer, not a connection per (jdbc, table). That way you are safe even if you are writing into the same table from two streams in the same JVM.
D
On Tue, Feb 22, 2011 at 10:18 AM, Dexin Wang wrote:
> So I can create multiple db connections for each
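The per-record-writer suggestion above can be sketched in Java. This is a minimal illustration under stated assumptions, not the poster's actual store function: `FakeConnection` and `PerWriterRecordWriter` are hypothetical names, and the fake connection only records rows in memory instead of talking to a database through JDBC.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JDBC connection; it only records rows in memory.
class FakeConnection {
    private final List<String> rows = new ArrayList<>();
    private boolean closed = false;

    void insert(String table, String row) {
        if (closed) {
            throw new IllegalStateException("connection is closed");
        }
        rows.add(table + ":" + row);
    }

    void close() {
        closed = true;
    }

    int rowCount() {
        return rows.size();
    }
}

// Hypothetical record writer that owns its connection (an instance field, not a static one).
class PerWriterRecordWriter {
    private final FakeConnection conn = new FakeConnection();
    private final String table;

    PerWriterRecordWriter(String table) {
        this.table = table;
    }

    void write(String row) {
        conn.insert(table, row);
    }

    void close() {
        conn.close(); // closes only this writer's connection
    }

    int rowsWritten() {
        return conn.rowCount();
    }
}

class PerWriterDemo {
    public static void main(String[] args) {
        // Two writers targeting the same table in the same JVM.
        PerWriterRecordWriter first = new PerWriterRecordWriter("table_a");
        PerWriterRecordWriter second = new PerWriterRecordWriter("table_a");
        first.write("row1");
        first.close();
        second.write("row2"); // unaffected by first.close()
        System.out.println("second writer rows: " + second.rowsWritten());
    }
}
```

Because the connection is keyed to the writer rather than to the (jdbc, table) pair, two streams writing into the same table never share state, which appears to be the point of the suggestion.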

Re: Pig optimization getting in the way?

2011-02-22 Thread Dexin Wang
So I can create multiple db connections, one for each (jdbc_url, table) pair, and map each pair to its own connection for the record writer. Is that what you are suggesting? Sounds like a good plan. Thanks.
On Fri, Feb 18, 2011 at 5:31 PM, Thejas M Nair wrote:
> As you are suspecting, both store functio

Re: Pig optimization getting in the way?

2011-02-18 Thread Dmitriy Ryaboy
Open a new connection per Storage instance? Better yet, use a connection pool?
On Fri, Feb 18, 2011 at 4:48 PM, Dexin Wang wrote:
> I hope that's the case. But
> *mapred.job.reuse.jvm.num.tasks* 1
> However it does seem to be doing the write to two DB tables in the same job
> so although it's

Re: Pig optimization getting in the way?

2011-02-18 Thread Thejas M Nair
As you are suspecting, both store functions are probably running in the same map or reduce task. This is a result of multi-query optimization. Try pig -e 'explain -script yourscript.pig' to see the query plan, and you will be able to verify whether the store is happening in the same map/reduce task. Can

Re: Pig optimization getting in the way?

2011-02-18 Thread Dexin Wang
I hope that's the case. But *mapred.job.reuse.jvm.num.tasks* is 1. However, it does seem to be doing the write to two DB tables in the same job, so although it's not re-using the JVM, it is already in one JVM since it's the same task! And since the DB connection is static/singleton as you mentioned, and t
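One way the situation described above could go wrong, sketched in Java under loud assumptions: `StubConnection` and `SharedConnectionStore` are hypothetical stand-ins (not Pig or JDBC classes), and the failure mode shown, where one store closes a static connection that another store in the same task is still using, is only a guess at how a static/singleton connection might misbehave here.

```java
// Hypothetical stand-in for a JDBC connection; it only tracks open/closed state.
class StubConnection {
    private boolean closed = false;

    void insert(String table, String row) {
        if (closed) {
            throw new IllegalStateException("connection is closed");
        }
        // A real connection would send the row to the database here.
    }

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

// Hypothetical store that keeps ONE static connection for the whole JVM.
class SharedConnectionStore {
    // Assumption: this is the static/singleton connection under suspicion.
    private static StubConnection shared;

    private final String table;

    SharedConnectionStore(String table) {
        this.table = table;
        // Lazily open the singleton, reopening it if a previous user closed it.
        if (shared == null || shared.isClosed()) {
            shared = new StubConnection();
        }
    }

    void write(String row) {
        shared.insert(table, row);
    }

    void close() {
        shared.close(); // also closes it for every other store in this JVM
    }
}

class SharedConnectionDemo {
    // Returns true if the second store fails once the first one has closed.
    static boolean secondStoreBreaks() {
        SharedConnectionStore a = new SharedConnectionStore("table_a");
        SharedConnectionStore b = new SharedConnectionStore("table_b");
        a.write("row1");
        a.close();
        try {
            b.write("row2");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second store breaks: " + secondStoreBreaks());
    }
}
```

JVM reuse is not needed to trigger this: both stores live in the same task, so they see the same static field even when every task gets a fresh JVM.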

Re: Pig optimization getting in the way?

2011-02-18 Thread Dmitriy Ryaboy
Let me guess -- you have a static JDBC connection that you open in myJDBC, and you have JVM reuse turned on.
On Fri, Feb 18, 2011 at 1:41 PM, Dexin Wang wrote:
> I ran into a problem that I have spent quite some time on and start to think
> it's probably pig's doing something optimization that

Pig optimization getting in the way?

2011-02-18 Thread Dexin Wang
I ran into a problem that I have spent quite some time on, and I am starting to think Pig is probably doing some optimization that makes this thing hard. This is my pseudo code:
raw = LOAD ...
then some crazy stuff like filter join group UDF etc
A = the result from above operation
STORE A INTO