Hi All,

I am using Derby as an embedded database within a Hadoop job to look up IP
geographic info:
http://mpouttuclarke.wordpress.com/2010/12/10/java-embedded-db-for-ip2location-in-hadoop/

The problem is that Hadoop has an option called JVM sharing, where more than
one thread may be active in the JVM instance.  Since Derby's embedded mode
only supports one thread at a time, I have had to turn off JVM sharing for
my IP lookup job (by setting mapred.job.reuse.jvm.num.tasks to 1).
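
For reference, a sketch of how that setting would look as a property in the
job configuration XML (the property name is the one above; as far as I know,
1 is also the Hadoop default and means each JVM runs a single task, while -1
would allow unlimited reuse):

```xml
<!-- Run at most one task per child JVM, i.e. no JVM reuse -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>1</value>
</property>
```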

I have already tried embedding the network server and client with a
localhost loopback, and it made the code 10 times slower.  So if anyone has
been able to overcome this limitation without using the network server and
client, I would be very interested to hear how.

Thanks,
Matt


