Database connection pooling for a recommendation engine

2013-06-05 Thread Mike W.
Hello,

I am considering implementing a recommendation engine for a small
website. The website will use a LAMP stack, and for various reasons the
recommendation engine must be written in C++. It consists of an On-line
Component and an Off-line Component, both of which need to connect to
MySQL. The difference is that the On-line Component will need a connection
pool, whereas a few persistent connections, or even connecting on demand,
would be sufficient for the Off-line Component, since it does not have to
deliver real-time performance under concurrent requests the way the
On-line Component does.

The On-line Component is to be wrapped as a web service via Apache AXIS2.
The PHP frontend app on the Apache HTTP Server retrieves recommendation
data from this web service module.

There are two DB connection options for the On-line Component I can think of:
1. Use an ODBC connection pool; I think unixODBC might be a candidate.
2. Use the connection pool APIs that come as part of the Apache HTTP Server;
   mod_dbd would be a choice: http://httpd.apache.org/docs/2.2/mod/mod_dbd.html
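
For illustration only, here is a minimal sketch of what a connection pool
does, independent of either option above. It is not unixODBC's or mod_dbd's
API. FakeConnection, ConnectionPool, Lease and handle_request are
hypothetical names invented for this example, and FakeConnection stands in
for a real ODBC or MySQL handle so the code compiles without a database.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real database handle. It only carries an id.
struct FakeConnection {
    int id;
};

// Fixed-size, thread-safe pool. Conn can be any connection type.
// The pool must outlive every Lease it hands out.
template <typename Conn>
class ConnectionPool {
public:
    using Factory = std::function<std::unique_ptr<Conn>()>;

    // Opens all connections up front by calling the factory `size` times.
    ConnectionPool(std::size_t size, Factory make) {
        for (std::size_t i = 0; i < size; ++i) idle_.push_back(make());
    }

    // RAII lease: hands the connection back to the pool when destroyed.
    class Lease {
    public:
        Lease(ConnectionPool& pool, std::unique_ptr<Conn> conn)
            : pool_(&pool), conn_(std::move(conn)) {}
        Lease(Lease&&) = default;
        Lease& operator=(Lease&&) = delete;
        ~Lease() { if (conn_) pool_->release(std::move(conn_)); }

        // False when acquire() timed out and no connection is held.
        explicit operator bool() const { return conn_ != nullptr; }
        Conn* operator->() { return conn_.get(); }

    private:
        ConnectionPool* pool_;
        std::unique_ptr<Conn> conn_;
    };

    // Waits until a connection is idle or the timeout expires.
    // Returns an empty Lease on timeout.
    Lease acquire(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!available_.wait_for(lock, timeout,
                                 [this] { return !idle_.empty(); })) {
            return Lease(*this, nullptr);
        }
        std::unique_ptr<Conn> conn = std::move(idle_.back());
        idle_.pop_back();
        return Lease(*this, std::move(conn));
    }

    std::size_t idle_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    void release(std::unique_ptr<Conn> conn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            idle_.push_back(std::move(conn));
        }
        available_.notify_one();  // wake one thread waiting in acquire()
    }

    std::mutex mutex_;
    std::condition_variable available_;
    std::vector<std::unique_ptr<Conn>> idle_;
};

// Usage sketch: lease a connection for one request; RAII returns it.
inline bool handle_request(ConnectionPool<FakeConnection>& pool) {
    auto lease = pool.acquire(std::chrono::milliseconds(50));
    if (!lease) return false;  // pool exhausted within the timeout
    // ... run queries through lease-> here ...
    return lease->id >= 0;
}
```

In this sketch the Off-line Component could use the same class with a size
of one, or skip the pool and open a connection when it needs one.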

As for the Off-line Component, a simple option is a direct connection
using ODBC.

Due to my lack of web app design experience, I have the following questions:

Option 1 for the On-line Component is a tightly coupled design that does
not take advantage of the pooling APIs in the Apache HTTP Server. But if I
choose Option 2 (3-tiered architecture), with the engine as a standalone
component apart from the Apache HTTP Server, how can it use the server's
connection pool APIs?

A Java application can be deployed as a WAR file in a servlet container
such as Tomcat (see Mahout in Action, section 5.5), or it can use
org.apache.mahout.cf.taste.impl.model.jdbc.ConnectionPoolDataSource
(https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation).
Is there a similar approach for my C++ recommendation engine?

I am not sure whether I have designed this prototype properly. Any
suggestions will be appreciated :)

Thanks,

Mike


Re: Database connection pooling for a recommendation engine

2013-06-05 Thread Sean Owen
Not sure, is this really related to Mahout?

I don't know of an equivalent of J2EE / Tomcat for C++, but there must
be something.

As a general principle, you will have to load your data into memory if
you want to perform the computations on the fly in real time. So how
you access the database isn't so important, because you will be reading
all of the data at once rather than querying it per request.
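
A rough sketch of that idea in C++, under stated assumptions: the rows are
fetched once in bulk (for example, one SELECT at startup) and every
recommendation is then computed from memory. PreferenceRow and
InMemoryModel are hypothetical names, and the scoring rule is a
deliberately simple overlap heuristic chosen for illustration, not
Mahout's algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// One row of a hypothetical preferences table: (user_id, item_id, rating).
struct PreferenceRow {
    long user_id;
    long item_id;
    double rating;
};

class InMemoryModel {
public:
    // Built once from rows that were read from the database in bulk.
    explicit InMemoryModel(const std::vector<PreferenceRow>& rows) {
        for (const PreferenceRow& r : rows) {
            by_user_[r.user_id][r.item_id] = r.rating;
        }
    }

    // Scores every item the user has not rated by summing the ratings given
    // by users who share at least one item with them, then returns the ids
    // of the n best-scoring items. No database access happens here.
    std::vector<long> recommend(long user_id, std::size_t n) const {
        std::vector<long> result;
        auto me = by_user_.find(user_id);
        if (me == by_user_.end()) return result;  // unknown user

        std::unordered_map<long, double> score;
        for (const auto& other : by_user_) {
            if (other.first == user_id) continue;
            bool overlaps = false;
            for (const auto& pref : other.second) {
                if (me->second.count(pref.first)) { overlaps = true; break; }
            }
            if (!overlaps) continue;
            for (const auto& pref : other.second) {
                if (!me->second.count(pref.first)) {
                    score[pref.first] += pref.second;
                }
            }
        }

        // Highest score first; ties broken by the smaller item id.
        std::vector<std::pair<long, double>> ranked(score.begin(), score.end());
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<long, double>& a,
                     const std::pair<long, double>& b) {
                      if (a.second != b.second) return a.second > b.second;
                      return a.first < b.first;
                  });
        for (std::size_t i = 0; i < ranked.size() && i < n; ++i) {
            result.push_back(ranked[i].first);
        }
        return result;
    }

private:
    // user_id -> (item_id -> rating)
    std::unordered_map<long, std::unordered_map<long, double>> by_user_;
};
```

With a model like this, the database is touched only when the model is
built or refreshed, so the choice of pooling mechanism mainly affects the
reload path rather than each request.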


Re: Database connection pooling for a recommendation engine

2013-06-05 Thread Manuel Blechschmidt
Hi Mike,
The following paper contains some comparisons between different database stacks.

I can also give you the QtSQL code if you are interested in it.

http://www.manuel-blechschmidt.de/data/MMRPG2.pdf

/Manuel

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B