Database connection pooling for a recommendation engine
Hello,

I am considering implementing a recommendation engine for a small website. The website will use the LAMP stack, and for various reasons the recommendation engine must be written in C++. It consists of an On-line Component and an Off-line Component, both of which need to connect to MySQL. The difference is that the On-line Component will need a connection pool, whereas a few persistent connections, or even connecting on demand, would be sufficient for the Off-line Component, since it does not have to deliver real-time performance under concurrent requests as the On-line Component does.

The On-line Component is to be wrapped as a web service via Apache AXIS2. The PHP frontend app on Apache HTTP Server retrieves recommendation data from this web service module.

There are two DB connection options for the On-line Component that I can think of:

1. Use an ODBC connection pool. I think unixODBC might be a candidate.
2. Use the connection pool APIs that come as part of Apache HTTP Server. mod_dbd would be a choice: http://httpd.apache.org/docs/2.2/mod/mod_dbd.html

As for the Off-line Component, a simple option is a direct connection using ODBC.

Because I lack web app design experience, I have the following questions:

Option 1 for the On-line Component is a tightly coupled design that does not take advantage of the pooling APIs in Apache HTTP Server. But if I choose Option 2 (a 3-tier architecture), how can the engine, as a standalone component separate from Apache HTTP Server, use its connection pool APIs?

A Java application can be deployed as a WAR file and run in a servlet container such as Tomcat (see Mahout in Action, section 5.5), or it can use org.apache.mahout.cf.taste.impl.model.jdbc.ConnectionPoolDataSource (https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation). Is there any similar approach for my C++ recommendation engine?

I am not sure whether my prototype design is appropriate. Any suggestions will be appreciated :)

Thanks,
Mike
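For illustration only, the kind of connection pool the On-line Component would need can be sketched in plain C++ without committing to unixODBC or mod_dbd. This is a minimal sketch under stated assumptions: FakeConnection, ConnectionPool, Lease and idle_count are hypothetical names invented here, not part of any library mentioned in this thread, and a real pool would open ODBC or MySQL client handles in the factory.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real database handle; it only carries an id.
struct FakeConnection {
    int id;
};

// Fixed-size, thread-safe pool. The factory is called `size` times up
// front, so no connection is opened on the request path.
// Assumption: the pool outlives every Lease it hands out.
template <typename Conn>
class ConnectionPool {
public:
    ConnectionPool(std::size_t size,
                   std::function<std::unique_ptr<Conn>()> factory) {
        assert(size > 0);
        for (std::size_t i = 0; i < size; ++i) idle_.push_back(factory());
    }

    // RAII lease: hands the connection back to the pool when destroyed.
    class Lease {
    public:
        Lease(ConnectionPool& pool, std::unique_ptr<Conn> conn)
            : pool_(&pool), conn_(std::move(conn)) {}
        Lease(Lease&&) = default;
        Lease(const Lease&) = delete;
        Lease& operator=(const Lease&) = delete;
        ~Lease() {
            if (conn_) pool_->release(std::move(conn_));
        }
        Conn& operator*() { return *conn_; }
        Conn* operator->() { return conn_.get(); }

    private:
        ConnectionPool* pool_;
        std::unique_ptr<Conn> conn_;
    };

    // Blocks until a connection is idle, then leases it to the caller.
    Lease acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return !idle_.empty(); });
        std::unique_ptr<Conn> conn = std::move(idle_.back());
        idle_.pop_back();
        return Lease(*this, std::move(conn));
    }

    // Number of connections currently not leased out.
    std::size_t idle_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    void release(std::unique_ptr<Conn> conn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            idle_.push_back(std::move(conn));
        }
        available_.notify_one();  // wake one thread waiting in acquire()
    }

    mutable std::mutex mutex_;
    std::condition_variable available_;
    std::vector<std::unique_ptr<Conn>> idle_;
};
```

Each request-handling thread would call acquire(), run its query through the lease, and let the lease go out of scope. The same shape could also serve the Off-line Component with a pool size of one or two.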
Re: Database connection pooling for a recommendation engine
In reply to Mike W. liansh...@gmail.com, Wed, Jun 5, 2013 at 12:44 PM:

I'm not sure this is really related to Mahout. I don't know of an equivalent of J2EE / Tomcat for C++, but there must be something.

As a general principle, you will have to load your data into memory if you want to perform the computations on the fly in real time. So how you access the data isn't so important, because you will be reading it all at once.
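The point about loading the data into memory can be illustrated with a small C++ sketch: the rows are read once (for example over a single connection at startup) and every later request is answered from memory without touching the database. Everything here is an assumption made for illustration. PreferenceRow, InMemoryModel and the column names are hypothetical, and the simple co-occurrence score is only a stand-in, not Mahout's algorithm or the one Mike intends to use.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// One row as it might come back from a single bulk query
// (hypothetical columns: user_id, item_id).
struct PreferenceRow {
    int user_id;
    int item_id;
};

class InMemoryModel {
public:
    // Called once with the full result set of the bulk read.
    explicit InMemoryModel(const std::vector<PreferenceRow>& rows) {
        for (const PreferenceRow& r : rows) {
            items_by_user_[r.user_id].insert(r.item_id);
            users_by_item_[r.item_id].insert(r.user_id);
        }
    }

    // Scores each item the user has not seen by counting how often it
    // appears among users who share an item with `user_id`, then returns
    // the best `how_many` item ids (ties broken by smaller item id).
    std::vector<int> recommend(int user_id, std::size_t how_many) const {
        std::vector<int> result;
        auto mine = items_by_user_.find(user_id);
        if (mine == items_by_user_.end()) return result;  // unknown user

        std::unordered_map<int, int> score;
        for (int item : mine->second) {
            for (int other : users_by_item_.at(item)) {
                if (other == user_id) continue;
                for (int candidate : items_by_user_.at(other)) {
                    if (mine->second.count(candidate) == 0) ++score[candidate];
                }
            }
        }

        std::vector<std::pair<int, int>> ranked(score.begin(), score.end());
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<int, int>& a,
                     const std::pair<int, int>& b) -> bool {
                      if (a.second != b.second) return a.second > b.second;
                      return a.first < b.first;
                  });
        for (std::size_t i = 0; i < ranked.size() && i < how_many; ++i) {
            result.push_back(ranked[i].first);
        }
        return result;
    }

private:
    std::unordered_map<int, std::unordered_set<int>> items_by_user_;
    std::unordered_map<int, std::unordered_set<int>> users_by_item_;
};
```

With this shape the database is only read when the model is built or refreshed, so a single connection would be enough for that step, whichever access layer is chosen.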
Re: Database connection pooling for a recommendation engine
In reply to Mike W., 05.06.2013 at 13:44:

Hi Mike,

the following paper contains some comparisons between different database stacks. I can also give you the QtSQL code if you are interested in it.

http://www.manuel-blechschmidt.de/data/MMRPG2.pdf

/Manuel

--
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobile: 0173/6322621
Twitter: http://twitter.com/Manuel_B