I forgot to mention that I also tried running the controller and the DB on the same machine. With JDBCBench, I only run a read test with simple queries. The number of clients varies from 5 to 200 and the number of reads from 100 to 10000. All the tests gave the same performance (the number of txn/s depends on where the controller is located, but in all cases the controller is at 100% load). I also experimented with the maximum number of clients allowed, and the controller stops being overloaded only with 3 parallel clients...
From the console, with "show backend *", I can see that the queries are dispatched among the backends. Only when using the Dual-Xeon host as controller and two notebooks (slow hard disks) as backends can I see that the backends are saturated. The bad performance persists even with only one controller in the network (in that case I assume, though I may be wrong, that the group communication protocol does not play an important role). For the communication protocol, I always used the default configuration files shipped with the Sequoia archive.

In case it is useful, my configuration follows.

Thanks,
Danilo Levantesi

myDB-raidb1.xml:

<SEQUOIA>
  <VirtualDatabase name="myDB" maxNbOfConnections="15" minNbOfThreads="10" maxNbOfThreads="10">
    <Distribution hederaPropertiesFile="/hedera_appia.properties">
      <MessageTimeouts/>
    </Distribution>
    <Backup>
      <Backuper backuperName="pgdump"
                className="org.continuent.sequoia.controller.backup.backupers.PostgreSQLBinaryBackuper" />
    </Backup>
    <AuthenticationManager>
      <Admin>
        <User username="admin" password=""/>
      </Admin>
      <VirtualUsers>
        <VirtualLogin vLogin="sequoia" vPassword="sequoia"/>
      </VirtualUsers>
    </AuthenticationManager>
    <DatabaseBackend name="backend1" driver="org.postgresql.Driver"
                     url="jdbc:postgresql://10.0.0.101:5432/sequoia"
                     connectionTestStatement="select now()">
      <DatabaseSchema dynamicPrecision="table" />
      <ConnectionManager vLogin="sequoia" rLogin="sequoia" rPassword="sequoia">
        <VariablePoolConnectionManager initPoolSize="10" minPoolSize="5" maxPoolSize="15"
                                       idleTimeout="30" waitTimeout="10" />
      </ConnectionManager>
    </DatabaseBackend>
    <DatabaseBackend name="backend2" driver="org.postgresql.Driver"
                     url="jdbc:postgresql://10.0.0.102:5432/sequoia"
                     connectionTestStatement="select now()">
      <DatabaseSchema dynamicPrecision="table" />
      <ConnectionManager vLogin="sequoia" rLogin="sequoia" rPassword="sequoia">
        <VariablePoolConnectionManager initPoolSize="10" minPoolSize="5" maxPoolSize="15"
                                       idleTimeout="30" waitTimeout="10" />
      </ConnectionManager>
    </DatabaseBackend>
    <RequestManager>
      <RequestScheduler>
        <RAIDb-1Scheduler level="passThrough"/>
      </RequestScheduler>
      <RequestCache>
        <MetadataCache/>
        <ParsingCache/>
        <ResultCache granularity="table"/>
      </RequestCache>
      <LoadBalancer>
        <RAIDb-1>
          <WaitForCompletion policy="first"/>
          <RAIDb-1-LeastPendingRequestsFirst/>
        </RAIDb-1>
      </LoadBalancer>
      <RecoveryLog driver="org.postgresql.Driver"
                   url="jdbc:postgresql://127.0.0.1:5432/sequoia_recovery"
                   login="sequoia" password="sequoia">
        <RecoveryLogTable tableName="RECOVERY"
                          logIdColumnType="BIGINT NOT NULL"
                          vloginColumnType="VARCHAR NOT NULL"
                          sqlColumnType="VARCHAR NOT NULL"
                          extraStatementDefinition=",PRIMARY KEY (log_id)"/>
        <CheckpointTable tableName="CHECKPOINT"
                         checkpointNameColumnType="VARCHAR NOT NULL"/>
        <BackendTable tableName="BACKEND"
                      databaseNameColumnType="VARCHAR NOT NULL"
                      backendNameColumnType="VARCHAR NOT NULL"
                      checkpointNameColumnType="VARCHAR NOT NULL"/>
        <DumpTable tableName="DUMP"
                   dumpNameColumnType="VARCHAR NOT NULL"
                   dumpDateColumnType="TIMESTAMP"
                   dumpPathColumnType="VARCHAR NOT NULL"
                   dumpFormatColumnType="VARCHAR NOT NULL"
                   checkpointNameColumnType="VARCHAR NOT NULL"
                   backendNameColumnType="VARCHAR NOT NULL"
                   tablesColumnType="VARCHAR NOT NULL"/>
      </RecoveryLog>
    </RequestManager>
  </VirtualDatabase>
</SEQUOIA>

On Thursday 24 April 2008 18:54:40 Emmanuel Cecchet wrote:
> Hi Danilo,
>
> You should avoid using Sequoia 3, it is not supported anymore. Use the
> latest 2.10 branch code until 2.10.10 is released.
> What is the workload generated by JDBCBench?
> How many reads? How many writes? How much concurrency (how many clients
> in parallel)?
> Given the throughput you obtain with a single database (3500/sec), I
> guess the queries are extremely simple.
> The fact that your backends are not doing much with Sequoia is probably
> because you have a lot of writes that need to be serialized, and it is
> expected that writes don't scale with Sequoia (we have to eliminate the
> indeterminism of the workload by total ordering).
> The high cpu load on the controller is likely due to a bad group
> communication configuration. Other possibilities are a bug or another
> misconfiguration (avoid caching if you have a lot of writes).
> Also, if a single node is not saturated by the workload, you won't see any
> improvement by adding nodes.
> Using dedicated nodes adds network latency, so that can also have a
> significant impact, especially if your workload has limited parallelism.
>
> Don't hesitate to give us more details about your JDBCBench workload.
> Thanks for your feedback,
> Emmanuel
>
> > I am trying a RAIDb-1 Sequoia cluster. My configuration consists of:
> > - one or two dedicated controllers
> > - each controller supported by 2 PostgreSQL or MySQL hosts as backends
> > Each controller was tested with different operating systems:
> > - Slackware 12.0 (with kernel 2.6.21.5-smp SMP)
> > - Debian 4 (2.6.18-4-686 SMP)
> > - Gentoo (2.6.24)
> > - RedHat Enterprise 4 (2.6.9-5.ELsmp SMP)
> > - Windows XP Professional SP 2
> > with different Java VMs:
> > - J2RE 1.4.2
> > - JDK 1.5.0_15
> > - JDK 1.6
> > and with different hardware:
> > - 2 [EMAIL PROTECTED], 2Gb Ram
> > - Pentium 4 [EMAIL PROTECTED], 2 Gb Ram
> > - Pentium 4 @3GHz, 512 Mb Ram
> > - Virtual Machine
> >
> > With all the configurations, using the tool JDBCBench (executed from a
> > dedicated host), I noticed a big cpu load on the controller and very poor
> > performance.
> > For example, with direct access to a single PostgreSQL node I can reach
> > 3500 transactions/sec, instead of the 800 txn/s obtained by a controller
> > with 2 PostgreSQL hosts.
> > In the latter case, the DB hosts are not overloaded: PostgreSQL (and
> > MySQL) uses about 5% CPU and generates only 150Kbyte/s network traffic.
> >
> > I tried to use both JGroups and Appia (both SEQ and TOKEN, UDP and TCP)
> > with the same results (only few differences, but always with high cpu
> > load).
> >
> > I also tried Sequoia 2.10.9, 3.0-beta2 and 3.0-beta3 (I can't use cvs
> > version because I get a null pointer exception in Controller.java:185, in
> > function sendJmxNotification(), with respect to
> > notificationBroadcasterSupport).
> >
> > I used the last available JDBC drivers for PostgreSQL
> > (postgresql-8.2-508.jdbc3.jar) and for MySQL
> > (mysql-connector-java-5.1.5-bin.jar).
> >
> > I also enabled cache and disabled SQL monitoring, without big
> > improvements. Also tried different schedulers (RoundRobin and
> > LeastPendingRequestsFirst), with both WaitForCompletion policy "all" and
> > "first".
> >
> > From the published presentations, I can read that the performance with
> > Sequoia should increase, whereas in my case it decreases (I need about 3
> > controllers, each one with 2 backends, to reach the same performance of a
> > single PostgreSQL host). Searching in the mailing list I read some users
> > with the same problems, but I could not find a solution.
> >
> > Any help is very appreciated.
> >
> > Yours faithfully,
> > Danilo Levantesi
> >
> > _______________________________________________
> > Sequoia mailing list
> > [email protected]
> > https://forge.continuent.org/mailman/listinfo/sequoia
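
One way to check the single-controller assumption made at the top of this message (that group communication plays no role when only one controller is running) might be to start the virtual database without the Distribution element at all. The fragment below is only a sketch of that idea. It reuses the names from myDB-raidb1.xml and assumes that Distribution is optional in the DTD of the Sequoia release in use, which should be verified against the sequoia.dtd shipped with that release.

```xml
<SEQUOIA>
  <VirtualDatabase name="myDB" maxNbOfConnections="15" minNbOfThreads="10" maxNbOfThreads="10">
    <!-- Assumption: with no <Distribution> element the controller runs
         standalone, so the Hedera/Appia/JGroups settings are not used. -->
    <Backup>
      <Backuper backuperName="pgdump"
                className="org.continuent.sequoia.controller.backup.backupers.PostgreSQLBinaryBackuper" />
    </Backup>
    <!-- AuthenticationManager, both DatabaseBackend sections and
         RequestManager stay exactly as in myDB-raidb1.xml. -->
  </VirtualDatabase>
</SEQUOIA>
```

If the controller still sits at 100% CPU with a configuration like this, the group communication settings are unlikely to be the main cause, and the remaining suspects from the reply quoted above (a bug or another misconfiguration) become more plausible.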
