Help with multi-AZ RDS failover
So, I have a wonderful setup right now with JRuby, torquebox, Roda and Sequel connecting to PostgreSQL RDS instance in AWS. It is the most productive stack I have worked on, it is truly beautiful to work on these JSON/API micro services I have built. :) One minor issue I am having is with multi-AZ failover. I am running some tests with my multi-AZ RDS, and I am trying to minimize my downtime when an outage occurs, or if I were to "upgrade" and resize the RDS database instance (thus requiring a failover/reboot of the multi-AZ). Right now when I initiate the reboot/failover of RDS - my API service takes requests and just blocks as it is trying to connect to the DB. I have tried playing around with different configuration options in Sequel - but none seem to really affect anything. Basically the DB status says "rebooting" and each incoming request blocks, eventually responding with 504 after about 60 seconds. At this point (or shortly after), the RDS database status changes to "available", yet it still takes another 45 seconds before requests start responding normally again. Here is a sample of my top level config for the Sequel.connect: DB = Sequel.connect(:adapter => 'postgres', :host => contacts_db_host,:user => contacts_db_user, :password => contacts_db_pwd, :database => contacts_db_name, :servers => {:node_1 => DB_NODE_LIST[:node_1], :node_2 => DB_NODE_LIST[:node_2], :read_only =>{}}, :servers_hash=>Hash.new{|h,v| raise Exception.new("Unknown server: #{v}")}, :loggers => [SimpleLogger.logger], :max_connections => 25, :connect_timeout => 5, :pool_timeout => 5) I should note that I am running in jRuby 9.1.2.0 and with pg_jruby driver. Adding/changing :connect_timeout => 5, :pool_timeout => 5 seems to have no effect Surprisingly I also added this line right after my config setup to enable connection_validation in hopes that once it tried to fetch a connection that was down it would quickly try again, but this made zero difference as well. DB.extension :connection_validator #use only for faster DB related scheduled outages DB.pool.connection_validation_timeout = -1 I have also adjusted the DNS cache TTL according to this doc to be 15 seconds => http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html Here are the errors I see in the log: 16:45:05 + severity=ERROR, error is PG::ConnectionBad: Connection timed out, error backtrace => org/jruby/pg/Connection.java:366:in `initialize' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:244:in `connect' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/server_logging.rb:40:in `connect' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool.rb:116:in `make_new' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:286:in `make_new' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:241:in `available' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:181:in `_acquire' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:195:in `block in acquire' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in `block in sync' [NUQSO2] org/jruby/ext/thread/Mutex.java:151:in `synchronize' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/threaded.rb:282:in `sync' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:194:in `acquire' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/extensions/connection_validator.rb:98:in `acquire' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/connection_pool/sharded_threaded.rb:132:in `hold' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/database/connecting.rb:285:in `synchronize' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/postgres.rb:838:in `literal_string_append' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:79:in `literal_append' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/dataset/sql.rb:491:in `complex_expression_sql_append' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/adapters/shared/postgres.rb:1302:in `complex_expression_sql_append' [NUQSO2] /opt/api-contacts/api-bundle/jruby/2.3.0/gems/sequel-4.43.0/lib/sequel/model/associations.rb:2587:in `complex_expression_sql_append'
Migrations with sharded databases question
I've been able to successfuly setup my sharded databases and load records into it with my custom sharding scheme. So far I run Sequel::Migrator.apply(DB, './migrate') and it picks up my migration scripts. In my migration script I am doing this: Sequel.migration do up do DB.each_server do |db| db.create_table(:blah) ... end end And it works great - however I notice that the schema_info table is only in my :default database (the first shard if you will). Is this expected? My table is created with the proper schema in the "second" database shard, and read/writes work fine - so everything is working. Just curious if its expected to only write out the schema_info table on one of the databases. Thanks! -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to sequel-talk+unsubscr...@googlegroups.com. To post to this group, send email to sequel-talk@googlegroups.com. Visit this group at https://groups.google.com/group/sequel-talk. For more options, visit https://groups.google.com/d/optout.
Re: Sharding issues
Yes, that seemed to fix it. Thanks!! J On Thursday, January 12, 2017 at 4:16:17 PM UTC-5, Jeremy Evans wrote: > > On Thursday, January 12, 2017 at 12:52:17 PM UTC-8, Jay Danielian wrote: >> >> Thanks for the response, you are a lifesaver! Yes, removing the plugin >> seems to work. >> >> I can confirm with tracing it through it seems like the issue is in the >> prepared_statements plugin. The code dataset.opts.fetch(:server, :default) >> seems to miss the supplied :server key (or set_server key) as dataset.opts >> is missing the supplied value, so I think :default is always returned. >> >> # Return a prepared statement that can be used to insert a row >> using the given columns >> # and return that column values for the row created. >> def prepared_insert_select(cols) >> if dataset.supports_insert_select? >> cached_prepared_statement(:insert_select, >> prepared_columns(cols)){prepare_explicit_statement(naked.clone(:server=>dataset.opts.fetch(:server, >> >> :default)), :insert_select, prepared_statement_key_hash(cols))} >> end >> end >> >> It *seems* like these methods that do the preparation >> _insert_raw, _insert_select_raw, _refresh_get would work if inside those >> methods there was something like >> >> model.dataset.opts.merge!(ds.opts) >> >> before calling the model.send(:prepared_blah...) would work as I could >> see the ds.opts variable having the :server key set but the model.dataset >> did not (which is what is being used to pick the server). >> >> Thanks for the great support! >> > > I've fixed the issues in the prepared_statements plugin (and related > plugins), could you please try the master branch and see if that fixes the > issues for you? > > Thanks, > Jeremy > -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to sequel-talk+unsubscr...@googlegroups.com. To post to this group, send email to sequel-talk@googlegroups.com. Visit this group at https://groups.google.com/group/sequel-talk. For more options, visit https://groups.google.com/d/optout.
Re: Sharding issues
Thanks for the response, you are a lifesaver! Yes, removing the plugin seems to work. I can confirm with tracing it through it seems like the issue is in the prepared_statements plugin. The code dataset.opts.fetch(:server, :default) seems to miss the supplied :server key (or set_server key) as dataset.opts is missing the supplied value, so I think :default is always returned. # Return a prepared statement that can be used to insert a row using the given columns # and return that column values for the row created. def prepared_insert_select(cols) if dataset.supports_insert_select? cached_prepared_statement(:insert_select, prepared_columns(cols)){prepare_explicit_statement(naked.clone(:server=>dataset.opts.fetch(:server, :default)), :insert_select, prepared_statement_key_hash(cols))} end end It *seems* like these methods that do the preparation _insert_raw, _insert_select_raw, _refresh_get would work if inside those methods there was something like model.dataset.opts.merge!(ds.opts) before calling the model.send(:prepared_blah...) would work as I could see the ds.opts variable having the :server key set but the model.dataset did not (which is what is being used to pick the server). Thanks for the great support! J On Thursday, January 12, 2017 at 3:20:52 PM UTC-5, Jeremy Evans wrote: > > On Thursday, January 12, 2017 at 11:53:50 AM UTC-8, Jay Danielian wrote: >> >> After adding DB.extension :server_logging to get further details about >> what is executing where I see a potential issue (question is still why). >> >> Here is what I see in the logs. Looks interesting, I have bolded the >> relevant where it looks like it is starting on node_2, then does the actual >> execution on the :default node. >> >> I, [2017-01-12T14:47:26.377432 #3176] INFO -- : (0.000112s) (conn: >> 70118877855900, *server: node_2*) BEGIN >> I, [2017-01-12T14:47:26.382943 #3176] INFO -- : (0.001143s) (conn: >> 70118870257360, *server: default*) PREPARE smpsp_1 AS INSERT INTO >> "contacts" ("shard_id", "owner_id", "first_name", "last_name", "job_title") >> VALUES ($1, $2, $3, $4, $5) RETURNING "id", "shard_id", "owner_id", >> "unified_id", "external_id", "source_id", "created_on", "updated_on", >> "prefix", "first_name", "last_name", "middle_name", "suffix", >> "company_name", "job_title", "home_email_1", "home_email_2", >> "home_email_3", "work_email_1", "work_email_2", "work_email_3", >> "work_phone", "work_phone_direct", "work_fax", "home_fax", >> "mobile_phone_1", "mobile_phone_2", "home_phone_1", "home_phone_2", >> "home_street_1", "home_city_1", "home_state_1", "home_postal_code_1", >> "home_country_1", "home_street_2", "home_city_2", "home_state_2", >> "home_postal_code_2", "home_country_2", "work_street_1", "work_city_1", >> "work_state_1", "work_postal_code_1", "work_country_1", "work_street_2", >> "work_city_2", "work_state_2", "work_postal_code_2", "work_country_2", >> "is_deleted", "lock_version", "properties", "notes" >> I, [2017-01-12T14:47:26.392660 #3176] INFO -- : (0.009518s) (conn: >> 70118870257360,* server: default*) EXECUTE smpsp_1 (INSERT INTO >> "contacts" ("shard_id", "owner_id", "first_name", "last_name", "job_title") >> VALUES ($1, $2, $3, $4, $5) RETURNING "id", "shard_id", "owner_id", >> "unified_id", "external_id", "source_id", "created_on", "updated_on", >> "prefix", "first_name", "last_name", "middle_name", "suffix", >> "company_name", "job_title", "home_email_1", "home_email_2", >> "home_email_3", "work_email_1", "work_email_2", "work_email_3", >> "work_phone", "work_phone_direct", "work_fax", "home_fax", >> "mobile_phone_1", "mobile_phone_2", "home_phone_1", "home_phone_2", >> "home_street_1", "home_city_1", "hom
Re: Sharding issues
After adding DB.extension :server_logging to get further details about what is executing where I see a potential issue (question is still why). Here is what I see in the logs. Looks interesting, I have bolded the relevant where it looks like it is starting on node_2, then does the actual execution on the :default node. I, [2017-01-12T14:47:26.377432 #3176] INFO -- : (0.000112s) (conn: 70118877855900, *server: node_2*) BEGIN I, [2017-01-12T14:47:26.382943 #3176] INFO -- : (0.001143s) (conn: 70118870257360, *server: default*) PREPARE smpsp_1 AS INSERT INTO "contacts" ("shard_id", "owner_id", "first_name", "last_name", "job_title") VALUES ($1, $2, $3, $4, $5) RETURNING "id", "shard_id", "owner_id", "unified_id", "external_id", "source_id", "created_on", "updated_on", "prefix", "first_name", "last_name", "middle_name", "suffix", "company_name", "job_title", "home_email_1", "home_email_2", "home_email_3", "work_email_1", "work_email_2", "work_email_3", "work_phone", "work_phone_direct", "work_fax", "home_fax", "mobile_phone_1", "mobile_phone_2", "home_phone_1", "home_phone_2", "home_street_1", "home_city_1", "home_state_1", "home_postal_code_1", "home_country_1", "home_street_2", "home_city_2", "home_state_2", "home_postal_code_2", "home_country_2", "work_street_1", "work_city_1", "work_state_1", "work_postal_code_1", "work_country_1", "work_street_2", "work_city_2", "work_state_2", "work_postal_code_2", "work_country_2", "is_deleted", "lock_version", "properties", "notes" I, [2017-01-12T14:47:26.392660 #3176] INFO -- : (0.009518s) (conn: 70118870257360,* server: default*) EXECUTE smpsp_1 (INSERT INTO "contacts" ("shard_id", "owner_id", "first_name", "last_name", "job_title") VALUES ($1, $2, $3, $4, $5) RETURNING "id", "shard_id", "owner_id", "unified_id", "external_id", "source_id", "created_on", "updated_on", "prefix", "first_name", "last_name", "middle_name", "suffix", "company_name", "job_title", "home_email_1", "home_email_2", "home_email_3", "work_email_1", "work_email_2", "work_email_3", "work_phone", "work_phone_direct", "work_fax", "home_fax", "mobile_phone_1", "mobile_phone_2", "home_phone_1", "home_phone_2", "home_street_1", "home_city_1", "home_state_1", "home_postal_code_1", "home_country_1", "home_street_2", "home_city_2", "home_state_2", "home_postal_code_2", "home_country_2", "work_street_1", "work_city_1", "work_state_1", "work_postal_code_1", "work_country_1", "work_street_2", "work_city_2", "work_state_2", "work_postal_code_2", "work_country_2", "is_deleted", "lock_version", "properties", "notes"); [6992, "9775d295-b58f-4452-81a9-b49672b0fb9e", "Frank", "Jones", "Intern"] I, [2017-01-12T14:47:26.395028 #3176] INFO -- : (0.000212s) (conn: 70118877855900, *server: node_2*) COMMIT Thanks! J On Thursday, January 12, 2017 at 2:23:42 PM UTC-5, Jay Danielian wrote: > > Quick update - I have tried to simplify things further, yet it still is > not working. I realized from before that I only need the symbol key that > has the different sharding config attached to it. But even using the > :servers_hash option to tell me when my shard :server is not found, it > still does not work. I don't get an error when passing in the symbol > :node_2, so it seems to think its ok - yet the contact is still saved into > contact-db-1. > > DB = Sequel.connect(:adapter => 'postgres', :host => 'localhost',:user => > 'contact_user', :password => 'password', :database => 'contact-db-1', > :servers => {:node_1 => { :database => > 'contact-db-1'}, :node_2 => { :database => 'contact-db-2' }, :read_only > =>{}}, :servers_hash=>Hash.new{|h,v| raise Exception.new("Unkno
Re: Sharding issues
Quick update - I have tried to simplify things further, yet it still is not working. I realized from before that I only need the symbol key that has the different sharding config attached to it. But even using the :servers_hash option to tell me when my shard :server is not found, it still does not work. I don't get an error when passing in the symbol :node_2, so it seems to think its ok - yet the contact is still saved into contact-db-1. DB = Sequel.connect(:adapter => 'postgres', :host => 'localhost',:user => 'contact_user', :password => 'password', :database => 'contact-db-1', :servers => {:node_1 => { :database => 'contact-db-1'}, :node_2 => { :database => 'contact-db-2' }, :read_only =>{}}, :servers_hash=>Hash.new{|h,v| raise Exception.new("Unknown server: #{v}")}, :loggers => [Logger.new($stdout)]) neither contact.save(:server => :node_2).save or contact.set_server(:node_2).save seems to save the record into contact-db-2 database. Thanks! J On Thursday, January 12, 2017 at 11:26:26 AM UTC-5, Jay Danielian wrote: > > I am new to Sequel, so I am likely doing something wrong - but would > appreciate some guidance. I am trying to test out sharding locally by > having two postgres databases locally running. > > I start them as such: (I am able to connect to both using psql, so they > are running properly.) > > pg_ctl -D ~/work/postgres-db/contact-db-1 -l logfile start > > pg_ctl -D ~/work/postgres-db/contact-db-2 -l log file start > > > > I have connection information as follows: > > NODE_LIST = {} > > DEFAULT_DB_CONNECTION = > 'postgres://contact_user:pwd@localhost/contact-db-1' > > NODE_LIST[:node_1] = {:host => 'localhost',:user => 'contact_user', > :password => 'pwd', :database => 'contact-db-1'} > > NODE_LIST[:node_2] = {:host => 'localhost',:user => 'contact_user', > :password => 'pwd', :database => 'contact-db-2'} > > > DB = Sequel.connect(DEFAULT_DB_CONNECTION, :servers => NODE_LIST, :loggers > => [Logger.new($stdout)]) > > > I have code in a spec (just trying to test this out) like this: > > >contact = Contact.new do |c| > > c.owner_id = owner_id > > c.first_name = 'Frank' > > c.last_name = 'Jones' > > c.job_title = 'Intern' > >end > > >contact.set_server(NODE_LIST[:node_2]).save #initial save > > > Yet, I never see this record get committed to the contact-db-2 database. > All my inserts go into the contact-db-1 (which is the default). I can see > all my inserts when querying them via psql, so the basic functionality > seems to be working fine. Any pointers as to what I am doing wrong? Not > sure if it is an issue with having two identical nodes / hosts setup and > only varying the database name? Or I am just overlooking something even > more obvious :) > > > Thanks!! > > > > > -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to sequel-talk+unsubscr...@googlegroups.com. To post to this group, send email to sequel-talk@googlegroups.com. Visit this group at https://groups.google.com/group/sequel-talk. For more options, visit https://groups.google.com/d/optout.
change servers listing without restarting?
I am new to sequel, and its pretty amazing. I have a noob question, is it possible to change the :servers listing without restarting? My use case is I want to use the application level sharding support that sequel provides. When adding new capacity (databases/shards) I would like them to be available to my running application without needing to restart my server to begin using the new database hosts. Ideally I can tap into a method that reloads the existing connection information (with new :servers list) and the current connection pool gets invalidated and reloaded to use the new servers. I hope this makes sense. Thanks! -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to sequel-talk+unsubscr...@googlegroups.com. To post to this group, send email to sequel-talk@googlegroups.com. Visit this group at https://groups.google.com/group/sequel-talk. For more options, visit https://groups.google.com/d/optout.