Re: [galaxy-dev] Startup error, Galaxy DB got corrupted?

Leon Mei Wed, 19 Jun 2013 03:14:43 -0700

Dear Dannon,

Thanks for your suggestion!


We tried restoring from the backup copy made 10 days ago, and that worked!
We are keeping the postgres log and the failed postgres backup (made with
pg_dump in both custom and plain-text formats) for now, and will look into
what exactly happened during those 10 days.
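
A minimal Python sketch of how the error lines quoted below could be picked
apart when going through the logs. It is an illustration under stated
assumptions, not a tested recovery tool: the function names
(describe_block_error, missing_parent_keys) are made up for this example, it
assumes the usual base/<database OID>/<relfilenode> layout of the postgres
data directory, and it only handles the first file segment of a relation.
The generated queries are meant to be run by hand in psql, connected to the
affected database.

```python
import re

# Short-read error as printed by postgres, e.g.
#   could not read block 326 in file "base/257536/259291": read only 0 of 8192 bytes
BLOCK_RE = re.compile(
    r'could not read block (\d+) in file "base/(\d+)/(\d+)": '
    r'read only (\d+) of (\d+) bytes'
)

# DETAIL line of a foreign key violation, e.g.
#   Key (session_id)=(956338) is not present in table "galaxy_session".
FK_RE = re.compile(r'Key \((\w+)\)=\((\d+)\) is not present in table "(\w+)"')


def _unwrap(text):
    """Collapse all whitespace so that mail-wrapped lines still match."""
    return " ".join(text.split())


def describe_block_error(message):
    """Return a dict describing a short-read error, or None if absent."""
    m = BLOCK_RE.search(_unwrap(message))
    if m is None:
        return None
    block, db_oid, filenode, got, wanted = (int(g) for g in m.groups())
    return {
        "database_oid": db_oid,
        "relfilenode": filenode,
        # Assumption: first segment only, so offset = block * block size.
        "byte_offset": block * wanted,
        "bytes_read": got,
        # Catalog lookups to put names to the numbers (run in psql).
        "find_database": "SELECT datname FROM pg_database WHERE oid = %d;" % db_oid,
        "find_relation": "SELECT relname, relkind FROM pg_class "
                         "WHERE relfilenode = %d;" % filenode,
    }


def missing_parent_keys(log_text):
    """Map each referenced table to the sorted distinct keys it is missing."""
    found = {}
    for _column, key, table in FK_RE.findall(_unwrap(log_text)):
        found.setdefault(table, set()).add(int(key))
    return {table: sorted(keys) for table, keys in found.items()}
```

Feeding the "could not read block" line to describe_block_error gives the
database OID, the relfilenode and the byte offset of the unreadable block.
The relkind column in the second query shows whether the damaged file
belongs to a table or to an index. missing_parent_keys, run over the output
of the failed restore, lists which rows of galaxy_session and workflow_step
are referenced but absent from the dump.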

After this update, the problem of missing user email addresses on the add
quota page also seems to be solved. We need to test a bit more to confirm
that everything works.

Cheers,
Leon



On Tue, Jun 18, 2013 at 3:24 AM, Dannon Baker <dannon.ba...@gmail.com> wrote:

> The line 'ERROR:  could not read block 326 in file "base/257536/259291":
> read only 0 of 8192 bytes' makes me wonder if your database may have
> been corrupt prior to the upgrade/dump.  Some brief googling suggests that
> this could be caused by hardware failure, among other things.  Can you check
> the postgres logs for any likely explanations?
>
> How exactly did you perform your database backup and the attempted restore?
>
> -Dannon
>
>
> On Mon, Jun 17, 2013 at 5:12 PM, Leon Mei <hailiang....@nbic.nl> wrote:
>
>> Dear list,
>>
>> Yesterday, I noticed that the user quota setting on our Galaxy had a
>> problem: the list of user email addresses is missing from the manage quota
>> page. So I decided to try a Galaxy codebase update after backing up
>> the Galaxy dir and the database.
>>
>> Running hg head gives me:
>> ==
>> postgres@ubuntu:/mnt/galaxyTools/galaxy-2013-05-13$ hg head
>> changeset:   10003:b4a373d86c51
>> tag:         tip
>> parent:      10001:471484ff8be6
>> user:        greg
>> date:        Wed Jun 12 11:48:09 2013 -0400
>> summary:     Add targets to Repository Actions menu items.
>> ==
>>
>> I updated the DB schema to support the code update. But the restart of
>> Galaxy failed with an error in paster.log:
>> ==
>> Traceback (most recent call last):
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/webapps/galaxy/buildapp.py",
>> line 35, in app_factory
>>     app = UniverseApplication( global_conf = global_conf, **kwargs )
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/app.py", line 164,
>> in __init__
>>     self.job_manager = manager.JobManager( self )
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/manager.py",
>> line 36, in __init__
>>     self.job_handler.start()
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/handler.py",
>> line 34, in start
>>     self.job_queue.start()
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/handler.py",
>> line 77, in start
>>     self.__check_jobs_at_startup()
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/handler.py",
>> line 125, in __check_jobs_at_startup
>>     self.dispatcher.recover( job, job_wrapper )
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/handler.py",
>> line 620, in recover
>>     self.job_runners[runner_name].recover( job, job_wrapper )
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/runners/local.py", line
>> 128, in recover
>>     job_wrapper.change_state( model.Job.states.ERROR, info = "This job
>> was killed when Galaxy was restarted.  Please retry the job." )
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/jobs/__init__.py",
>> line 824, in change_state
>>     dataset.state = state
>>   File "/mnt/galaxyTools/galaxy-2013-05-13/lib/galaxy/model/__init__.py",
>> line 1163, in set_dataset_state
>>     object_session( self ).flush() #flush here, because hda.flush() won't
>> flush the Dataset object
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py",
>> line 1718, in flush
>>     self._flush(objects)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/session.py",
>> line 1789, in _flush
>>     flush_context.execute()
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/unitofwork.py",
>> line 331, in execute
>>     rec.execute(self)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/unitofwork.py",
>> line 475, in execute
>>     uow
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/persistence.py",
>> line 59, in save_obj
>>     mapper, table, update)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/orm/persistence.py",
>> line 485, in _emit_update_statements
>>     execute(statement, params)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/engine/base.py",
>> line 1449, in execute
>>     params)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/engine/base.py",
>> line 1584, in _execute_clauseelement
>>     compiled_sql, distilled_params
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/engine/base.py",
>> line 1698, in _execute_context
>>     context)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/engine/base.py",
>> line 1691, in _execute_context
>>     context)
>>   File
>> "/mnt/galaxyTools/galaxy-2013-05-13/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs4.egg/sqlalchemy/engine/default.py",
>> line 331, in do_execute
>>     cursor.execute(statement, parameters)
>> InternalError: (InternalError) could not read block 326 in file
>> "base/257536/259291": read only 0 of 8192 bytes
>>  'UPDATE dataset SET update_time=%(update_time)s, state=%(state)s WHERE
>> dataset.id = %(dataset_id)s' {'update_time': datetime.datetime(2013, 6,
>> 17, 12, 42, 38, 145908), 'state': 'error', 'dataset_id': 68792}
>> ==
>>
>> I went on to check the Galaxy database (in postgres) which confirmed the
>> error in the paster.log.
>> ==
>> galaxy=# select * from dataset where id = 68792;
>>   id   |        create_time         |        update_time         | state | deleted | purged | purgable | external_filename | _extra_files_path | file_size | total_size | object_store_id | uuid
>> -------+----------------------------+----------------------------+-------+---------+--------+----------+-------------------+-------------------+-----------+------------+-----------------+------
>>  68792 | 2013-06-12 06:55:17.608725 | 2013-06-12 06:55:19.396557 | error | f       | f      | t        |                   |                   |           |            |                 |
>> (1 row)
>>
>> galaxy=# UPDATE dataset SET update_time='2013-06-17 12:42:38.145908',
>> state='error' WHERE dataset.id = 68792;
>> ERROR:  could not read block 326 in file "base/257536/259291": read only
>> 0 of 8192 bytes
>> galaxy=# UPDATE dataset SET state='error' WHERE dataset.id = 68792;
>> UPDATE 1
>> ==
>>
>> When I tried to restore the database from the dumped copy I made
>> yesterday, I got the following error:
>> ==
>> ERROR:  insert or update on table "galaxy_session_to_history" violates
>> foreign key constraint "galaxy_session_to_history_session_id_fkey"
>> DETAIL:  Key (session_id)=(956338) is not present in table
>> "galaxy_session".
>> ERROR:  insert or update on table "job" violates foreign key constraint
>> "job_session_id_fkey"
>> DETAIL:  Key (session_id)=(852060) is not present in table
>> "galaxy_session".
>> ERROR:  insert or update on table "workflow_step_connection" violates
>> foreign key constraint "workflow_step_connection_input_step_id_fkey"
>> DETAIL:  Key (input_step_id)=(18422) is not present in table
>> "workflow_step".
>> ERROR:  insert or update on table "workflow_step_connection" violates
>> foreign key constraint "workflow_step_connection_output_step_id_fkey"
>> DETAIL:  Key (output_step_id)=(18422) is not present in table
>> "workflow_step".
>> ==
>>
>> It seems our postgres DB got corrupted? Any suggestions for a fix?
>>
>> Thanks a lot!
>>
>> Leon
>>
>> --
>> Hailiang (Leon) Mei
>> Netherlands Bioinformatics Center
>> BioAssist NGS Taskforce
>>  - 
>> http://ngs.nbic.nl<https://wiki.nbic.nl/index.php/Next_Generation_Sequencing>
>> Skype: leon_mei    Mobile: +31 6 41709231
>>
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client.  To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>>   http://lists.bx.psu.edu/
>>
>> To search Galaxy mailing lists use the unified search at:
>>   http://galaxyproject.org/search/mailinglists/
>>
>
>


-- 
Hailiang (Leon) Mei
Netherlands Bioinformatics Center
BioAssist NGS Taskforce
 - http://ngs.nbic.nl<https://wiki.nbic.nl/index.php/Next_Generation_Sequencing>
Skype: leon_mei    Mobile: +31 6 41709231

