I noticed a similar thread where someone ran fsck and recovered.  I tried
fsck with no luck.  I also ran db_verify on all of the .db files, and it
didn't report any problems.  Below is the debug output of the server:

[D 06/29 15:29] Passing tcp://oss004-4:3337 as BMI listen address.
[D 06/29 15:29] BMI_tcp_initialize: Initializing TCP/IP module.
[D 06/29 15:29] BMI_tcp_initialize: TCP/IP module successfully initialized.
[D 06/29 15:29] Server using shm key hint: 373672738
[D 06/29 15:29] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 11
[D 06/29 15:29] Default socket buffers send:16384 receive:87380
[D 06/29 15:29] Setting socket buffer size for send:0 receive:0
[D 06/29 15:29] Reread socket buffers send:16384 receive:87380
[D 06/29 15:29] [BMI CONTROL]: BMI_set_info: set_info: 0 option: 12
[D 06/29 15:29] Default socket buffers send:16384 receive:87380
[D 06/29 15:29] Setting socket buffer size for send:0 receive:0
[D 06/29 15:29] Reread socket buffers send:16384 receive:87380
[D 06/29 15:29] dbpf_thread_initialize: initialized
[D 06/29 15:29] [SYNC_COALESCE]: dbpf_sync_context_init for context 0 called
[D 06/29 15:29] dbpf_collection_lookup of coll: pvfs2-fs
[D 06/29 15:29] dbpf using default db cache size.
[D 06/29 15:29] dbpf using shm key: 1020239961
[D 06/29 15:29] collection lookup: version is 0.1.4
[D 06/29 15:29] [SYNC_COALESCE]: dbpf_sync_context_init for context 1 called
[D 06/29 15:29] dbpf collection 373672578 - Setting handle timeout to
360000000 microseconds
[D 06/29 15:29] - set handle re-use timeout to 360 seconds (ret=0)
[D 06/29 15:29] dbpf collection 373672578 - Setting cache keywords of
attribute cache to dh,
[D 06/29 15:29] Setting dbpf_attr_cache keywords to:
dh,
[D 06/29 15:29] dbpf collection 373672578 - Setting cache size of attribute
cache to 511
[D 06/29 15:29] dbpf collection 373672578 - Setting maximum elements of
attribute cache to 1024
[D 06/29 15:29] dbpf collection 373672578 - Initialize collection attr.
cache
[D 06/29 15:29] There are 1 cacheable keywords registered
[D 06/29 15:29] dbpf_attr_cache_initialize: initialized
[D 06/29 15:29] dbpf collection 373672578 - Setting collection handle ranges to
4323455642275676148-4611686018427387890,8935141660703064036-9223372036854775778
[D 06/29 15:29] op_queue add: 0x9f96380
[D 06/29 15:29] dbpf_thread_function started
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] handle_new_connection: Assigning socket 11 to new method
addr.
[D 06/29 15:29] tcp_do_work_recv: Reading header for new op.
[D 06/29 15:29] tcp_do_work_recv: Received new message; mode: 2.
[D 06/29 15:29] tcp_do_work_recv: tag: 5865658
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9f96380
[D 06/29 15:29] handle_new_connection: Assigning socket 12 to new method
addr.
[D 06/29 15:29] op_queue add: 0x9f9da50
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9f9da50
[D 06/29 15:29] op_queue add: 0x9fa63d0
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fa63d0
[D 06/29 15:29] op_queue add: 0x9fad360
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fad360
[D 06/29 15:29] op_queue add: 0x9fb0bf0
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fb0bf0
[D 06/29 15:29] op_queue add: 0x9fb2f90
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fb2f90
[D 06/29 15:29] op_queue add: 0x9fb5ab0
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fb5ab0
[D 06/29 15:29] op_queue add: 0x9fc7a30
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fc7a30
[D 06/29 15:29] op_queue add: 0x9fca500
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fca500
[D 06/29 15:29] op_queue add: 0x9fca690
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fca690
[D 06/29 15:29] op_queue add: 0x9fe1980
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: 1)
[D 06/29 15:29] op_queue add: 0x9fe1980
[D 06/29 15:29] op_queue add: 0x9fe2330
[D 06/29 15:29] [DBPF THREAD]: STARTING TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES)
[E 06/29 15:29] dbpf_dspace_iterate_handles_op_svc: Invalid argument
[D 06/29 15:29] [DBPF THREAD]: FINISHED TROVE SERVICE ROUTINE
(DSPACE_ITERATE_HANDLES) (ret: -1073742095)
[D 06/29 15:29] op_queue add: 0x9fe2330
[D 06/29 15:29] trove_dspace_iterate_handles failed
[E 06/29 15:29] Error adding handle range
4323455642275676148-4611686018427387890,8935141660703064036-9223372036854775778
to filesystem pvfs2-fs
[E 06/29 15:29] Error: Could not initialize server interfaces; aborting.
[E 06/29 15:29] Error: Could not initialize server; aborting.
[D 06/29 15:29] *** server shutdown in progress ***


-Randy


From: Randall Martin <[email protected]>
Date: Mon, 29 Jun 2009 14:05:33 -0400
To: <[email protected]>
Subject: [Pvfs2-users] PVFS server won't start

One of our PVFS servers crashed and now it won't start back up.  It had
been working since June 2 until today's crash.  Any ideas on how to fix
it?  I was running the released 2.8.1 version, but I also tried the HEAD
version with no change in symptoms.

From the server log:

[D 06/29 13:49] PVFS2 Server version 2.8.1pre1-2009-06-26-182521 starting.
[E 06/29 13:49] dbpf_dspace_iterate_handles_op_svc: Invalid argument
[E 06/29 13:49] Error adding handle range
4323455642275676148-4611686018427387890,8935141660703064036-9223372036854775778
to filesystem pvfs2-fs
[E 06/29 13:49] Error: Could not initialize server interfaces; aborting.
[E 06/29 13:49] Error: Could not initialize server; aborting.

My config file:


<Defaults>
    UnexpectedRequests 50
    EventLogging none
    EnableTracing no
    LogStamp datetime
    BMIModules bmi_tcp
    FlowModules flowproto_multiqueue
    PerfUpdateInterval 1000
    ServerJobBMITimeoutSecs 30
    ServerJobFlowTimeoutSecs 30
    ClientJobBMITimeoutSecs 300
    ClientJobFlowTimeoutSecs 300
    ClientRetryLimit 60
    ClientRetryDelayMilliSecs 10000
    PrecreateBatchSize 512
    PrecreateLowThreshold 256
</Defaults>

<Aliases>
    Alias oss001-1 tcp://oss001-1:3334
    Alias oss001-2 tcp://oss001-2:3335
    Alias oss001-3 tcp://oss001-3:3336
    Alias oss001-4 tcp://oss001-4:3337

    Alias oss002-1 tcp://oss002-1:3334
    Alias oss002-2 tcp://oss002-2:3335
    Alias oss002-3 tcp://oss002-3:3336
    Alias oss002-4 tcp://oss002-4:3337

    Alias oss003-1 tcp://oss003-1:3334
    Alias oss003-2 tcp://oss003-2:3335
    Alias oss003-3 tcp://oss003-3:3336
    Alias oss003-4 tcp://oss003-4:3337

    Alias oss004-1 tcp://oss004-1:3334
    Alias oss004-2 tcp://oss004-2:3335
    Alias oss004-3 tcp://oss004-3:3336
    Alias oss004-4 tcp://oss004-4:3337
</Aliases>


<ServerOptions>
    Server oss001-1
    StorageSpace /ost1
    LogFile /var/log/pvfs2-server.oss001-1.log
</ServerOptions>
<ServerOptions>
    Server oss001-2
    StorageSpace /ost2
    LogFile /var/log/pvfs2-server.oss001-2.log
</ServerOptions>
<ServerOptions>
    Server oss001-3
    StorageSpace /ost3
    LogFile /var/log/pvfs2-server.oss001-3.log
</ServerOptions>
<ServerOptions>
    Server oss001-4
    StorageSpace /ost4
    LogFile /var/log/pvfs2-server.oss001-4.log
</ServerOptions>


<ServerOptions>
    Server oss002-1
    StorageSpace /ost5
    LogFile /var/log/pvfs2-server.oss002-1.log
</ServerOptions>
<ServerOptions>
    Server oss002-2
    StorageSpace /ost6
    LogFile /var/log/pvfs2-server.oss002-2.log
</ServerOptions>
<ServerOptions>
    Server oss002-3
    StorageSpace /ost7
    LogFile /var/log/pvfs2-server.oss002-3.log
</ServerOptions>
<ServerOptions>
    Server oss002-4
    StorageSpace /ost8
    LogFile /var/log/pvfs2-server.oss002-4.log
</ServerOptions>


<ServerOptions>
    Server oss003-1
    StorageSpace /ost9
    LogFile /var/log/pvfs2-server.oss003-1.log
</ServerOptions>
<ServerOptions>
    Server oss003-2
    StorageSpace /ost10
    LogFile /var/log/pvfs2-server.oss003-2.log
</ServerOptions>
<ServerOptions>
    Server oss003-3
    StorageSpace /ost11
    LogFile /var/log/pvfs2-server.oss003-3.log
</ServerOptions>
<ServerOptions>
    Server oss003-4
    StorageSpace /ost12
    LogFile /var/log/pvfs2-server.oss003-4.log
</ServerOptions>


<ServerOptions>
    Server oss004-1
    StorageSpace /ost13
    LogFile /var/log/pvfs2-server.oss004-1.log
</ServerOptions>
<ServerOptions>
    Server oss004-2
    StorageSpace /ost14
    LogFile /var/log/pvfs2-server.oss004-2.log
</ServerOptions>
<ServerOptions>
    Server oss004-3
    StorageSpace /ost15
    LogFile /var/log/pvfs2-server.oss004-3.log
</ServerOptions>
<ServerOptions>
    Server oss004-4
    StorageSpace /ost16
    LogFile /var/log/pvfs2-server.oss004-4.log
</ServerOptions>

<Filesystem>
    Name pvfs2-fs
    ID 373672578
    RootHandle 1048576
    FileStuffing yes
    <MetaHandleRanges>
        Range oss001-1 3-288230376151711745
        Range oss001-2 288230376151711746-576460752303423488
        Range oss001-3 576460752303423489-864691128455135231
        Range oss001-4 864691128455135232-1152921504606846974
        Range oss002-1 1152921504606846975-1441151880758558717
        Range oss002-2 1441151880758558718-1729382256910270460
        Range oss002-3 1729382256910270461-2017612633061982203
        Range oss002-4 2017612633061982204-2305843009213693946
        Range oss003-1 2305843009213693947-2594073385365405689
        Range oss003-2 2594073385365405690-2882303761517117432
        Range oss003-3 2882303761517117433-3170534137668829175
        Range oss003-4 3170534137668829176-3458764513820540918
        Range oss004-1 3458764513820540919-3746994889972252661
        Range oss004-2 3746994889972252662-4035225266123964404
        Range oss004-3 4035225266123964405-4323455642275676147
        Range oss004-4 4323455642275676148-4611686018427387890
    </MetaHandleRanges>
    <DataHandleRanges>
        Range oss001-1 4611686018427387891-4899916394579099633
        Range oss001-2 4899916394579099634-5188146770730811376
        Range oss001-3 5188146770730811377-5476377146882523119
        Range oss001-4 5476377146882523120-5764607523034234862
        Range oss002-1 5764607523034234863-6052837899185946605
        Range oss002-2 6052837899185946606-6341068275337658348
        Range oss002-3 6341068275337658349-6629298651489370091
        Range oss002-4 6629298651489370092-6917529027641081834
        Range oss003-1 6917529027641081835-7205759403792793577
        Range oss003-2 7205759403792793578-7493989779944505320
        Range oss003-3 7493989779944505321-7782220156096217063
        Range oss003-4 7782220156096217064-8070450532247928806
        Range oss004-1 8070450532247928807-8358680908399640549
        Range oss004-2 8358680908399640550-8646911284551352292
        Range oss004-3 8646911284551352293-8935141660703064035
        Range oss004-4 8935141660703064036-9223372036854775778
    </DataHandleRanges>
    <StorageHints>
        TroveSyncMeta no
        TroveSyncData no
        TroveMethod alt-aio
    </StorageHints>
    <Distribution>
        Name simple_stripe
        Param strip_size
        Value 1048576
    </Distribution>
</Filesystem>
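
Since the error mentions the handle ranges, one thing that can be ruled
out independently of the server is a typo in the Range lines themselves.
The following is only a minimal sketch in Python, not a PVFS2 tool: the
helper names parse_ranges and find_problems are made up for illustration,
and it assumes every range line has the form "Range <alias> <low>-<high>"
as in the config above.  It checks for inverted ranges, overlaps, and
gaps; it says nothing about the state of the on-disk databases.

```python
import re

# Hypothetical helper, not part of PVFS2: matches "Range <alias> <low>-<high>".
RANGE_RE = re.compile(r"^\s*Range\s+(\S+)\s+(\d+)-(\d+)\s*$")

def parse_ranges(text):
    """Return a list of (alias, low, high) tuples found in config text."""
    ranges = []
    for line in text.splitlines():
        m = RANGE_RE.match(line)
        if m:
            ranges.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return ranges

def find_problems(ranges):
    """Report inverted ranges, overlaps, and gaps between sorted neighbours."""
    problems = []
    ordered = sorted(ranges, key=lambda r: r[1])
    for alias, low, high in ordered:
        if low > high:
            problems.append(f"{alias}: inverted range {low}-{high}")
    # Compare each range with the next one in ascending order of low bound.
    for (a, _, a_high), (b, b_low, _) in zip(ordered, ordered[1:]):
        if b_low <= a_high:
            problems.append(f"{a} overlaps {b}")
        elif b_low != a_high + 1:
            problems.append(f"gap between {a} and {b}")
    return problems

# Three lines copied from the config above; paste the full file contents
# here (or read them from disk) to check every range.
sample = """
    Range oss004-3 4035225266123964405-4323455642275676147
    Range oss004-4 4323455642275676148-4611686018427387890
    Range oss001-1 4611686018427387891-4899916394579099633
"""
print(find_problems(parse_ranges(sample)))
```

An empty list means the ranges that were checked are well-formed and
contiguous, which would point away from the config file and toward the
storage space itself.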


Thanks,
Randy


_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
