Hmm... my db-recovery docs say that this example only works for the
keyval.db file.  The dataspace_attributes.db
file requires a different modification (not provided here).  The file
I'm having trouble with is dataspace_attributes.db.

--Jim

On Wed, Apr 4, 2012 at 11:04 AM, Phil Carns <[email protected]> wrote:
> Another option to consider is the technique described in
> pvfs2/doc/db-recovery.txt.  It describes how to dump and reload two types of
> db files.  The latter is the one you want in this case
> (dataspace_attributes.db).  Please make a backup copy of the original .db
> file if you try this.
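>
> For what it's worth, the generic Berkeley DB dump-and-reload looks
> roughly like the commands below.  This is a sketch from memory rather
> than the exact procedure from db-recovery.txt, so please follow the doc
> itself; the file names here are just examples.
>
>   # work on a copy, never the original
>   cp dataspace_attributes.db dataspace_attributes.db.bak
>   # salvage whatever key/data pairs are still readable (-r = salvage mode)
>   db_dump -r dataspace_attributes.db > dataspace_attributes.dump
>   # rebuild a fresh, compact db file from the salvaged entries
>   db_load -f dataspace_attributes.dump dataspace_attributes.new.db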
>
> One thing to look out for that isn't mentioned in the doc is that the
> rebuilt dataspace_attributes.db will probably be _much_ smaller than the
> original.  This doesn't mean that it lost data; it's just that Berkeley DB
> will pack it much more efficiently when all of the entries are rebuilt at
> once.
>
> -Phil
>
>
> On 04/02/2012 01:09 PM, Jim Kusznir wrote:
>>
>> Thanks Boyd:
>>
>> We have 3 I/O servers, each also running a metadata server.  One will
>> not come up (that's the 3rd server).  I did try running the db check
>> command (I forget the specifics), and it did return a single chunk of
>> entries that are not readable.  As you may guess from the above, I've
>> never interacted with BDB at a direct or low level.  I don't have a
>> good answer for #3; I noticed about 1/3 of the directory entries were
>> "red" on the terminal, and several individuals contacted me with pvfs
>> problems.
>>
>> I will begin building a newer version of BDB.  Do I need to install it
>> just on the servers, or do the clients need it as well?
>>
>> --Jim
>>
>> On Sun, Apr 1, 2012 at 4:03 PM, Boyd Wilson <[email protected]> wrote:
>>>
>>> Jim,
>>> We have been discussing your issue internally.  A few questions:
>>> 1. How many metadata servers do you have?
>>> 2. Do you know which one is affected (if there is more than one)?
>>> 3. How much of the file system can you currently see?
>>>
>>> The issue you mentioned seems to be the one we have seen with earlier
>>> versions of Berkeley DB; as Becky mentioned, we have not seen it with
>>> the newer versions.  In our discussions we couldn't recall whether we
>>> ever tried low-level BDB access to the metadata to back up the
>>> unaffected entries so they could be restored into a new BDB.  If you
>>> are comfortable with lower-level BDB commands, you may want to see if
>>> you can read the entries both before and after the corruption.  If you
>>> can do both, you may be able to write a small program that reads all
>>> the valid entries out into a file or another BDB, and then rebuild the
>>> BDB from the valid entries.
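>>>
>>> Something along these lines, perhaps.  This is only a minimal, untested
>>> sketch against the Berkeley DB 4.x C API; the file names are
>>> placeholders.  It only walks forward with c_get, so a second pass using
>>> DB_PREV from the end of the file may be needed to pick up entries after
>>> the corrupt region.
>>>
>>> /* salvage.c: copy every readable entry from a damaged BDB file into
>>>  * a fresh one, skipping entries the cursor cannot read.
>>>  * Build with something like: cc salvage.c -o salvage -ldb
>>>  */
>>> #include <db.h>
>>> #include <stdio.h>
>>> #include <string.h>
>>>
>>> int main(void)
>>> {
>>>     DB *src, *dst;
>>>     DBC *cur;
>>>     DBT key, data;
>>>     int ret, copied = 0, skipped = 0, stuck = 0;
>>>
>>>     /* open the damaged database read-only so nothing is modified */
>>>     db_create(&src, NULL, 0);
>>>     ret = src->open(src, NULL, "dataspace_attributes.db", NULL,
>>>                     DB_UNKNOWN, DB_RDONLY, 0);
>>>     if (ret) { fprintf(stderr, "src: %s\n", db_strerror(ret)); return 1; }
>>>
>>>     /* create a fresh database to receive the valid entries */
>>>     db_create(&dst, NULL, 0);
>>>     ret = dst->open(dst, NULL, "dataspace_attributes.new.db", NULL,
>>>                     DB_BTREE, DB_CREATE, 0600);
>>>     if (ret) { fprintf(stderr, "dst: %s\n", db_strerror(ret)); return 1; }
>>>
>>>     src->cursor(src, NULL, &cur, 0);
>>>     memset(&key, 0, sizeof(key));
>>>     memset(&data, 0, sizeof(data));
>>>
>>>     /* walk the damaged db; on a read error, report and move on */
>>>     while ((ret = cur->c_get(cur, &key, &data, DB_NEXT)) != DB_NOTFOUND) {
>>>         if (ret) {
>>>             fprintf(stderr, "skipping entry: %s\n", db_strerror(ret));
>>>             skipped++;
>>>             if (++stuck > 100)  /* cursor is not advancing; give up */
>>>                 break;
>>>             continue;
>>>         }
>>>         stuck = 0;
>>>         dst->put(dst, NULL, &key, &data, 0);
>>>         copied++;
>>>     }
>>>
>>>     printf("copied %d entries, skipped %d\n", copied, skipped);
>>>     cur->c_close(cur);
>>>     src->close(src, 0);
>>>     dst->close(dst, 0);
>>>     return 0;
>>> }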
>>>
>>> thx
>>> -boyd
>>>
>>> On Sat, Mar 31, 2012 at 6:07 PM, Becky Ligon <[email protected]> wrote:
>>>>
>>>> Jim:
>>>>
>>>> I understand your situation.  Here at Clemson University, we went
>>>> through the same situation a couple of years ago.  Now, we back up the
>>>> metadata databases.  We don't have the space to back up our data either!
>>>>
>>>> Under no circumstances should you run pvfs2-fsck in destructive mode.
>>>> If you do, then we won't be able to help at all.  If you're willing,
>>>> Omnibond MAY be able to write some utilities that will help you recover
>>>> most of the data.  You will have to speak to Boyd Wilson
>>>> ([email protected]) and work out something.
>>>>
>>>> Becky Ligon
>>>>
>>>>
>>>> On Fri, Mar 30, 2012 at 5:55 PM, Jim Kusznir <[email protected]> wrote:
>>>>>
>>>>> I made no changes to my environment; it was up and running just fine.
>>>>> I ran db_recover, and it immediately returned, with no apparent sign
>>>>> of doing anything other than creating a log.000000001 file.
>>>>>
>>>>> I have the CentOS DB package installed: db4-4.3.29-10.el5.
>>>>>
>>>>> I have no backups; this is my 99TB high-performance filesystem.  It is
>>>>> the largest storage we have, so we have no means of backing it up.  We
>>>>> don't have anything big enough to hold that much data.
>>>>>
>>>>> Is there any hope?  Can we just identify and delete the files that
>>>>> have the db damage on them?  (Note that I don't even have anywhere to
>>>>> back up this data to temporarily if we do get it running, so I'd need
>>>>> to "fix in place".)
>>>>>
>>>>> thanks!
>>>>> --Jim
>>>>>
>>>>> On Fri, Mar 30, 2012 at 2:44 PM, Becky Ligon <[email protected]> wrote:
>>>>>>
>>>>>> Jim:
>>>>>>
>>>>>> If you haven't made any recent changes to your pvfs environment or
>>>>>> Berkeley DB installation, then it looks like you have a corrupted
>>>>>> metadata database.  There is no easy way to recover.  Sometimes the
>>>>>> Berkeley DB command "db_recover" might work, but PVFS doesn't have
>>>>>> transactions turned on, so normally it doesn't.  It's worth a try,
>>>>>> just to be sure.
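>>>>>>
>>>>>> For reference, the invocation is roughly the line below.  Run it with
>>>>>> the server stopped, pointing -h at the directory that holds the db
>>>>>> files (/mnt/pvfs2 here is just the StorageSpace from your config):
>>>>>>
>>>>>>   db_recover -v -h /mnt/pvfs2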
>>>>>>
>>>>>> Do you have any recent backups of the databases?  If so, then you will
>>>>>> need to use a set of backups that were created around the same time,
>>>>>> so the databases will be somewhat consistent with each other.
>>>>>>
>>>>>> Which version of Berkeley DB are you using?  We have had corruption
>>>>>> issues with older versions of it.  We strongly recommend 4.8 or
>>>>>> higher; there are some known problems with threads in the older
>>>>>> versions.
>>>>>>
>>>>>> Becky Ligon
>>>>>>
>>>>>> On Fri, Mar 30, 2012 at 3:28 PM, Jim Kusznir <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi all:
>>>>>>>
>>>>>>> I got some notices from my users about "weirdness with pvfs2" this
>>>>>>> morning and went and investigated.  Eventually, I found the following
>>>>>>> on one of my 3 servers:
>>>>>>>
>>>>>>> [S 03/30 12:22] PVFS2 Server on node pvfs2-io-0-2 version 2.8.2
>>>>>>> starting...
>>>>>>> [E 03/30 12:23] Warning: got invalid handle or key size in
>>>>>>> dbpf_dspace_iterate_handles().
>>>>>>> [E 03/30 12:23] Warning: skipping entry.
>>>>>>> [E 03/30 12:23] c_get failed on iteration 3044
>>>>>>> [E 03/30 12:23] dbpf_dspace_iterate_handles_op_svc: Invalid argument
>>>>>>> [E 03/30 12:23] Error adding handle range
>>>>>>> 1431655768-2147483649,3579139414-4294967295 to filesystem pvfs2-fs
>>>>>>> [E 03/30 12:23] Error: Could not initialize server interfaces;
>>>>>>> aborting.
>>>>>>> [E 03/30 12:23] Error: Could not initialize server; aborting.
>>>>>>>
>>>>>>> ------------
>>>>>>> pvfs2-fs.conf:
>>>>>>> -----------
>>>>>>>
>>>>>>> <Defaults>
>>>>>>>        UnexpectedRequests 50
>>>>>>>        EventLogging none
>>>>>>>        LogStamp datetime
>>>>>>>        BMIModules bmi_tcp
>>>>>>>        FlowModules flowproto_multiqueue
>>>>>>>        PerfUpdateInterval 1000
>>>>>>>        ServerJobBMITimeoutSecs 30
>>>>>>>        ServerJobFlowTimeoutSecs 30
>>>>>>>        ClientJobBMITimeoutSecs 300
>>>>>>>        ClientJobFlowTimeoutSecs 300
>>>>>>>        ClientRetryLimit 5
>>>>>>>        ClientRetryDelayMilliSecs 2000
>>>>>>>        StorageSpace /mnt/pvfs2
>>>>>>>        LogFile /var/log/pvfs2-server.log
>>>>>>> </Defaults>
>>>>>>>
>>>>>>> <Aliases>
>>>>>>>        Alias pvfs2-io-0-0 tcp://pvfs2-io-0-0:3334
>>>>>>>        Alias pvfs2-io-0-1 tcp://pvfs2-io-0-1:3334
>>>>>>>        Alias pvfs2-io-0-2 tcp://pvfs2-io-0-2:3334
>>>>>>> </Aliases>
>>>>>>>
>>>>>>> <Filesystem>
>>>>>>>        Name pvfs2-fs
>>>>>>>        ID 62659950
>>>>>>>        RootHandle 1048576
>>>>>>>        <MetaHandleRanges>
>>>>>>>                Range pvfs2-io-0-0 4-715827885
>>>>>>>                Range pvfs2-io-0-1 715827886-1431655767
>>>>>>>                Range pvfs2-io-0-2 1431655768-2147483649
>>>>>>>        </MetaHandleRanges>
>>>>>>>        <DataHandleRanges>
>>>>>>>                Range pvfs2-io-0-0 2147483650-2863311531
>>>>>>>                Range pvfs2-io-0-1 2863311532-3579139413
>>>>>>>                Range pvfs2-io-0-2 3579139414-4294967295
>>>>>>>        </DataHandleRanges>
>>>>>>>        <StorageHints>
>>>>>>>                TroveSyncMeta yes
>>>>>>>                TroveSyncData no
>>>>>>>        </StorageHints>
>>>>>>> </Filesystem>
>>>>>>> -------------
>>>>>>> Any suggestions for recovery?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> --Jim
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Becky Ligon
>>>>>> OrangeFS Support and Development
>>>>>> Omnibond Systems
>>>>>> Anderson, South Carolina
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Becky Ligon
>>>> OrangeFS Support and Development
>>>> Omnibond Systems
>>>> Anderson, South Carolina
>>>>
>>>>
>>>>
>
>

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
