If the assetstore was on the filesystem that you recovered, and you can still 
connect to the database, you may be able to find the content.  The files 
haven't been put into any special encoding, just renamed. 

Make sure that you turn all of your cron jobs off so that the cleanup job that 
Sands mentioned doesn't run.

Then you can run queries on the bitstream table.

select name, internal_id from bitstream;  

That should give a list of the original filenames, with their new names in the 
assetstore.

This might narrow it down more:
select name, internal_id from bitstream where deleted='t'; 
but I'm not near a database at the moment to test.
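
If you want to feed the query results to a script, one option is to save them 
as plain text first. The sketch below is in Python and assumes you dumped the 
rows with psql in unaligned, tuples-only mode using '|' as the field separator 
(psql -A -t -F '|' -c "select name, internal_id from bitstream", plus whatever 
connection options your installation needs). The function name parse_rows is 
made up for illustration. It splits on the last '|' so that a '|' inside an 
original filename does not break the parse; names containing newlines would 
still need special handling.

```python
def parse_rows(psql_output: str):
    """Turn 'name|internal_id' lines into a list of (name, internal_id) pairs.

    Assumes psql was run with -A -t -F '|' (an assumption, not part of DSpace).
    """
    rows = []
    for line in psql_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the LAST separator: the internal_id never contains '|',
        # but an original filename might.
        name, sep, internal_id = line.rpartition("|")
        if not sep:
            continue  # no separator on this line, so it is not a data row
        rows.append((name, internal_id.strip()))
    return rows
```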

When you have the internal_id for whatever file you're looking for, that will 
also give you the location in the assetstore.

For example, if the internal_id is 23495820534557474793935, the file 
23495820534557474793935 will be in the assetstore in subdirectory 23/49/58. 
The directory names come from the first six digits of the new filename, 
taken two at a time: (23)(49)(58)20534557474793935.
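
The mapping from internal_id to location can be written down in a few lines. 
This is a minimal Python sketch of the layout described above, not DSpace's 
own code; the function name assetstore_subpath is hypothetical, and the path 
it returns is relative to the assetstore directory.

```python
from pathlib import PurePosixPath


def assetstore_subpath(internal_id: str) -> PurePosixPath:
    """Return the path of a bitstream file relative to the assetstore root.

    The first six digits, taken two at a time, name three nested
    directories; the file itself keeps the full internal_id as its name.
    """
    if len(internal_id) < 6:
        raise ValueError("internal_id is too short to derive a directory")
    return PurePosixPath(
        internal_id[0:2], internal_id[2:4], internal_id[4:6], internal_id
    )


print(assetstore_subpath("23495820534557474793935"))
# prints 23/49/58/23495820534557474793935
```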

Just copy your file to an appropriate location and rename it.

If you have someone who likes perl or other scripting languages, you can 
automate this.  

I think this will work if you have one assetstore and all of the original 
filenames were unique.
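
As a rough idea of what that automation could look like, here is a sketch in 
Python rather than perl. It assumes a single assetstore and a list of 
(name, internal_id) pairs taken from the bitstream query; the names asset_path 
and restore are made up for illustration. Where two bitstreams share an 
original filename, it prefixes the internal_id instead of overwriting, which 
is my own workaround for the uniqueness caveat, not anything DSpace does.

```python
import shutil
from pathlib import Path


def asset_path(assetstore: Path, internal_id: str) -> Path:
    """Locate a bitstream: three two-digit directories, then the full id."""
    return (assetstore / internal_id[0:2] / internal_id[2:4]
            / internal_id[4:6] / internal_id)


def restore(rows, assetstore: Path, outdir: Path):
    """Copy each bitstream into outdir under its original filename.

    rows is an iterable of (name, internal_id) pairs.
    Returns the internal_ids whose files were not found in the assetstore.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    missing = []
    for name, internal_id in rows:
        src = asset_path(assetstore, internal_id)
        if not src.is_file():
            missing.append(internal_id)
            continue
        # Keep only the last path component so a stored name cannot
        # point outside outdir.
        target = outdir / Path(name).name
        if target.exists():
            # Duplicate original filename: disambiguate, never overwrite.
            target = outdir / f"{internal_id}_{Path(name).name}"
        shutil.copy2(src, target)
    return missing
```

It copies rather than moves, so the recovered assetstore stays untouched, and 
it would be called along the lines of 
restore(rows, Path("/path/to/assetstore"), Path("/path/to/recovered")), with 
both paths being placeholders.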

--keith

----- Original Message -----
From: "Jeffrey W. Pearson" <[email protected]>
To: "Mark Diggory" <[email protected]>
Cc: "Peter Dietz" <[email protected]>, [email protected]
Sent: Friday, May 28, 2010 5:34:35 PM GMT -05:00 US/Canada Eastern
Subject: Re: [Dspace-devel] possible to restore data?

Thank you for the feedback, even though it is bad news. Just to clarify, 
though, and to make sure I understand:


We don't care about the metadata. The plan was to transfer the content 
to another cms. We were going to have to open each one and re-catalog 
anyway. The point of concern is getting those content files back.


Again, I have everything at the file system. So theoretically, I 
actually have the content files. They are just encoded into the 
'bitstream' files that dspace uses and we can't read them. There is 
nothing that will extract the original pdf files from these?



Once again, thank you all for the suggestions and information.


Jeff Pearson
USC Libraries




Mark Diggory wrote:
> Jeff,
>
> Unfortunately, there is significantly more than just the Community 
> table that is affected here.  You actually lose the Item, Metadata 
> and Bundle table entries that were present.  All that is left is the 
> Bitstream Table with the flag that the Bitstream has been deleted. You 
> would need to restore the database from a backup to recover that 
> database state for not only the Community and Collections objects, but 
> also the Item and Bundle Objects the Bitstreams were attached to.
>
> This is unfortunate news. The only other possibility for recovering 
> the data may be if a third party harvested your OAI gateway and has 
> indexed those dc records. Unfortunately, I think such data would also 
> be partial. 
>
> Sorry the news is not better.
> Mark
>
>
> On May 28, 2010, at 12:23 PM, Jeffrey W. Pearson wrote:
>
>> Thank you for the info so far: The problem with getting technical info
>> from a user :-(
>>
>>
>> OK. The user did not delete the individual items, he actually deleted
>> the COMMUNITY. I don't see the data for the record in the community
>> table. Are we SOL?
>>
>>
>>
>> Again, thank you all for the suggestions and the user GREATLY
>> appreciates the efforts to save his fanny...
>>
>>
>>
>> Jeff Pearson
>> USC Libraries
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Peter Dietz wrote:
>>> I would look in the database table "item". It has a field called
>>> "item"."in_archive". Typically when an item gets deleted it sets
>>> item.in_archive to FALSE
>>>
>>> So from the DB, you could look for all of the "deleted" items. So use
>>> your favorite SQL query tool (perhaps pgAdmin3 if your DB is postgres)
>>> SELECT * FROM item where item.in_archive = false;
>>>
>>>
>>> Or get all the metadata for the deleted items.
>>> SELECT
>>>  item.item_id,
>>>  item.in_archive,
>>>  metadatafieldregistry.element,
>>>  metadatafieldregistry.qualifier,
>>>  metadatavalue.text_value,
>>>  item.last_modified
>>> FROM
>>>  public.item,
>>>  public.metadatavalue,
>>>  public.metadatafieldregistry
>>> WHERE
>>>  item.item_id = metadatavalue.item_id AND
>>>  metadatavalue.metadata_field_id =
>>> metadatafieldregistry.metadata_field_id AND
>>>  item.in_archive = FALSE
>>> ORDER BY last_modified DESC;
>>>
>>> However, I think deletions to communities and collections make them go
>>> away; the items may remain behind as an artifact.
>>>
>>>
>>> Peter Dietz
>>> Systems Developer/Engineer
>>> Ohio State University Libraries
>>>
>>>
>>>
>>> On Fri, May 28, 2010 at 2:35 PM, Sands Alden Fish <[email protected]>
>>> wrote:
>>>
>>>    Others should provide some clarification on this, but typically
>>>    deletions are not "hard deletes" but soft, in that there is a
>>>    deletion flag that a cleanup process operates on occasionally.
>>>     Perhaps the database still contains the items, and you can revert
>>>    the deletion by modifying the correct tables?
>>>
>>>    Sorry that this isn't more detailed.  I've never had to dig around
>>>    in that logic before.  
>>>
>>>    --
>>>    sands fish
>>>    Software Engineer
>>>    MIT Libraries
>>>    Technology Research & Development
>>>    [email protected]
>>>    E25-131
>>>
>>>
>>>
>>>
>>>    On May 28, 2010, at 2:30 PM, Jeffrey W. Pearson wrote:
>>>
>>>>    Quick question:
>>>>
>>>>    We had a user delete things he should not have. We have been able to
>>>>    restore the filesystem stuff from backups. Unfortunately, the database
>>>>    backups have been lost. All we really need is the content files. Is
>>>>    there a way to rebuild the content files from just using what is on the
>>>>    file system?
>>>>
>>>>    Any help would be GREATLY appreciated, especially by the user who
>>>>    deleted the data....
>>>>
>>>>
>>>>
>>>>    Jeff Pearson
>>>>    USC Libraries
>>>>
>>>>    
>>>> ------------------------------------------------------------------------------
>>>>
>>>>    _______________________________________________
>>>>    Dspace-devel mailing list
>>>>    [email protected]
>>>>    https://lists.sourceforge.net/lists/listinfo/dspace-devel
>>>
>>>
>>>    
>>>
>>>
>>
>
> Mark R. Diggory
> Head of U.S. Operations - @mire
>
> http://www.atmire.com - Institutional Repository Solutions
> http://www.togather.eu - Before getting together, get t...@ther
>
