Not yet.  I am still working on it.  I would like to avoid using
the GUI to submit.  Instead, I would like to be able to recursively
go through a dir and its sub-dirs and automatically crawl.
Has anybody done this before?

Thanks,

-Lei
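
One possible way to avoid the GUI, sketched here under assumptions: walk the directory tree and build one item directory per file, each holding the three files described further down this thread (dublin_core.xml, a contents file, and the resource). The function name build_archive, the item_NNN naming, and the "*.pdf" pattern are hypothetical, and the title is only a placeholder taken from the file name; real metadata would have to come from somewhere else.

```python
import shutil
from pathlib import Path
from xml.sax.saxutils import escape


def build_archive(src_root, archive_dir, pattern="*.pdf"):
    """Create one item directory per matching file found under src_root.

    Hypothetical helper: the layout follows the thread's description,
    the title is a placeholder derived from the file name.
    """
    src_root, archive_dir = Path(src_root), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # rglob descends into every subdirectory; sorting before copying keeps
    # the numbering stable and fixes the file list up front.
    resources = sorted(p for p in src_root.rglob(pattern) if p.is_file())
    created = []
    for number, resource in enumerate(resources, start=1):
        item_dir = archive_dir / f"item_{number:03d}"
        item_dir.mkdir()  # fails if the item directory already exists
        shutil.copy2(resource, item_dir / resource.name)
        # The contents file lists the resource name, one per line.
        (item_dir / "contents").write_text(resource.name + "\n", encoding="utf-8")
        title = escape(resource.stem)  # escape &, < and > for XML
        (item_dir / "dublin_core.xml").write_text(
            '<?xml version="1.0" encoding="utf-8"?>\n'
            "<dublin_core>\n"
            f'  <dcvalue element="title" qualifier="none">{title}</dcvalue>\n'
            "</dublin_core>\n",
            encoding="utf-8",
        )
        created.append(item_dir)
    return created
```

The resulting archive directory would then be what gets passed as --source to ItemImport, ideally with --test first, as in the command quoted below.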


On 2/1/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:

Have you solved your problem importing documents, or are you using the
web interface to upload documents into the repository?



Jayan


 ------------------------------

*From:* Pan Family [mailto:[EMAIL PROTECTED]
*Sent:* Friday, February 02, 2007 5:19 AM
*To:* Jayan Chirayath Kurian
*Subject:* Re: [Dspace-tech] how can I find out the collectionID?



Thanks a lot!

-Pan

On 1/31/07, *Jayan Chirayath Kurian* <[EMAIL PROTECTED]> wrote:

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- title of pdf AMIC_1984_10_CM_03.pdf -->
<dublin_core>
  <dcvalue element="creator" qualifier="conference">AMIC-Chiangmai University Refresher Course on Communication Research Methodology : Chiangmai, Oct 29-Nov 2, 1984.</dcvalue>
  <dcvalue element="title" qualifier="none">The Logic of Social Science Research.</dcvalue>
  <dcvalue element="contributor" qualifier="author">Atal, Yogesh.</dcvalue>
  <dcvalue element="date" qualifier="issued">1984-10-29</dcvalue>
</dublin_core>
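
A file like the sample above does not have to be typed by hand. The following is a minimal sketch, assuming the metadata is already available as (element, qualifier, value) triples; the function name dublin_core_xml is hypothetical. It uses Python's standard xml.etree.ElementTree, which takes care of escaping characters such as "&" in titles.

```python
import xml.etree.ElementTree as ET


def dublin_core_xml(fields):
    """Serialize (element, qualifier, value) triples as a dublin_core document."""
    root = ET.Element("dublin_core")
    for element, qualifier, value in fields:
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


# Values taken from the sample record in this thread.
print(dublin_core_xml([
    ("title", "none", "The Logic of Social Science Research."),
    ("contributor", "author", "Atal, Yogesh."),
    ("date", "issued", "1984-10-29"),
]))
```

The returned string has no XML declaration; one can be written in front of it when saving the file.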




 ------------------------------

*From:* Pan Family [mailto:[EMAIL PROTECTED]
*Sent:* Thursday, February 01, 2007 3:52 AM
*To:* Jayan Chirayath Kurian
*Cc:* dspace-tech@lists.sourceforge.net


*Subject:* Re: [Dspace-tech] how can I find out the collectionID?



Could you please kindly provide a sample Dublin_core.xml?

I assumed that dsrun would recursively go through the
directories and index all the files under them.  Apparently
I was wrong.  The requirement of Dublin_core.xml and
the content file makes the process much less automatic.
Is there a way around this?

Thanks a lot!

-Pan

 On 1/30/07, *Jayan Chirayath Kurian* <[EMAIL PROTECTED]> wrote:




 ------------------------------

*From:* Pan Family [mailto: [EMAIL PROTECTED]
*Sent:* Wednesday, January 31, 2007 1:15 PM
*To:* Jayan Chirayath Kurian
*Cc:* Dorothea Salo; dspace-tech@lists.sourceforge.net
*Subject:* Re: [Dspace-tech] how can I find out the collectionID?



Ok.  I will give this a try.

Still two questions:
(1) Where can I get the file Dublin_core.XML?

dublin_core.xml contains the metadata describing the resource (e.g.
title, date published). You have to create the XML file yourself, for
example in a text editor such as Notepad.

(2) Let's say I only want to index one file named: foo.pdf, and I put
     it under /Users/pan/tmp/foo.pdf and pass src=/Users/pan to dsrun
     Is foo.pdf considered the content file or the resource?  And which is
     the third type of file?

foo.pdf is the resource (e.g. a PDF, PPT, or JPEG file).

The contents file is a plain text file that just contains the name of the
resource, i.e. foo.pdf


Thanks a lot!

-Pan

On 1/30/07, *Jayan Chirayath Kurian* <[EMAIL PROTECTED]> wrote:

I think the tmp directory should contain (1) dublin_core.xml, (2) the
contents file and (3) the actual resource, with no further subdirectories.
Can you try with source=/Users/pan/ after removing all subdirectories
under tmp, so that it holds only the 3 files listed above? Hope it works.



My structure is src = C:\DSpace\bin\archive_directory

The archive_directory contains the directory Item_001

Item_001 contains (1) dublin_core.xml, (2) the contents file and (3) the
actual resource.

There are no more subdirectories under Item_001.
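
The layout just described can be checked before running an import. This is a rough sketch, not part of DSpace: check_archive is a hypothetical helper that treats every subdirectory of the source directory as an item, and it assumes the contents file holds one file name per line, as described in this thread.

```python
from pathlib import Path


def check_archive(source_dir):
    """Return a list of problems found in an import source directory."""
    problems = []
    item_dirs = sorted(p for p in Path(source_dir).iterdir() if p.is_dir())
    for item_dir in item_dirs:
        # Every item directory needs both of these files.
        for required in ("dublin_core.xml", "contents"):
            if not (item_dir / required).is_file():
                problems.append(f"{item_dir.name}: missing {required}")
        contents = item_dir / "contents"
        if contents.is_file():
            # Each non-empty line is assumed to name a file in the item directory.
            for line in contents.read_text(encoding="utf-8").splitlines():
                name = line.strip()
                if name and not (item_dir / name).is_file():
                    problems.append(f"{item_dir.name}: listed file not found: {name}")
    return problems
```

Run against a home directory, a check like this would report hidden directories such as .fvwm, which is the same kind of directory the FileNotFoundException quoted below complains about.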



Thanks,

Jayan


 ------------------------------

*From:* Pan Family [mailto: [EMAIL PROTECTED]
*Sent:* Wednesday, January 31, 2007 4:06 AM
*To:* Jayan Chirayath Kurian
*Cc:* Dorothea Salo; dspace-tech@lists.sourceforge.net


*Subject:* Re: [Dspace-tech] how can I find out the collectionID?



Thanks for your help!

I am working on Mac OS X.  Yes, "pan" contains "tmp"

It seems that for me the dir that I give to source= cannot contain any
subdirs.  For example, if I give it "/Users/pan/" I got an error
complaining about the missing file ".fvwm/dublin_core.xml"
.fvwm is a subdir under "/Users/pan/"

If I give it "/Users/pan/tmp/"
then it complains about the same missing file under the subdirs
of "tmp" until I removed all the subdirs under "tmp"
But I still don't get the files under "tmp" imported to my collection,
even if no error shows after I removed all subdirs.

bubba:$ dsrun org.dspace.app.itemimport.ItemImport --add --eperson=[EMAIL PROTECTED] --collection=123456789/2 --source=/Users/pan/ --mapfile=/Users/pan/test_map --test
**Test Run** - not actually importing items.
Destination collections:
Owning  Collection: PODAAC collection
Adding items from directory: /Users/pan/
Generating mapfile: /Users/pan/test_map
Adding item from directory .fvwm
java.io.FileNotFoundException: /Users/pan/.fvwm/dublin_core.xml (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at java.io.FileInputStream.<init>(FileInputStream.java:66)
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
        at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
        at org.dspace.app.itemimport.ItemImport.loadXML(ItemImport.java:1269)
        at org.dspace.app.itemimport.ItemImport.loadDublinCore(ItemImport.java:795)
        at org.dspace.app.itemimport.ItemImport.loadMetadata(ItemImport.java:780)
        at org.dspace.app.itemimport.ItemImport.addItem(ItemImport.java:626)
        at org.dspace.app.itemimport.ItemImport.addItems(ItemImport.java:498)
        at org.dspace.app.itemimport.ItemImport.main(ItemImport.java:407)
java.io.FileNotFoundException: /Users/pan/.fvwm/dublin_core.xml (No such file or directory)
***End of Test Run***

On 1/29/07, *Jayan Chirayath Kurian* <[EMAIL PROTECTED]> wrote:

Can you please try with source=/Users/pan/

I encountered the same problem on the Windows platform. It was fixed by
giving the main folder name with the import command. I assume that "pan"
contains the subfolder "tmp", which in fact contains the PDF file. Please
let me know if this works for you.



Thanks,

Jayan


 ------------------------------

*From:* [EMAIL PROTECTED] [mailto:
[EMAIL PROTECTED] *On Behalf Of *Pan Family
*Sent:* Tuesday, January 30, 2007 8:02 AM
*To:* Dorothea Salo
*Cc:* dspace-tech@lists.sourceforge.net
*Subject:* Re: [Dspace-tech] how can I find out the collectionID?



Hi Dorothea:

Thanks a lot for your help!
In my case, the handle is 123456789/2.
So I used the following command to add
a PDF file under /Users/pan/tmp, but somehow
the pdf file was not added into the collection
and the file test_map is empty.  No error
message was shown either.  I wonder what
I did wrong.  Could you give me some ideas
on how to debug?

Thanks again,

-Pan

bubba:~/dspace-1.4.1-source/bin pan$ dsrun org.dspace.app.itemimport.ItemImport --add [EMAIL PROTECTED]/2 --source=/Users/pan/tmp/ --mapfile=/Users/pan/tmp/test_map
Destination collections:
Owning  Collection: PODAAC collection
Adding items from directory: /Users/pan/tmp/
Generating mapfile: /Users/pan/tmp/test_map

On 1/29/07, *Dorothea Salo *<[EMAIL PROTECTED]> wrote:

Pan Family wrote:
> dsrun org.dspace.app.itemimport.ItemImport --add
> [EMAIL PROTECTED]  --collection=collectionID --source=items_dir
> --mapfile=mapfile
>
> Hi,
>
> The above command for batch import requires
> the collectionID as input.  I wonder how
> I can find out this ID?  Is it the string
> that I used to name my collection, or an ID
> that DSpace uses internally?

You can use the collection's handle for this; go to the collection's
home page and use the numbers after "handle/" in the URL.
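
As a small illustration of "the numbers after handle/", here is a sketch that pulls the handle out of a collection URL. The function name handle_from_url and the example URL are made up; only the "handle/" marker comes from the advice above.

```python
from urllib.parse import urlparse


def handle_from_url(url):
    """Return the part of the URL path that follows "/handle/"."""
    path = urlparse(url).path
    marker = "/handle/"
    index = path.find(marker)
    if index == -1:
        raise ValueError(f"no {marker!r} segment in {url!r}")
    return path[index + len(marker):].strip("/")


# Hypothetical collection URL on a local test install.
print(handle_from_url("http://localhost:8080/dspace/handle/123456789/2"))
# prints 123456789/2
```

The value printed is what would go after --collection= in the import command.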

If you should need the internal DSpace collection ID for some reason,
though, log in, surf to the collection page, and then use the "Edit"
button under Admin Tools. From there, choose "Collection's
Authorizations," and DSpace will pop up the "DB ID" in the title of the
page.

(I hope there's an easier way to do this! There certainly should be.)

Dorothea

--
Dorothea Salo, Digital Repository Services Librarian
(703)993-3742     [EMAIL PROTECTED]     AIM: gmumars
MSN 2FL, Fenwick Library
George Mason University
4400 University Drive, Fairfax VA 22031

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share
your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech










