[dspace-tech] Dissemination Crosswalk class for OAI

2017-03-02 Thread Evgeni Dimitrov
Every ingested item in my repository has a mets.xml file with descriptive
metadata in MARC.
I am looking at SimpleDCDisseminationCrosswalk.java.
I can code something similar that will extract the MARC XML from the mets
file.

I cannot figure out what to do after that: how to describe in xoai.xml
one context with this single marcxml format, which is produced by my
MarcXMLDisseminationCrosswalk.java.

Could you help?
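
To make the extraction step concrete, here is a minimal sketch of pulling the MARCXML record out of a METS document. It is written in Python rather than Java purely for illustration, and it assumes the descriptive metadata sits in a mets:mdWrap with MDTYPE="MARC" and uses the standard METS and MARC21 slim namespaces. The sample document and the function name are hypothetical, and this says nothing about how xoai.xml should be configured.

```python
import xml.etree.ElementTree as ET
from typing import List

METS_NS = "http://www.loc.gov/METS/"
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Keep the familiar "marc:" prefix when records are serialized again.
ET.register_namespace("marc", MARC_NS)


def extract_marc_records(mets_xml: str) -> List[str]:
    """Return each MARCXML <record> found in a MARC mdWrap, as an XML string."""
    root = ET.fromstring(mets_xml)
    records = []
    for md_wrap in root.iter(f"{{{METS_NS}}}mdWrap"):
        # Assumption: the wrapper declares its metadata type as MARC.
        if md_wrap.get("MDTYPE") != "MARC":
            continue
        xml_data = md_wrap.find(f"{{{METS_NS}}}xmlData")
        if xml_data is None:
            continue
        for record in xml_data.iter(f"{{{MARC_NS}}}record"):
            records.append(ET.tostring(record, encoding="unicode"))
    return records


# Hypothetical mets.xml content, trimmed to one descriptive metadata section.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="dmd_1">
    <mets:mdWrap MDTYPE="MARC">
      <mets:xmlData>
        <marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
          <marc:datafield tag="245" ind1="1" ind2="0">
            <marc:subfield code="a">Sample title</marc:subfield>
          </marc:datafield>
        </marc:record>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""

MARC_RECORDS = extract_marc_records(SAMPLE_METS)
```

A Java crosswalk would do the same namespace-aware lookup with its own XML library; the point here is only which elements to look for.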

-- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.


[dspace-tech] Re: Error limiting filter-media's ImageMagick PDF Thumbnail plugin to PDF bitstreams

2017-03-02 Thread Alan Orth
Hello,

I did a bit of testing and it seems that this problem arises when an
item has a JPG in the ORIGINAL bundle. The "ImageMagick PDF Thumbnail"
plugin will process both PDFs and JPGs if they are present in the ORIGINAL
bundle, despite being configured to only process "Adobe PDF" input formats.
In our case we have some JPGs in the ORIGINAL bundle because editors had
manually created thumbnails and uploaded them during item submission, but
this is beside the issue.

- Item with a PDF in the ORIGINAL bundle:

$ [dspace]/bin/dspace filter-media -f -i 10568/16881 -p "ImageMagick PDF
Thumbnail" -v
The following MediaFilters are enabled:
Full Filter Name: org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
IM Thumbnail earlywinproposal_esa_postharvest.pdf is replacable.
File: earlywinproposal_esa_postharvest.pdf.jpg
IM Image Param: /tmp/impdfthumb6654586450122351471.pdf[0] -flatten
/tmp/impdfthumb6654586450122351471.pdf.jpg
IM Thumbnail Param: /tmp/impdfthumb6654586450122351471.pdf.jpg -thumbnail
300x300 /tmp/impdfthumb6654586450122351471.pdf.jpg.jpg
FILTERED: bitstream 13787 (item: 10568/16881) and created
'earlywinproposal_esa_postharvest.pdf.jpg'

- Item with a JPG in the ORIGINAL bundle:

$ [dspace]/bin/dspace filter-media -f -i 10568/33941  -p "ImageMagick PDF
Thumbnail" -v
The following MediaFilters are enabled:
Full Filter Name: org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
Generated Thumbnail strengtheningPig.jpg matches pattern and is replacable.
File: strengtheningPig.jpg.jpg
IM Image Param: /tmp/impdfthumb5315798183586620841.pdf[0] -flatten
/tmp/impdfthumb5315798183586620841.pdf.jpg
IM Thumbnail Param: /tmp/impdfthumb5315798183586620841.pdf.jpg -thumbnail
300x300 /tmp/impdfthumb5315798183586620841.pdf.jpg.jpg
FILTERED: bitstream 23121 (item: 10568/33941) and created
'strengtheningPig.jpg.jpg'

- Item with a JPG in the THUMBNAIL bundle (manually uploaded after item
submission):
$ [dspace]/bin/dspace filter-media -f -i 10568/24655 -p "ImageMagick PDF
Thumbnail" -v
The following MediaFilters are enabled:
Full Filter Name: org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter
org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter

The expected behavior is that the "ImageMagick PDF Thumbnail" plugin should
not process JPGs, but it does process them when they are in the ORIGINAL
bundle, despite its configuration in dspace.cfg. We are using DSpace 5.5. I
will file an issue on Jira.

Regards,

On Tue, Feb 21, 2017 at 11:08 AM Alan Orth  wrote:

I think I found a bug in filter-media. I'm trying to force the
re-generation of all PDF thumbnails in a collection by limiting the
filter-media command to the ImageMagick PDF Thumbnail plugin, but I see it
still processing JPGs:

---
$ [dspace]/bin/dspace filter-media -f -i 10568/16856 -p "ImageMagick PDF
Thumbnail"
...
File: EnvtNaturalRes.jpg.jpg
FILTERED: bitstream 80165 (item: 10568/76133) and created
'EnvtNaturalRes.jpg.jpg'
File: zemadim_2016.pdf.jpg
FILTERED: bitstream 85076 (item: 10568/77324) and created
'zemadim_2016.pdf.jpg'
---

The configuration for filter-media's ImageMagick plugins is:

---
filter.org.dspace.app.mediafilter.ImageMagickImageThumbnailFilter.inputFormats
= BMP, GIF, image/png, JPG, TIFF, JPEG, JPEG 2000
filter.org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.inputFormats
= Adobe PDF
---

The expected behavior is that filter-media only processes bitstreams
matching the input formats listed in the plugin's configuration. In my case
I would be generating thumbnails for thousands of items and so this creates
lots of extra I/O and wastes CPU cycles.
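
To spell out the expected behavior, here is a small sketch of the check I would expect to happen before a filter touches a bitstream. This is not DSpace's own code; it is a hypothetical Python illustration that parses the two inputFormats settings quoted above and tests a bitstream's format name against them. The function names are made up for the example.

```python
from typing import Dict, List

# The two settings quoted above, each joined onto one "key = value" line.
CONFIG_TEXT = """
filter.org.dspace.app.mediafilter.ImageMagickImageThumbnailFilter.inputFormats = BMP, GIF, image/png, JPG, TIFF, JPEG, JPEG 2000
filter.org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.inputFormats = Adobe PDF
"""


def parse_input_formats(config_text: str) -> Dict[str, List[str]]:
    """Map each filter class name to the list of format names it accepts."""
    formats: Dict[str, List[str]] = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith("filter.") and key.endswith(".inputFormats"):
            class_name = key[len("filter."):-len(".inputFormats")]
            formats[class_name] = [v.strip() for v in value.split(",") if v.strip()]
    return formats


def should_process(filter_class: str, bitstream_format: str,
                   formats: Dict[str, List[str]]) -> bool:
    """Expected behavior: a filter only handles formats listed for it."""
    return bitstream_format in formats.get(filter_class, [])


FORMATS = parse_input_formats(CONFIG_TEXT)
PDF_FILTER = "org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter"
```

With this check, should_process(PDF_FILTER, "JPEG", FORMATS) is False, so a JPG in the ORIGINAL bundle would be skipped by the PDF thumbnail plugin.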

We are running DSpace version 5.5.

Thank you,
-- 

Alan Orth
alan.o...@gmail.com
https://englishbulgaria.net
https://alaninkenya.org
https://mjanja.ch


[dspace-tech] Re: Error limiting filter-media's ImageMagick PDF Thumbnail plugin to PDF bitstreams

2017-03-02 Thread Alan Orth
Hello,

I filed a Jira issue for DSpace 5.x here:
https://jira.duraspace.org/browse/DS-3516

Would be nice if someone could test this on DSpace 6.x and see if it is
still an issue.

Cheers,

-- 

Alan Orth
alan.o...@gmail.com
https://englishbulgaria.net
https://alaninkenya.org
https://mjanja.ch


[dspace-tech] Re: Topic for a future DCAT call: "Future of the fileformat registry"

2017-03-02 Thread Bram Luyten
Pauline sent a followup on this that hadn't made it to the list yet:

Thanks very much Bram. Those sound like functions that would be a great 
help to us. I have just been training two new colleagues to check and 
approve submissions, which has made me think a bit about some DSpace 
functionality and documentation.


I would also love to see these improvements re the file format registry:


 1. The file format registry should be searchable. Currently the 
descriptive information and list of file format extensions are hidden, so 
the user has to click on each format to see which file extensions it 
is linked to. We currently have 343 formats, with a further five or so 
we're preparing to add, so that is a lot of clicking around.


 2. When viewing the list of bitstreams in a submitted item, the system 
should provide a link to (or a tool-tip containing) the file format name 
and description. For example we've recorded .mat as a file extension of a 
format we call "Matlab". But when my trainee colleague examines a 
submission containing .mat files, all DSpace shows him as format 
information is "Text" because that is the MIME type we have recorded for 
that file type. He found this confusing. And when we looked at the file 
format registry to get some more information on this mysterious .mat 
extension, of course a Ctrl-F to Find "mat" in the page which lists our 343 
formats produced many spurious matches because it hit every occurrence of 
the word "information". Whereas a Ctrl-F to find ".mat" produced no hits 
because that did not match the *name* of the format, 'Matlab'. So the 
registry was effectively useless. If the registry was more easily 
searchable, it would be more worthwhile for us to use it to record 
information in it.


 3. When viewing the list of bitstreams on the full item view of a deposit, 
a user should be provided with a link to (or a tool-tip containing) the 
file format name and description. If this were the case, we could record 
information in our file format registry that would help users to read and 
process data in formats unfamiliar to them.
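

The kind of lookup described in points 1 and 2 can be sketched in a few lines. This is a hypothetical Python illustration, not DSpace code: the registry entries, field names and function name are invented for the example, and only the "Matlab" / .mat pairing comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FormatEntry:
    name: str
    mime_type: str
    description: str
    extensions: List[str] = field(default_factory=list)


# Hypothetical registry contents, for illustration only.
REGISTRY = [
    FormatEntry("Matlab", "text/plain", "Matlab data file", ["mat"]),
    FormatEntry("Adobe PDF", "application/pdf",
                "Portable Document Format information", ["pdf"]),
    FormatEntry("Text", "text/plain", "Plain text information", ["txt", "asc"]),
]


def find_by_extension(registry: List[FormatEntry], extension: str) -> List[FormatEntry]:
    """Exact, case-insensitive match on file extension, with or without the dot."""
    wanted = extension.lower().lstrip(".")
    return [
        entry for entry in registry
        if wanted in (ext.lower().lstrip(".") for ext in entry.extensions)
    ]


MATCHES = find_by_extension(REGISTRY, ".mat")
```

Because the match is on the extension field rather than on page text, searching for ".mat" finds the Matlab entry and does not hit every description containing the word "information".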


On Saturday, 18 February 2017 13:44:45 UTC+1, Bram Luyten wrote:
>
> Hi Pauline,
>
> your suggestion merits its own thread.
>
> I definitely agree that the file format registry, and broader, the digital 
> preservation capabilities of DSpace deserve more attention and improvement. 
>
> The fact that we're currently only looking at file extension and that we 
> haven't hooked in something like DROID or PRONOM for more solid file format 
> identification is also a blind spot.
>
> Would be curious to see what your biggest issues and suggestions are for 
> the file format registry.
>
> thanks,
>
> Bram
>
>
> Bram Luyten
> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
> Esperantolaan 4, Heverlee 3001, Belgium
> atmire.com 
> 
>
> On 18 February 2017 at 13:19, WARD Pauline  wrote:
>
>> It's not a pressing issue for us at Edinburgh.
>>
>>
>> I think actually the File Format Registry could be made much more 
>> functional (compared to how it works in v5.2), and much more useful for 
>> curators, depositors and especially end-users. Of course, I'm running a 
>> research *data* repository, so new file formats generate a lot of work for 
>> us. There's been an acceleration of the appearance of new file formats I 
>> think, certainly in the deposits we're receiving. 
>>
>>
>> But maybe colleagues running publication repositories would be less 
>> interested in that...? Should I just maybe draft up some suggestions into 
>> use cases? Or, Bram, do you know, is the file format registry an area that 
>> has been improved in v6? Thanks very much.
>>
>>
>> :p
>>
>>
>> Pauline Ward
>>
>> Research Data Service Assistant
>>
>> University of Edinburgh
>>
>> Argyle House, 3 Lady Lawson St, Edinburgh
>>
>> tel: 0131 651 5277
>>
>> @PaulineData 
>>
>> The University of Edinburgh is a charitable body, registered in Scotland, 
>> with registration number SC005336.
>>
>>
>> --
>>
>>  
>> Hi, 
>>
>> apologies for cross posting but I thought this would be of interest to the 
>> different lists.
>>
>> Because this could be a relatively technical topic, I wanted to see if 
>> there's an interest to dedicate one of the next DCAT calls to DSpace 
>> performance:
>>
>> - are your pages loading fast enough?
>> - are you suffering from downtime and how are you dealing with this?
>> - which performance related JIRA tickets are out there and should we 
>> raise attention to them?
>> - Show & tell of approaches, for example, 
>> https://wiki.duraspace.org/display/~terrywbrady/Using+New+Relic+to+Monitor+XMLUI
>>
>> What do you think? Too technical? Relevant? Should we schedule it?
>>
>> cheers,
>>
>> Bram
>>
>> Bram Luyten
>> 250-B Suite 3A, Lucius Gordon Drive, Wes

Re: [dspace-tech] Integrating LCSH as a controlled vocabulary for the subject field

2017-03-02 Thread Tim Donohue
Hi,

We are not legally allowed to redistribute LCSH with DSpace out-of-the-box.
This is why DSpace only supports specific vocabularies by default (like
SRSC), as some vocabularies cannot be redistributed within other software.

Currently, as of DSpace 5 or 6, the only way to add new controlled
vocabularies is to create a new XML config file to describe that
vocabulary.  You can see the XML config file for SRSC here (as an example):
https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace/config/controlled-vocabularies/srsc.xml

So, the only way to currently support LCSH would be to create such a config
file for LCSH. Since we aren't legally allowed to redistribute that with
our code, we haven't created one. It's possible someone else created one
manually, but I'm unaware of it (but maybe someone else will speak up here
if they have).
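
As a rough sketch of what building such a config file could look like, the snippet below generates XML in the same nested node / isComposedBy shape that srsc.xml uses. It is a hypothetical Python illustration: the term ids and labels are made-up placeholders (not real LCSH data), the function names are invented, and you would need to obtain the actual headings yourself.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Each term is (id, label, children); children use the same shape.
Term = Tuple[str, str, list]


def add_children(parent: ET.Element, terms: List[Term]) -> None:
    """Attach terms under parent inside an <isComposedBy> wrapper."""
    if not terms:
        return
    wrapper = ET.SubElement(parent, "isComposedBy")
    for term_id, label, children in terms:
        node = ET.SubElement(wrapper, "node", {"id": term_id, "label": label})
        add_children(node, children)


def build_vocabulary(root_id: str, root_label: str, terms: List[Term]) -> str:
    """Return the vocabulary as an XML string with a single root <node>."""
    root = ET.Element("node", {"id": root_id, "label": root_label})
    add_children(root, terms)
    return ET.tostring(root, encoding="unicode")


# Placeholder terms only; these are not actual subject headings.
SAMPLE_TERMS: List[Term] = [
    ("t1", "Placeholder heading A", [("t1a", "Placeholder subdivision A1", [])]),
    ("t2", "Placeholder heading B", []),
]

VOCAB_XML = build_vocabulary("lcsh", "Library of Congress Subject Headings",
                             SAMPLE_TERMS)
```

The resulting string would be saved next to srsc.xml in the controlled-vocabularies config directory under a name of your choosing.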

Another option would be to enhance the current DSpace functionality to use
Library of Congress's "Linked Data Service".  This provides an API to
search/select terms from LCSH. http://id.loc.gov/  Again, as far as I'm
aware no one has implemented this (though it'd make a great feature to give
back to DSpace).

- Tim


On Wed, Feb 22, 2017 at 12:05 PM Long nam  wrote:

> hi All,
>
>
> Currently in our metadata schema, uploaders can input anything they want
> in the dc.subject field.  I would prefer to see this field reserved for LC
> Subject Headings, and I note that using this field for a controlled
> vocabulary is also recommended by Dublin core:
>
>
> Can someone please point me to instructions on how to enable this in DSpace?
>
>
> Many thanks
>
> -LN
>
-- 

Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org


[dspace-tech] Harvesting single items (with bitstream) in DSpace

2017-03-02 Thread euler
Hi All,

I have a use case in which I have to harvest only one specific item. 
I came across this post on SO: "Harvesting single items with DSpace". 
The solution there worked when I tried it, but in my case I would like to 
have the bitstream included in the created ZIP file. Also, instead of using 
the default oai_dc metadataPrefix, I would like to use dim. Below is that 
solution in PHP. I hope someone can point out how to modify the code to 
meet my requirements.

<?php

// Handle of the item to harvest. The original value did not survive in this
// message, so this is a placeholder to replace with a real handle.
$handle = "prefix/suffix";

// OAI-PMH GetRecord request for that item
$harvest = "http://dspace.library.uu.nl/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:dspace.library.uu.nl:"
    . $handle;

// get XML from source repository
$sxe = simplexml_load_file($harvest, "SimpleXMLElement");
if ($sxe === false) {
    die("Could not load OAI record from " . $harvest . "\n");
}

// add namespace schema-urls
$sxe->registerXPathNamespace('oai_dc', 'http://www.openarchives.org/OAI/2.0/oai_dc/');
$sxe->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// collect Dublin Core (dc) element names and values from the XML
$elements = array();
foreach ($sxe->xpath("//oai_dc:dc") as $entry) {
    foreach ($entry->children('dc', true) as $elementName => $elementValue) {
        $elements[$elementName][] = (string) $elementValue;
    }
}

// create zip-object and -file
$zip = new ZipArchive();
$zip->open("doc/importZip.zip", ZipArchive::CREATE);

// create a directory in the zip-object
$zip->addEmptyDir("item");

// create Dublin Core XML object
$oXML = new DOMDocument();
$oXML->encoding      = "UTF-8";
$oXML->formatOutput  = true;
$oXML->xmlStandalone = false;

$oRoot = $oXML->createElement('dublin_core');
$oRoot->setAttribute('schema', 'dc');
$oXML->appendChild($oRoot);

// add elements and their values to XML object
foreach ($elements as $elementName => $elementValues) {
    foreach ($elementValues as $elementValue) {
        $oDcValue = $oXML->createElement('dcvalue');
        $oDcValue->setAttribute('element', $elementName);
        $oText = $oXML->createTextNode($elementValue);
        $oDcValue->appendChild($oText);
        $oRoot->appendChild($oDcValue);
    }
}

// save created XML to string
$dublinCoreXml = $oXML->saveXML();

// add XML-string as file to zip-object
$zip->addFromString("item/1/dublin_core.xml", $dublinCoreXml);

// add handle as file to zip-object
$zip->addFromString("item/1/handle", $handle);

$zip->close();

?>
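
To show what I am after, here is a rough sketch of the target package, written in Python only for illustration. It assumes the source repository exposes the dim metadataPrefix (not verified for the repository above) and that the bitstream bytes are fetched separately, since an OAI-PMH GetRecord response carries metadata rather than files. The function names, sample record and file name are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile
from typing import Dict

DIM_NS = "http://www.dspace.org/xmlns/dspace/dim"


def dim_to_dublin_core(dim_xml: str) -> str:
    """Convert the dc fields of a DIM record into dublin_core.xml content."""
    root = ET.fromstring(dim_xml)
    out = ET.Element("dublin_core", {"schema": "dc"})
    for dim_field in root.iter(f"{{{DIM_NS}}}field"):
        if dim_field.get("mdschema") != "dc":
            continue  # fields from other schemas are skipped in this sketch
        attrs = {
            "element": dim_field.get("element", ""),
            "qualifier": dim_field.get("qualifier") or "none",
        }
        if dim_field.get("lang"):
            attrs["language"] = dim_field.get("lang")
        value = ET.SubElement(out, "dcvalue", attrs)
        value.text = dim_field.text or ""
    return ET.tostring(out, encoding="unicode")


def build_import_zip(dim_xml: str, handle: str, bitstreams: Dict[str, bytes]) -> bytes:
    """Build the zip in memory, using the same item/1/ layout as the PHP code."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("item/1/dublin_core.xml", dim_to_dublin_core(dim_xml))
        zf.writestr("item/1/handle", handle)
        # The contents file lists one bitstream file name per line.
        zf.writestr("item/1/contents", "".join(name + "\n" for name in bitstreams))
        for name, data in bitstreams.items():
            zf.writestr("item/1/" + name, data)
    return buffer.getvalue()


# Hypothetical DIM record, as it might appear inside a GetRecord response.
SAMPLE_DIM = """<dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
  <dim:field mdschema="dc" element="title" lang="en">Sample title</dim:field>
  <dim:field mdschema="dc" element="date" qualifier="issued">2017</dim:field>
</dim:dim>"""

ZIP_BYTES = build_import_zip(SAMPLE_DIM, "123456789/1", {"sample.pdf": b"%PDF-"})
```

Compared with the oai_dc version, dim keeps the qualifier and language of each field, which is why I would prefer it.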


Thanks in advance,
euler


[dspace-tech] Re: Change color from Green to Blue : Jspui interface see image

2017-03-02 Thread Lewatle Johannes Phaladi
Hi All,

I would like to change the green color shown in the image to blue, or to 
remove that image completely.

Regards,
Lewatle 

On Wednesday, 22 February 2017 11:35:20 UTC+2, Lewatle Johannes Phaladi 
wrote:
>
>
> 
> Dear Team,
>
> I am customizing JSPUI on our development site, and I would like to change 
> the green color that appears in the image to blue.
>
> Regards,
> Lewatle 
>
