Re: [Dspace-tech] Issues in Media Filter PDF Text Extractor (PDFFilter and XPDF)
Hi euler,

This seems similar to http://dspace.2283337.n4.nabble.com/Character-encoding-issues-in-Discovery-search-results-tp4675835p4675839.html
Perhaps it can help.

euler wrote on 08/06/15 at 15:00:

Dear All,

I am having issues with text extraction from PDFs containing non-Latin characters and East Asian languages. I tried switching from PDFBox's PDFFilter to XPDF, but it does not extract the text properly either.

If I extract the text using the command-line tools (i.e. java -jar pdfbox-app-1.8.7.jar ExtractText -encoding UTF-8 for PDFBox and pdftotext -enc UTF-8 for XPDF), the text is extracted properly. Has anybody encountered this issue, and how did you solve it?

I looked at XPDF2Text.java, and line 53 does include the UTF-8 encoding ("@COMMAND@", "-q", "-enc", "UTF-8", "@infile@", "-"). I'm wondering why the text is not extracted properly when I run filter-media but is when I run the tools from the command line. In PDFFilter.java, I tried using PDFTextStripper pts = new PDFTextStripper("UTF-8"), but the result is still the same.

I would greatly appreciate any hints, tips, suggestions and help.

Thanks in advance and regards,
euler

--
View this message in context: http://dspace.2283337.n4.nabble.com/Issues-in-Media-Filter-PDF-Text-Extractor-PDFFilter-and-XPDF-tp4678283.html
Sent from the DSpace - Tech mailing list archive at Nabble.com.

--
Antoine Snyers
2888 Loker Avenue East, Suite 315, Carlsbad, CA 92010
Esperantolaan 4, Heverlee 3001, Belgium
www.atmire.com

--
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
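The thread does not establish a cause. One possibility, consistent with the linked encoding thread, is that the JVM running filter-media decodes the extractor's output with a default charset other than UTF-8. The following is a minimal sketch of reading a stream with an explicit charset so that the result does not depend on that default. The class and the helper readUtf8 are hypothetical names, not part of DSpace, PDFBox or XPDF, and the byte array only stands in for extractor output.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetRead {

    // Hypothetical helper (not DSpace API): reads the whole stream as UTF-8
    // text, regardless of the JVM's default charset.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for UTF-8 output of a text extractor (three CJK characters).
        byte[] extracted = "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);
        System.out.println(readUtf8(new ByteArrayInputStream(extracted)).length());
    }
}
```

If the characters survive this kind of explicit decoding but not the filter-media run, the JVM default charset used by bin/dspace would be a reasonable next thing to check.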
Re: [Dspace-tech] Assetstore not showing its contents DSpace 5.2 interface
It would then seem you need to run the psql command with -U dspaceadm and not -U dspaceadmin.

Regards

Lewatle Phaladi wrote on 08/06/15 at 14:39:

Hi Hilton,

Please see the following results:

dspace@dspace:~/install/dspace5.2$ ./bin/dspace database test
Attempting to connect to database using these configurations:
 - URL: jdbc:postgresql://localhost:5432/wiredspace
 - Driver: org.postgresql.Driver
 - Username: dspaceadm
 - Password: [hidden]
 - Schema:
Testing connection...
Connected successfully!

Regards,
Lewatle

From: Hilton Gibson [hilton.gib...@gmail.com]
Sent: Monday, June 08, 2015 1:41 PM
To: Lewatle Phaladi
Cc: DSpace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Assetstore not showing its contents DSpace 5.2 interface

Hi Lewatle,

First test the database connection: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Install_DSpace/S09

Cheers
hg

Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library
http://staff.lib.sun.ac.za/~hgibson/docs/cv/cv.html

On 8 June 2015 at 13:33, Lewatle Phaladi <lewatle.phal...@wits.ac.za> wrote:

Hi All,

I am getting the following error during the upgrade, at step 10.b of https://wiki.duraspace.org/display/DSDOC5x/Upgrading+DSpace#UpgradingDSpace-ManuallyUpgradingSolrIndexes. All my other upgrade steps went well, and the database migration in step 10.c finished without problems.

The new version is upgraded, but there is an issue with the assetstore: the items loaded before the upgrade are not fully visible under any collection. I can see only their metadata, not the PDFs, JPEGs etc. The only step that is not completed is 10.b in the upgrade documentation. Is there any reason that may cause the following error?

dspace@dspace:~/install/dspace5.2$ psql -U dspaceadmin etc/postgres/update-sequences.sql
psql: FATAL: Peer authentication failed for user "dspaceadmin"

Regards,
Lewatle

This communication is intended for the addressee only. It is confidential. If you have received this communication in error, please notify us immediately and destroy the original message. You may not copy or disseminate this communication without the permission of the University. Only authorised signatories are competent to enter into agreements on behalf of the University and recipients are thus advised that the content of this message may not be legally binding on the University and may contain the personal views and opinions of the author, which are not necessarily the views and opinions of The University of the Witwatersrand, Johannesburg. All agreements between the University and outsiders are subject to South African Law unless the University agrees in writing to the contrary.

--
Antoine Snyers
Re: [Dspace-tech] DSpace 5.2 ORCID lookup using input forms
Hi Hilton Gibson,

The information you're looking for is in the dspace.cfg configuration:

choices.presentation.dc.contributor.author = authorLookup

This will add a look-up button in the submission forms next to the corresponding input fields. But I think it only works for fields with input-type "onebox" or "name". The "vocabulary" tag in input-forms.xml is not related to authority control.

Antoine

Hilton Gibson wrote on 03/06/15 at 19:43:

Sorry for the cross-post.

Hi All,

While configuring my dev server, I noticed there is no information about configuring an input form to use the ORCID lookup. See: https://wiki.duraspace.org/display/DSDOC5x/ORCID+Integration

According to this information I can define a vocabulary lookup: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Submissions/Forms#.3Cvocabulary.3E

What XML attribute do I use?

Also see: http://wiki.lib.sun.ac.za/index.php/SUNScholar/Researcher_Identification/5.X/ORCID

Regards
hg
Re: [Dspace-tech] About XMLUI and cache
Hi Evgeni Dimitrov,

You have to modify the "public Serializable getKey()" and "public SourceValidity getValidity()" methods of that same class. These methods determine whether addBody should be called or whether the cached document may be used.

Evgeni Dimitrov wrote on 13/05/15 at 11:01:

Hi,

This is about DSpace 5.1. I am trying to make a small change in the way an item is displayed. When the user has "write" rights on the item, it should be displayed as originally. Otherwise the item handle is passed to another web application.

I added several lines at the beginning of org.dspace.app.xmlui.aspect.artifactbrowser.ItemViewer.addBody:

    Request request = ObjectModelHelper.getRequest(objectModel);
    if (!AuthorizeManager.authorizeActionBoolean(context, item, Constants.WRITE)) {
        StringBuilder redirectURL = new StringBuilder();
        redirectURL.append(request.getContextPath().replace("/xmlui", "/dspviewerf/"));
        redirectURL.append(item.getHandle());
        redirectURL.append("?sq=");
        redirectURL.append("123");
        HttpServletResponse httpResponse = (HttpServletResponse) objectModel.get(HttpEnvironment.HTTP_RESPONSE_OBJECT);
        httpResponse.sendRedirect(redirectURL.toString());
    }

It works as expected for the user without "write" rights. It works as expected for the user with "write" rights. But after it has worked once for the user with "write" rights, the page is cached and the same page is displayed for the user without "write" rights.

What can I reasonably do? Disable the cache in some way?

Best regards
Evgeni

--
Antoine Snyers
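The reply above says that getKey() and getValidity() decide whether the cached document is reused. One way to read this, offered here as an assumption rather than as the DSpace implementation, is that the cache key has to reflect everything that changes the output, including whether the current user has write rights. The sketch below uses plain Java only; the class, the helper cacheKey and the handle value are hypothetical, and in the real class the permission check would come from the AuthorizeManager call shown in the question.

```java
final class PermissionAwareCacheKey {

    // Hypothetical helper, not DSpace or Cocoon API: builds a cache key that
    // differs for users with and without write rights on the item, so a page
    // cached for one group is not served to the other.
    static String cacheKey(String itemHandle, boolean canWrite) {
        return itemHandle + (canWrite ? ":write" : ":read-only");
    }

    public static void main(String[] args) {
        // The handle is a made-up example value.
        System.out.println(cacheKey("123456789/42", true));
        System.out.println(cacheKey("123456789/42", false));
    }
}
```

With a key of this kind, the two groups of users would get separate cache entries instead of sharing whichever page was generated first.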
Re: [Dspace-tech] Importing authority keys
Hi Pablo,

I'm not very clear on your setup and which action you're trying to perform. There are 2 commands available to import items using CSV:

- The Batch Metadata Editing => bin/dspace metadata-import
- The import via Simple Archive Format (in combination with BTE) => bin/dspace import

The output you provided corresponds to the import via Simple Archive Format. This one is not configured by /config/modules/bulkedit.cfg (you wouldn't need to modify this anyway because import and export use the same configuration).

The BTE import for CSV has some configuration in config/spring/api/bte.xml, but as you experienced, this import does not seem to support authority keys.

You said you're using DSpace 4.1.2, but you also said the import was slow because "the import needs to query solr about that author in order to get his authority key". However, the Solr authority cache is new in DSpace 5. Could you explain?

Pablo Buenaposada wrote on 26/01/15 at 10:28:

I only see DSpace wiki links there. What am I supposed to look at?

--
View this message in context: http://dspace.2283337.n4.nabble.com/Importing-authority-keys-tp4673813p4676379.html
Sent from the DSpace - Tech mailing list archive at Nabble.com.

--
Antoine Snyers
Re: [Dspace-tech] Character encoding issues in Discovery search results
Hi Alan Orth,

-Dfile.encoding=UTF-8 should be added to the "bin/dspace" command. Here is the line: https://github.com/DSpace/DSpace/blob/dspace-4.2/dspace/bin/dspace#L75

Then rerun 'index-discovery -b'. I believe this will resolve your problem.

Antoine Snyers

Alan Orth wrote on 09/12/14 at 14:49:

Hi,

Our DSpace 4.2's Discovery search results display snippets from the item's full-text PDF extract, but we get mojibake (strange characters) in the summaries (see attached photo). Browsing to the item's PDF-extracted text bitstream indeed shows the strange characters, and Firefox's developer tools show the encoding is ISO-8859-1. What's strange is that if I download the file, the resulting encoding is UTF-8 and these characters display properly.

I have tried the following:

- Confirmed our Tomcat connectors are using URIEncoding="UTF-8"
- Forced "-Dfile.encoding=UTF-8" in JAVA_OPTS and manually re-run `filter-media' as well as `index-discovery -b'

What could I be missing?

Thanks!

--
Alan Orth
alan.o...@gmail.com
https://alaninkenya.org
https://mjanja.ch
"In heaven all the interesting people are missing." -Friedrich Nietzsche
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
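The mojibake described in this thread is what typically appears when UTF-8 bytes are decoded as ISO-8859-1. The sketch below reproduces that effect in isolation. It is only an illustration under the assumption that a charset mismatch is the cause; the class and method names are hypothetical and nothing here is DSpace code.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes the bytes with the wrong
    // charset (ISO-8859-1). Each multi-byte character turns into several
    // unrelated characters.
    static String decodeAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Encodes and decodes with the same charset, so the text round-trips.
    static String decodeAsUtf8(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(decodeAsLatin1(original).length()); // 5 characters
        System.out.println(decodeAsUtf8(original).length());   // 4 characters
    }
}
```

On the Java versions current when this thread was written, -Dfile.encoding=UTF-8 set the charset the JVM uses when code does not name one explicitly, which is why the reply suggests adding it to the bin/dspace command line.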
Re: [Dspace-tech] not work, ReIndexing Content for Browse or Search, upgrade 4.1
This is probably the issue: https://jira.duraspace.org/browse/DS-2038

Julio Pemau wrote on 15/07/14 at 11:03:

We are upgrading from version 1.6.2 to version 4.1 of DSpace. We use Oracle as the database, DSpace is installed on Linux and we use JSPUI.

We performed the update to version 4.1 successfully, but the faceted searching and browsing functionalities don't work. We have executed the "dspace index-discovery -b" command and the process completed without error, but faceted search is still not working.

Any tips? Thanks in advance