filetype and filename indexes are not automatically recomputed by bibindex when a bibdoc is deleted
First of all, let me say that I didn't have time to test this thoroughly, so I might be wrong in my findings. It seems that, at least in the master branch, when one deletes ALL bibdocs from a record (say record:97 in the Atlantis site) and runs a search with filetype:a-z, results are still returned for that record when they shouldn't be. (Results are also returned if one searches for the deleted filename in the filename index.) It seems that my /normal/ bibindex task does not see that something has changed in the record. Even if one runs it manually (bibindex -a -i 97 -u admin), all indexes remain the same. If one forces reindexing of the specific record (bibindex -a -i 97 --force -u admin), ALL indexes are recomputed and everything works as expected, but this is not practical on a production system. Should I use a different setting in my everyday bibindex tasklet, or is this a bug? Kind regards, Theodoros
Re: filetype and filename indexes are not automatically recomputed by bibindex when a bibdoc is deleted
Thanks Sam for your prompt reply! On 2/3/2015 5:19 PM, Samuele Kaplun wrote: This is unfortunately a known bug with a partial solution that is not yet ready to integrate. Please see: https://github.com/inveniosoftware/invenio/issues/2448 and the corresponding (WIP) PR: https://github.com/inveniosoftware/invenio/pull/2646 !! I remember bumping into 2646 by chance a few days ago, but I didn't make the connection because the title mentioned 'author-indexing holes'. It seems that deleting a docfile indeed updates the record 'outside of MARC', so I can now see the relation to this issue. A dirty workaround we have implemented in INSPIRE OPS branch is: https://github.com/inspirehep/invenio/commit/aab658404d5187231b1759395d0a3d0f8d2ce6df maybe it can help with your case? (although it will slow BibIndex down a bit). Thanks for the quick'n'dirty workaround. To be honest, it's not a major issue for me right now. I just wanted to report it because it /seems/ important. We have only a relatively small number of deleted bibdocs and I was wondering why I was still getting their records when searching with the filetype index. I can easily find the rec ids of the deleted bibdocs and force a reindexing on these ids only. I will wait until a 'proper' solution is presented to fix this for good (it will hopefully be part of the 1.2 version, no?) Thanks again, Theodoros
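The per-record workaround mentioned above (forcing a reindex only for the records whose bibdocs were deleted) could be scripted roughly as follows. This is only a sketch: the helper name is made up for illustration, the flags are the ones quoted earlier in this thread, and one command is generated per record ID so that nothing is assumed about how bibindex parses lists of IDs.

```python
def build_force_reindex_commands(recids, user="admin"):
    """Return one bibindex argv list per record ID (hypothetical helper).

    The flags mirror the command quoted in the thread:
    bibindex -a -i <recid> --force -u <user>
    """
    commands = []
    for recid in sorted(set(recids)):  # de-duplicate, keep a stable order
        commands.append(
            ["bibindex", "-a", "-i", str(recid), "--force", "-u", user]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in build_force_reindex_commands([97]):
        print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to try on a production box; the output can be reviewed and then submitted by hand.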
Two bibauthorid issues in master
Hello everyone, I'm experiencing the following 'issue' with bibauthorid. (I'm using the latest master branch and the demo site with only the demo records). # sudo -u apache /opt/invenio/bin/bibauthorid -u admin --disambiguate --from-scratch Bibauthorid Task Submission === Username: admin 2015-02-06 11:04:56 -- Task #15 submitted. [root@droopy invenio]# sudo -u apache /opt/invenio/bin/bibauthorid 15 2015-02-06 11:05:02 -- Task #15 started. Process Process-185: Traceback (most recent call last): File /usr/lib64/python2.7/multiprocessing/process.py, line 258, in _bootstrap Process Process-186: Traceback (most recent call last): File /usr/lib64/python2.7/multiprocessing/process.py, line 258, in _bootstrap self.run() self.run() File /usr/lib64/python2.7/multiprocessing/process.py, line 114, in run File /usr/lib64/python2.7/multiprocessing/process.py, line 114, in run self._target(*self._args, **self._kwargs) self._target(*self._args, **self._kwargs) File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_wedge.py, line 70, in wedge File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_wedge.py, line 70, in wedge matr = ProbabilityMatrix(cluster_set.last_name) matr = ProbabilityMatrix(cluster_set.last_name) AttributeError: 'function' object has no attribute 'last_name' AttributeError: 'function' object has no attribute 'last_name' Process Process-187: [...] (lots of similar errors) [...] 
Process Process-367: Traceback (most recent call last): File /usr/lib64/python2.7/multiprocessing/process.py, line 258, in _bootstrap self.run() File /usr/lib64/python2.7/multiprocessing/process.py, line 114, in run self._target(*self._args, **self._kwargs) File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_wedge.py, line 70, in wedge matr = ProbabilityMatrix(cluster_set.last_name) AttributeError: 'function' object has no attribute 'last_name' Process Process-368: Traceback (most recent call last): File /usr/lib64/python2.7/multiprocessing/process.py, line 258, in _bootstrap self.run() File /usr/lib64/python2.7/multiprocessing/process.py, line 114, in run self._target(*self._args, **self._kwargs) File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_wedge.py, line 70, in wedge matr = ProbabilityMatrix(cluster_set.last_name) AttributeError: 'function' object has no attribute 'last_name' 2015-02-06 11:08:15 -- Task #15 finished. [RUNNING] 2015-02-06 11:08:15 -- Unexpected error occurred: local variable 'tortoise_db_name' referenced before assignment. 2015-02-06 11:08:15 -- Unexpected error occurred: local variable 'tortoise_db_name' referenced before assignment. 
2015-02-06 11:08:15 -- Traceback is: 2015-02-06 11:08:15 -- File /usr/lib64/python2.7/site-packages/invenio/bibtask.py, line 610, in task_init 2015-02-06 11:08:15 -- ret = _task_run(task_run_fnc) 2015-02-06 11:08:15 -- File /usr/lib64/python2.7/site-packages/invenio/bibtask.py, line 1173, in _task_run 2015-02-06 11:08:15 -- if callable(task_run_fnc) and task_run_fnc(): 2015-02-06 11:08:15 -- File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_daemon.py, line 157, in _task_run_core 2015-02-06 11:08:15 -- run_tortoise(bool(bibtask.task_get_option(from_scratch))) 2015-02-06 11:08:15 -- File /usr/lib64/python2.7/site-packages/invenio/bibauthorid_daemon.py, line 319, in run_tortoise 2015-02-06 11:08:15 -- insert_user_log(tortoise_db_name, '-1', '', '', '', timestamp=start_time) 2015-02-06 11:08:15 -- Exiting. - One issue seems to be: AttributeError: 'function' object has no attribute 'last_name'. It seems important, but it doesn't break things. - The 'Unexpected error occurred: local variable 'tortoise_db_name' referenced before assignment' error does break the process, and happens only when '--from-scratch' is used. 
Looking at bibauthorid_daemon.py it seems that moving tortoise_db_name and start_time definitions outside the else block solves the problem. Having
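The 'referenced before assignment' failure described above is the usual pattern of a local variable that is assigned in only one branch and then used after the branch. A minimal sketch of the pattern and of the proposed fix follows; the function names and the value given to the variable are invented for illustration and are not taken from bibauthorid_daemon.py.

```python
import time


def run_buggy(from_scratch):
    """Assigns the names only in the else branch, like the reported code path."""
    if from_scratch:
        pass  # from-scratch branch: nothing is assigned here
    else:
        tortoise_db_name = "tortoise_db"  # placeholder value, an assumption
        start_time = time.time()
    # Raises UnboundLocalError when from_scratch is True.
    return tortoise_db_name, start_time


def run_fixed(from_scratch):
    """Moves both assignments outside the branch, as suggested in the message."""
    tortoise_db_name = "tortoise_db"  # placeholder value, an assumption
    start_time = time.time()
    if from_scratch:
        pass  # the from-scratch work would go here
    return tortoise_db_name, start_time
```

Calling run_buggy(True) raises UnboundLocalError, while run_fixed(True) returns normally, which matches the observation that only '--from-scratch' triggers the failure.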
Re: Two bibauthorid issues in master
On 6/2/2015 11:23 AM, Tibor Simko wrote: To add to Sam's message, basically the BibAuthorID module in Invenio/master works mostly for INSPIRE conditions only. With default Invenio settings, it leads to a problem described in detail here: https://github.com/inveniosoftware/invenio/issues/1862 That's it! I was not getting results because of LIMIT_TO_COLLECTIONS = ['HEP'] :) P.S. Would using BibAuthority instead of BibAuthorID be an option on your installation? Well, to my understanding, BibAuthority is THE tool to use if one has the authorities and author IDs already created. On the other hand, if one has a bunch of names and publications and wants to automatically create a quick'n'dirty list and then let users slowly clean up the mess, BibAuthorID may also be useful :) It also gives a nice author page with summaries, etc. I'm also mentioning it because BibAuthorID will be included in the v1.2 release... Even if it doesn't produce the best results for non-INSPIRE sites (which is OK), it wouldn't look good if, in its current form, it /breaks/ for atlantis.cfg... Cheers, Theodoros
Re: Multi-Record Editor question
On 7/3/2014 10:12 AM, Samuele Kaplun wrote: This can be achieved through the use of the new BibCheck module in invenio-master and above. Thanks Sam for the info and the instant reply :) BibCheck looks good enough for my task. [BTW, I'm running master, but from a checkout made before 18/12/2013 (the date this module was introduced), so I didn't have it around, and that's a pity because it looks powerful enough, both for automatic checks and updates]
Multi-Record Editor question
Hello everyone, Is there a way in the Multi-Record Editor to define a subfield action that references a value of another field/subfield? (I'm after updating subfield a/field x and dynamically giving it the value of subfield b/field y.) If this is not possible with the GUI, what is the best way to go about it? (I'm betting on API/python.) Best regards, Theodoros
Exception when having an index with no fields to get data from
Hello everyone, I recently got an AttributeError: 'JsonReader' object has no attribute 'decode' error from the bibindex task. What I did before getting the error (with bibsched in 'manual' mode and no running jobs) was: a) purge all index tables related to index id 20 (authority author) and b) remove the 'authority author' set of fields from the authority author index. The first part is common practice and I don't think it could be related to the error. The second part is the most suspicious, because now the respective index has NO related fields (this is how I want it to be for the time being, until a way to force the tokenizer to deal only with records from the AUTHORITY collection is found). Indeed, if I reconnect the authority author set of fields to the authority author index, things go back to normal. Could you replicate it in a stock/demo Invenio site? If this is confirmed, I would happily create a ticket so that similar situations (indexes with no fields to get data from) are handled properly in the future. The excerpt from the bibedit log is the following: 2014-02-07 12:01:35 -- No new authority records added. 
idxPHRASE18F is up to date 2014-02-07 12:01:35 -- idxPHRASE18F contains 230 words from 87272 records 2014-02-07 12:01:35 -- idxPHRASE18F is in consistent state 2014-02-07 12:01:35 -- idxWORD20F contains 0 words from 0 records 2014-02-07 12:01:35 -- idxWORD20F is in consistent state 2014-02-07 12:01:35 -- idxWORD20F for 204053-204053 is in consistent state 2014-02-07 12:01:35 -- idxWORD20F adding records #204053-#204053 started 2014-02-07 12:01:35 -- Exception caught: 'JsonReader' object has no attribute 'decode' 2014-02-07 12:01:36 -- idxWORD20F normal wordtable flush started 2014-02-07 12:01:36 -- ...updating 0 words into idxWORD20F started 2014-02-07 12:01:36 -- ...updating 0 words into idxWORD20F ended 2014-02-07 12:01:36 -- ...updating reverse table idxWORD20R started 2014-02-07 12:01:36 -- ...updating reverse table idxWORD20R ended 2014-02-07 12:01:36 -- idxWORD20F normal wordtable flush ended 2014-02-07 12:01:36 -- Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/bibtask.py, line 996, in _task_run if callable(task_run_fnc) and task_run_fnc(): File /usr/lib64/python2.6/site-packages/invenio/bibindex_engine.py, line 1435, in task_run_core wordTable.add_recIDs_by_date(task_get_option(modified), task_get_option(flush)) File /usr/lib64/python2.6/site-packages/invenio/bibindex_engine.py, line 821, in add_recIDs_by_date self.add_recIDs(alist, opt_flush) File /usr/lib64/python2.6/site-packages/invenio/bibindex_engine.py, line 757, in add_recIDs just_processed = self.add_recID_range(i_low, i_high) File /usr/lib64/python2.6/site-packages/invenio/bibindex_engine.py, line 885, in add_recID_range new_words = tokenizing_function(record) File /opt/invenio/lib/python/invenio/bibindex_tokenizers/BibIndexAuthorTokenizer.py, line 334, in tokenize_for_words return self.tokenize_for_words_default(phrase) File /opt/invenio/lib/python/invenio/bibindex_tokenizers/BibIndexAuthorTokenizer.py, line 299, in tokenize_for_words_default return 
super(BibIndexAuthorTokenizer, self).tokenize_for_words(phrase) File /usr/lib64/python2.6/site-packages/invenio/bibindex_tokenizers/BibIndexDefaultTokenizer.py, line 80, in tokenize_for_words phrase = wash_for_utf8(phrase) File /usr/lib64/python2.6/site-packages/invenio/textutils.py, line 405, in wash_for_utf8 text.decode(utf-8) AttributeError: 'JsonReader' object has no attribute 'decode' 2014-02-07 12:01:37 -- Task #255 finished but not resubmitted. [CERROR] Cheers, Theodoros Theodoropoulos
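The traceback above ends with .decode() being called on an object that is not a string, which suggests (this is an assumption, not something confirmed in the thread) that the tokenizer is handed the whole record object when the index has no fields attached. A defensive guard could look like the sketch below. The helper is hypothetical, it is not Invenio's wash_for_utf8, and it is written for Python 3 whereas the traceback comes from Python 2.6.

```python
def wash_text_or_skip(value, encoding="utf-8"):
    """Return text for str/bytes input, or None for anything else.

    Hypothetical guard: a caller could skip tokenizing when None is
    returned instead of crashing on a non-string object.
    """
    if isinstance(value, bytes):
        # Undecodable bytes are replaced rather than raising.
        return value.decode(encoding, errors="replace")
    if isinstance(value, str):
        return value
    return None  # e.g. a record object passed by mistake
```

With such a guard an index with no fields would simply contribute no words, which is the behaviour the log already reports for the flush step (0 words updated).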
authority indexes to gather data from authority records only
Hello everyone, and apologies if this has been discussed (or even tackled). The issue I'm having is that authority-related indexes (e.g. authority author) are getting too large, because some of the 'authority' fields that should be indexed coincide with some popular bibliographic fields (such as 500__a, which is see also/personal name in authority records and the note field in bibliographic records), and thus the relevant index fills up with 'garbage'. With that in mind, I think it would be nice to be able to restrict indexing to certain collections only. If this seems like a bad policy (and maybe it is), a check button could instead exist in the bibindexadmin page that could be used to indicate an authority-related index (and thus restrict the data gathered to authority-related records only). Does this seem reasonable? Cheers, Theodoros
Re: authority indexes to gather data from authority records only
On 3/2/2014 1:31 PM, Alexander Wagner wrote: BTW: we used 400 instead of 500 for synonyms. Did we do this one wrong? I don't think you did anything wrong there. 400 is 'see' and 500 is 'see also' for authorities. For alternative names 400 seems more proper, although both can be used without problem. (I think that 400 is mainly used if you want to directly point to another authority that deals with the same person, i.e. a name variant, whereas 500 is used when you want to point to a relative or a more generic term/name.) Anyway, I still have 500 because this is the default value for the 'demo' site :) You see, I still don't have any usable authorities from an 'authority' collection (it's in the TODO list and I'm waiting for this feature to get more mature before I use it in production). Having said that, I'm keeping the default indexes from the latest master for better future compliance and so that I'll be able to test stuff :) Cheers, Theodoros
Re: authority indexes to gather data from authority records only
On 3/2/2014 2:47 PM, Alexander Wagner wrote: Anyway, I still have 500 because this is the default value for the 'demo' site :) Ah. It could be that when Chris did this stuff there were some pending issues in discussion or just a typo as well. Clarification: I have BOTH 400a and 500a for the authority author index (so Chris did a wonderful job including them both), but it's the 500a that causes garbage in the relevant tables because it's heavily used in other bibliographic records (while 400a is not).
webstatadmin/custom events question
Hello everyone, While trying to see how custom statistics work in Invenio, I was wondering what the 'allowed' param values in webstat.cfg are. The reason I'm asking the list is that after 'loading' the default webstat.cfg with --load-config [*], the 'doctype' column in the staEVENT04 table gets populated with the translated doctype long name values (doctype_lname) instead of the doctype 'code' that one would expect... Cheers, Theodoros [*] webstat_custom_event_4 is defined as: [webstat_custom_event_4] name = websubmissions param1 = doctype
IndexError: string index out of range [in external collections when certain conditions apply]
Hello everyone, I'm recently seeing this error a lot: ** Traceback details Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 424, in _handler return root._traverse(req, path, False, guest_p) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 252, in _traverse result = _check_result(req, obj(req, form)) File /usr/lib64/python2.6/site-packages/invenio/websearch_webinterface.py, line 525, in __call__ out = perform_request_search(req, **argd) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 5282, in perform_request_search return prs_perform_search(kwargs=kwargs, **kwargs) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 5293, in prs_perform_search return prs_search(kwargs=kwargs, **kwargs) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 5465, in prs_search output = prs_search_common(kwargs=kwargs, **kwargs) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 6214, in prs_search_common prs_search_hosted_collections(kwargs=kwargs, **kwargs) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 5702, in prs_search_hosted_collections (hosted_colls_results, hosted_colls_timeouts) = calculate_hosted_collections_results(req, [p, p1, p2, p3], f, hosted_colls, verbose, ln, CFG_HOSTED_COLLECTION_TIMEOUT_ANTE_SEARCH) File /usr/lib64/python2.6/site-packages/invenio/websearch_external_collections.py, line 348, in calculate_hosted_collections_results verbosity_level) File /usr/lib64/python2.6/site-packages/invenio/websearch_external_collections.py, line 381, in calculate_hosted_collections_search_params basic_search_units = create_basic_search_units(None, pattern, field) File /usr/lib64/python2.6/site-packages/invenio/search_engine.py, line 801, in create_basic_search_units if f and p[0] == '' and p[-1] == '': IndexError: string index out of range I've noted that the problem arises only when 
all the following are true: - Externally hosted collections - The f in the search uri has a value (ie f=title, f=year, etc). The same searches in the 'any field' index work without errors. - The search-for value must be empty - Search is performed from within the external collection (see below for explanation) Steps to replicate: 1. Go inside an externally hosted collection (do NOT just click search in the top collection and then select the external collection from the dropdown menu, as the error does NOT(!) appear). 2. Select a search index other than 'any field' 3. Leave the search-for term, empty 4. Hit search Can you verify my findings? If so, I should probably create a ticket... Best regards, Theodoros Theodoropoulos ps. btw, I don't think it's a timeout issue because the same search in the other invenio server that hosts the collection returns results almost instantly.
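The IndexError in the traceback above comes from indexing p[0] and p[-1] on an empty pattern. A guard that never indexes into an empty string could look like the sketch below. It assumes that the comparison in create_basic_search_units was originally against a quote character (the literal appears to have been stripped from the pasted traceback); the helper name is invented.

```python
def is_quoted_phrase(p, quote='"'):
    """True if p starts and ends with the quote character.

    str.startswith/endswith return False on an empty string instead of
    raising, unlike p[0] and p[-1]. The length check also rejects a
    pattern made of a single quote character, which is a deliberate
    difference from a plain p[0] == p[-1] comparison.
    """
    return len(p) >= 2 and p.startswith(quote) and p.endswith(quote)
```

An empty search term with f set would then fall through to the normal (unquoted) handling instead of raising.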
strange exception in Make_Dummy_MARC_XML_Record
Hello everyone, For some strange reason, I got this exception with Make_Dummy_MARC_XML_Record websubmit function. I don't know if the produced dummy_marcxml_rec is used somewhere (ie in the approval process), but I included it anyway. FYI, this is the first time this exception is thrown (after several successful submissions), the dummy_marcxml_rec file exists in the record directory, is valid xml and readable/writable by apache, so I don't know why I get it... * 2013-12-18 12:40:43 - InvenioWebSubmitFunctionError: Error: Unable to create dummy MARC XML record [/opt/invenio/var/data/submit/storage/running/IKEEART/1387362462_5686/dummy_marcxml_rec]. Bibconvert reported no error, but the record was unreadable later. (Make_Dummy_MARC_XML_Record.py:120:Make_Dummy_MARC_XML_Record) [...] ** Traceback details Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/websubmit_engine.py, line 1125, in endaction ln=ln) File /usr/lib64/python2.6/site-packages/invenio/websubmit_engine.py, line 1736, in print_function_calls func_returnval = eval(function(parameters=parameters, curdir=curdir, form=form, user_info=user_info), the_globals) File string, line 1, in module File /opt/invenio/lib/python/invenio/websubmit_functions/Make_Dummy_MARC_XML_Record.py, line 120, in Make_Dummy_MARC_XML_Record raise InvenioWebSubmitFunctionError(err_msg) InvenioWebSubmitFunctionError: Error: Unable to create dummy MARC XML record [/opt/invenio/var/data/submit/storage/running/IKEEART/1387362462_5686/dummy_marcxml_rec]. Bibconvert reported no error, but the record was unreadable later. ** Stack frame details Frame Make_Dummy_MARC_XML_Record in /opt/invenio/lib/python/invenio/websubmit_functions/Make_Dummy_MARC_XML_Record.py at line 120 --- 117 Bibconvert reported no error, but the record was \ 118 unreadable later. 
% (curdir, CFG_WEBSUBMIT_DUMMY_XML_NAME) 119 register_exception(req=req_obj, prefix=err_msg) 120 raise InvenioWebSubmitFunctionError(err_msg) 121 122 # Escape XML-reserved chars and clean the unsupported ones (mainly 123 # control characters) --- [...] If it doesn't seem important to you (or doesn't ring any bells) I'll ignore it and move on. Cheers, Theodoros
mailutils.send_mail with multiple recipients sends mail only to the first recipient
Hello all, Strange as it might seem, I'm trying to send an email from a webfunction to two recipients and it's only delivered to the first one! I tried the same addresses using mailutils in ipython: from invenio import mailutils mailutils.send_email([serverurl],[email1],[email2],subject=test,content=test content, debug_level=9) and I see that it's only sent to the FIRST email address! (BTW, both emails are valid and working :) ) Having said that, when I enable CFG_DEVEL_SITE=1 and run THE SAME command, as an admin I get an email that shows both addresses, so it's not an issue of splitting the email strings etc., but something deeper... (maybe python-related?) Can you verify it too? It seems a bit strange for this to be a bug and not to have been reported by anyone else; on the other hand, I cannot find anything wrong with my tests... Hmmm... Best regards, Theodoros
Re: mailutils.send_mail with multiple recipients sends mail only to the first recipient
On 5/12/2013 7:06 PM, Theodoros Theodoropoulos wrote: from invenio import mailutils mailutils.send_email([serverurl],[email1],[email2],subject=test,content=test content, debug_level=9) and I see that it's only sent to the FIRST email address! (BTW, both emails are valid and working :) ) FYI, the SAME code above works without problems in the invenio 0.99.x branch!! If you can verify my findings, I think I should create a ticket with 'critical' priority (because mails could be lost if this is not fixed). (I also tried using a list of strings in the replyto, with the same results.) Cheers, Theodoros
Re: mailutils.send_mail with multiple recipients sends mail only to the first recipient
Correction: (I also tried using list of strings in the TOADDR, with the same results) meaning: mailutils.send_email(serverurl, [email1,email2], subject=test, content=test content, debug_level=9)
Re: mailutils.send_mail with multiple recipients sends mail only to the first recipient [FOUND IT]
I found the culprit! It's remove_temporary_emails()! Although the function works as expected (I tried it), it changes 'toaddr' from a list of strings to a single string. Here's the code: if type(toaddr) is str: toaddr = toaddr.strip().split(',') Up to here toaddr is definitely a list of strings... toaddr = remove_temporary_emails(toaddr) Here it's a simple string. Then things start to not work as expected. If I comment out remove_temporary_emails, everything works as expected. I'll post a ticket. remove_temporary_emails should return a list of strings. On 6/12/2013 8:39 AM, Theodoros Theodoropoulos wrote: Correction: (I also tried using list of strings in the TOADDR, with the same results) meaning: mailutils.send_email(serverurl, [email1,email2], subject=test, content=test content, debug_level=9)
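The fix proposed above (keep 'toaddr' a list of strings all the way through) can be sketched as follows. Both helpers are hypothetical stand-ins, not the mailutils implementation, and the is_temporary predicate is an assumption used only to make the example self-contained.

```python
def normalize_recipients(toaddr):
    """Accept a comma-separated string or a list; always return a list."""
    if isinstance(toaddr, str):
        toaddr = toaddr.split(",")
    # Strip whitespace and drop empty entries.
    return [addr.strip() for addr in toaddr if addr.strip()]


def drop_temporary(toaddr, is_temporary):
    """Filter out temporary addresses while preserving the list type."""
    return [a for a in normalize_recipients(toaddr) if not is_temporary(a)]
```

Whatever filtering is applied, returning a list keeps every later step (which iterates over recipients) working for more than one address.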
(simple?) search question
Hello everyone, I have this situation: Say I have the following marc: 700__ $$alastname1, firstname1$$eauthor 700__ $$alastname2, firstname2$$eeditor and i want to find all records where author is lastname1, firstname1 and THIS PERSON is editor Provided that i have an author index with 100/700 a and e logical fields, if I use the search query author:editor and author:lastname1, firstname1, the above record will be returned, but it shouldn't. So, briefly, what I need is a type of sub-search in the same field as the first argument... Is there a way (using regular expressions or a trick/hack) to make this work and get some proper results? Thanks in advance, Theodoros
Re: (simple?) search question
On 3/12/2013 4:23 PM, Tibor Simko wrote: On Tue, 03 Dec 2013, th...@physics.auth.gr wrote: So, briefly, what I need is a type of sub-search in the same field as the first argument... Is there a way (using regular expressions or a trick/hack) to make this work and get some proper results? Right now you'd have to make a second pass: firstly, retrieve results as you do now (including false positives); secondly, pass through results calling field instance filtering function that would pick only the fields having wanted another subfield. WRT filtering, see: http://invenio-software.org/ticket/1550 and the related commits. So, while this is doable, and the master branch has ingredients ready, there is some code to write and run after your searches, which may be too slow in case there is a huge number of results returned in the first step before the filtering step is called. Luckily(?), I'm running master, so all the functions are there, and it's good to know that there is a programmatic way to make it work; however, in my case I need to provide a static search string (to pass to an external site), so that only certain records get returned to their xml parser and displayed. BTW just yesterday we thought about adding a special index for a similar searching need, author-at-institute, where a user would type: ellis@CERN and the query would return all records where 700 $a contained `ellis' with the corresponding 700 $u containing `cern'. So your email is coming very timely :) P.S. The post-filtering technique could be generalised into a new second-order search operator, say @, so that all the following queries would be legal: author:ellis@affiliation:CERN author:ellis@relator:editor title:of@provenance:arXiv Wow! The second-order search operation is exactly what I need! This is indeed a very powerful option to have around! (just imagine being able to use the same idea to create an n-th order search operator!) Thanks for the prompt reply, Best regards, Theodoros ps. 
I suppose that this enhancement will require a lot of new code in several modules... Let's hope that I'll be able to patch my installation if/when committed to master...
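The second pass described in this thread (keep only records where the SAME field instance carries both wanted subfields) can be illustrated with plain Python data. This is only a sketch: the data layout (a list of field instances, each a list of (code, value) pairs) and the function name are assumptions for illustration, and none of it is the field-instance filtering code referenced in ticket 1550.

```python
def has_matching_instance(field_instances, wanted):
    """True if one single field instance contains every wanted subfield.

    field_instances: list of instances, each a list of (code, value) pairs.
    wanted: dict mapping subfield code to the required value.
    """
    for subfields in field_instances:
        present = set(subfields)
        if all((code, value) in present for code, value in wanted.items()):
            return True
    return False


# The two 700 fields from the original question.
record_700 = [
    [("a", "lastname1, firstname1"), ("e", "author")],
    [("a", "lastname2, firstname2"), ("e", "editor")],
]
```

For the record in the question, asking for a='lastname1, firstname1' together with e='editor' finds no single instance with both values, so the false positive from the first pass is dropped.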
Re: External collection items not counted in the parent collection in websearch
Thanks Nikos, I applied it already and I can verify that it's working! On 2/12/2013 9:58 ??, Nikos Kasioumis wrote: A patch has been provided for this issue http://invenio-software.org/ticket/1651 and it has been merged in http://invenio-software.org/repo/invenio/commit/?id=6be1b3d522ae94383710fed1d29ae8122ac21b8f.
How to find records with fulltexts? (Introduction of fulltextcount index?)
Hello everyone, I'm looking for an easy way to explain to a librarian how to get a list of records that have (-not) fulltext(s). (I remember something similar being discussed a while ago with Ferran and Alexander). Searching for: 8564_:[CFG_SITE_URL]* gives me all records with local urls, but it's not very pretty. [btw, replacing 8564_ with fulltext gives no results]. Maybe a 'special' fulltextcount index (like authorcount) would be useful... Having said that, I suppose I would need a custom tokenizer for that, no? Also, if possible, please add a hint in the search-guide help page (in the span-queries paragraph maybe?) about ways to check the existence (or not) of a marc (sub)field, as this seems to be a common request by librarians. Mostly for my reference, such an example would be: Show records that have ie a year field: and year:0-Z Show records that DO NOT have ie a year field: and not year:$-~ (this trick doesn't work for the 'fulltext' index. 8564_ should be used instead.) Best regards, Theodoros
External collection items not counted in the parent collection in websearch
Hello all, I'm not sure if anyone actively uses externally hosted collections, but I verified that items in such collections are not counted in the parent collection(s) in websearch. I'm not interested in nested external collections (I think that this is impossible anyway), I just want to have a proper sum of records even if the (first) child collection is external. See the attached image for an example. The 'external' server is also an invenio instance and everything else works properly. Is it possible to make this work? Should I create a ticket? Cheers, Theodoros
Re: Bibupload/FFT only, replacing Marc records
Thank you Sam for the clarification (btw, I 'fixed' my records by just re-uploading[1] the marc, so no problem there). Bibdocfile is a powerful module and definitely my best friend in this case; however, I do find the bibupload documentation a bit misleading on this matter. As bibupload is a module widely used by people, and as this issue could potentially lead to temporary loss of marc data (you don't always have the xml of the records to 'fix' them, and reverting all of them to a previous revision is a bit 'tricky'), I will submit a ticket (of 'trivial' priority) so that, when the time for documentation updates comes, you also update the relevant part of the otherwise clear and extensive bibupload documentation. Cheers, Theodoros [1] Using bibedit list-revisions followed by --revert-to-revision for these records would be a 'cleaner' alternative, but I also took the opportunity to do some trivial batch changes to the marc, so reuploading was a better solution in this case. On 15/11/2013 10:01 PM, Samuele Kaplun wrote: On Friday 15 November 2013 18:14:57, Theodoropoulos Theodoros wrote: Hello everyone, Today I ran a bibupload -n -r xxx.xml job with only FFT tags in order to enrich some of my records with fulltexts. Everything went fine (no errors/no warnings) but the affected records got replaced entirely by 3 tags (001/005/856), wiping out everything else. Shouldn't 'replace' mode (when used only with FFT tags) be smart enough not to touch (replace) the bibs? Reading the bibupload docs, I believed that I was allowed to use 'replace' mode with files... Is this a known behavior (probably even fixed in a personal branch), or did I hit another bug? ;) Hi! It's a feature ;-) not a bug (OK maybe it's a bug in the documentation)? bibupload --replace is really a replace, no matter what you think: if you pass an empty record with FFTs it will drop the existing records (and all the bibdocs attached) and replace with the empty record, executing the FFTs. 
If you want to just drop the files you can use the bibdocfile CLI first. Alternatively you can send a bibupload --correct (with an empty record, thus correcting nothing, and FFTs, thus correcting the corresponding bibdocs.) Cheers! Sam
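The 'empty record plus FFTs' input suggested above for bibupload --correct could be generated along these lines. It is a simplified sketch: only the 001 control field and a single FFT field with subfield a (the file location) are emitted, the surrounding collection element and MARCXML namespace are left out, and the record ID and file path in the usage line are made up.

```python
import xml.etree.ElementTree as ET


def fft_only_record(recid, file_location):
    """Build a minimal record holding just the record ID and one FFT field."""
    record = ET.Element("record")
    controlfield = ET.SubElement(record, "controlfield", tag="001")
    controlfield.text = str(recid)
    datafield = ET.SubElement(record, "datafield", tag="FFT", ind1=" ", ind2=" ")
    subfield = ET.SubElement(datafield, "subfield", code="a")
    subfield.text = file_location  # path or URL of the fulltext to attach
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical record ID and path, for illustration only.
    print(fft_only_record(97, "/tmp/fulltext.pdf"))
```

Because the record carries no bibliographic tags, uploading it in correct mode leaves the existing MARC alone, whereas the same file in replace mode would wipe it, as described in this thread.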
Important change in recent shibd.conf that breaks invenio/shibboleth interaction
I would like to share an issue I experienced today that is likely to come up with any new invenio/shibboleth installation. It seems that in recent shibboleth rpms (at least the 2.5.x ones) there is a new directive that goes by default into shibd.conf and breaks how invenio interacts with the shibd daemon. The 'suspicious' new code in shibd.conf is the following: <Location /Shibboleth.sso> Satisfy Any Allow from all </Location> A simple test to replicate the problem (provided that you have at least set up invenio-apache-vhost-ssl.conf to use shibboleth) is to check a link that should normally be handled by shibboleth. For example: /Shibboleth.sso/Login The symptom is that the relevant part of invenio-apache-vhost-ssl.conf is overridden for the /Shibboleth.sso/ urls and hence they are not handled properly... Thus, instead of logging in, you get a 404 error from Invenio. The solution is to remove (or comment out) the relevant 4 lines from shibd.conf and restart apache. In case you verify my findings, you might want to put a note in the HowToIntegrateWithShibboleth wiki entry... Best regards, Theodoros Theodoropoulos ps. Although it's simple to understand once you see it, it took me ~4 hours to rule out all other possible shibboleth/invenio/wsgi/etc configuration problems and start looking at the basic apache configurations line-by-line...
Re: Exception in bibrank (citerank_citation_t) with empty citations table
Hello Tibor, Are there really no citations to read? (1) If you don't have any citations on your system, then this simply means that the ranking daemon is not ready for a situation with zero citations. We used to have such a problem in the past: http://invenio-software.org/ticket/499 Maybe it is back. In this case you may want to inactivate this ranking method until we check and fix. I'm pretty sure that there are no citations in my system (at least not at this point). So, if you test in a dummy Invenio site with a few demo records with no citations (the Poetry collection maybe?) and verify my findings, it might be a bug. Having said that, I won't be needing citation-based ranking in the near future (or at least until I put some useful citation data), and I would be happy to inactivate it, but how? Cheers, Theodoros
BibIndex VERY slow for itemcount and filetype indexes and Atlas records
(Following Sam's suggestion I'm re-sending this to the dev list.) I'm currently experiencing VERY slow bibindex times (~2-3min/rec) for the itemcount and filetype indexes[1] on records that have many authors (for example Atlas-related experiments). I'm running the latest master. Have you noticed it too at CERN? I understand that, as Esteban mentioned, these indexes are very complex and are computed on the fly, but the time spent on them still makes re/indexing unusable. I could always re-enable CFG_BIBUPLOAD_SERIALIZE_RECORD_STRUCTURE and maybe speed things up a little, but I was trying to save some space in the DB :) I don't want to delete the index, in order to be on par with other installations, and I only have ~200 such records, but as you understand it takes several hours to index them! (If you experience this issue as well, with all these experiments going on, it will take forever to re/index INSPIRE regardless of all the computing power you put on it...). Is there anything that I could do to make it run a bit faster? I was also wondering whether, while you're optimizing the algorithms (and only if bibfield is not used elsewhere), I could tweak (i.e. empty) bibfield.cfg to speed things up. Cheers, Theodoros [1] The itemcount index is bearable (maybe because I'm not using circulation and hence I don't have items?), the filetype takes forever...
Re: Exception in bibrank (citerank_citation_t) with empty citations table
On 12/11/2013 5:17 μμ, Tibor Simko wrote: The UI way is to use the BibRank Admin Interface: https://localhost/admin/bibrank/bibrankadmin.py there is an option to `Delete' a ranking method. Oh, I was disabling citation and citerank_citation_t (for all my collections) but I suppose this isn't enough... I will try delete. I'm a newbie in BibRank and I never really touched the default settings... As an additional feedback, bibrank -S produces the following: 2013-11-12 18:15:27 -- Running rank method: times cited. 2013-11-12 18:15:27 -- No new records added since last time method was run 2013-11-12 18:15:27 -- Showing statistic for selected method 2013-11-12 18:15:27 -- Method name: times cited 2013-11-12 18:15:27 -- Short name: citation 2013-11-12 18:15:27 -- Last run: 2013-11-12 08:06:54 2013-11-12 18:15:27 -- Number of records: 10 2013-11-12 18:15:27 -- Lowest value: 1 - Number of records: 5 2013-11-12 18:15:27 -- Highest value: ('', -99) - Number of records: 0 2013-11-12 18:15:27 -- Divided into 10 sets: 2013-11-12 18:15:27 -- Exception caught: can only concatenate tuple (not int) to tuple 2013-11-12 18:15:27 -- Exception caught: can only concatenate tuple (not int) to tuple 2013-11-12 18:15:27 -- Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/bibrank_tag_based_indexer.py, line 382, in bibrank_engine rank_method_code_statistics(rank_method_code) File /usr/lib64/python2.6/site-packages/invenio/bibrank_tag_based_indexer.py, line 309, in rank_method_code_statistics lower = -1.0 + ((float(max + 1) / 10)) * (i - 1) TypeError: can only concatenate tuple (not int) to tuple 2013-11-12 18:15:27 -- Exception caught: 2013-11-12 18:15:27 -- Exception caught: 2013-11-12 18:15:27 -- Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/bibrank.py, line 163, in task_run_core func_object(key) File /usr/lib64/python2.6/site-packages/invenio/bibrank_tag_based_indexer.py, line 472, in citation return bibrank_engine(run) 
File /usr/lib64/python2.6/site-packages/invenio/bibrank_tag_based_indexer.py, line 398, in bibrank_engine raise StandardError StandardError 2013-11-12 18:15:27 -- Task #158 finished. [ERROR] But should(?) probably not worry you too much as it might be related to the fact that bibrank did not finish properly...
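The TypeError in the log above can be reproduced outside Invenio. The sketch below assumes (it is only an inference from the "Highest value: ('', -99)" line) that the variable called max held that tuple when the statistics were computed; the guard function is hypothetical and is not the fix that was applied upstream.

```python
def bucket_lower_bound(max_value, i):
    """Same arithmetic as the failing line quoted in the traceback."""
    return -1.0 + (float(max_value + 1) / 10) * (i - 1)


def safe_bucket_lower_bound(max_value, i):
    """Hypothetical guard: return None when there is no numeric maximum."""
    if not isinstance(max_value, (int, float)):
        return None
    return bucket_lower_bound(max_value, i)


if __name__ == "__main__":
    try:
        bucket_lower_bound(("", -99), 1)  # tuple + int cannot be added
    except TypeError as exc:
        print("reproduced:", exc)
    print(safe_bucket_lower_bound(("", -99), 1))
    print(safe_bucket_lower_bound(9, 2))
```

With a numeric maximum the arithmetic works; with the placeholder tuple it fails exactly where the log says, which is consistent with the statistics step not expecting an empty ranking.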
Re: Exception in bibrank (citerank_citation_t) with empty citations table
On 12/11/2013 6:26 PM, Alessio Deiana wrote: It is complaining that you have no citations indexed by bibrank. Therefore ranking by citation won’t work. Did you run bibrank -w citation ? Or maybe your local records don’t have any valid ref/cit Well... this is my point: when no citations are present (which applies to my records), this exception pops up... This is the first time bibrank is run and I didn't specify any particular ranking method (so citation was executed too). I have now DELETED (as opposed to disabling them for all my collections) the citerank_citation_t, citerank_pagerank_c and citerank_pagerank_t methods (keeping backups of the cfg files) and the remaining ranking methods seem to execute smoothly (fingers crossed).
Re: argument of type 'NoneType' is not iterable for filetype/itemcount indexes and deleted records [SOLVED]
Thanks Esteban, with the patch the [re]indexing works :) Cheers, Theodoros On 11/11/2013 3:23 PM, Esteban J. G. Gabancho wrote: I was able to reproduce the error on my machine and I found what was causing it; I attach a very small patch. Once you apply it, could you please run: inveniocfg --load-bibfield-conf dbexec echo DELETE FROM bibfmt WHERE format='recjson' Cheers, Esteban.
Error when reverting a deleted record from Bibedit (gui)
In bibedit, I tried to revert a DELETED record and got the following error: 2013-11-11 15:07:57 -- Task #131 started. 2013-11-11 15:07:57 -- Input file '/opt/invenio/var/tmp-shared/bibedit-cache/bibedit_record_2_1.xml', input mode 'replace'. 2013-11-11 15:07:57 -- -ERROR: Missing 005 - 2013-11-11 15:07:57 -- Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/bibtask.py, line 984, in _task_run if callable(task_run_fnc) and task_run_fnc(): File /usr/lib64/python2.6/site-packages/invenio/bibupload.py, line 2912, in task_run_core results_for_callback=results_for_callback) File /usr/lib64/python2.6/site-packages/invenio/bibupload.py, line 2837, in bibupload_records tmp_vers = tmp_vers) File /usr/lib64/python2.6/site-packages/invenio/bibupload.py, line 332, in bibupload submit_ticket_for_holding_pen(rec_id, err, Missing 005. Inserting record into holding pen.) File /usr/lib64/python2.6/site-packages/invenio/bibupload.py, line 741, in submit_ticket_for_holding_pen bibcatalog_system.ticket_submit(subject=%s: %s by %s % (msg, rec_id, user), recordid=rec_id, text=text, queue=CFG_BIBUPLOAD_CONFLICTING_REVISION_TICKET_QUEUE, owner=uid) File /usr/lib64/python2.6/site-packages/invenio/bibcatalog_system_email.py, line 88, in ticket_submit ownerset = owner: + escape_shell_arg(owner) + '\n' File /usr/lib64/python2.6/site-packages/invenio/shellutils.py, line 298, in escape_shell_arg raise TypeError(msg) TypeError: ERROR: escape_shell_arg() expected string argument but got '1L' of type 'type 'long''. 2013-11-11 15:07:57 -- Task #131 finished. [CERROR] FYI, /opt/invenio/var/tmp-shared/bibedit-cache/bibedit_record_2_1.xml DOES have a 005 field, which is: <controlfield tag="005">20130514094613.0</controlfield> Any ideas? Cheers, Theodoros
Exception in bibrank (citerank_citation_t) with empty citations table
Hello everyone, It seems that when there are no citations in a site's records, the bibrank job (more specifically the citerank_citation_t method) produces an exception and the whole bibrank job stops. This is part of the relevant log: 2013-11-12 08:11:49 -- size of reversedict 10 2013-11-12 08:11:49 -- size of citationdict 10 2013-11-12 08:11:49 -- size of selfcitedbydict 10 2013-11-12 08:11:49 -- size of selfcitdict 10 2013-11-12 08:11:49 -- Total time of get_citation_weight(): 237.47 sec 2013-11-12 08:11:49 -- No need to update the indexes for citations. 2013-11-12 08:11:49 -- Running rank method: citerank_citation_t 2013-11-12 08:11:49 -- Error while extracting citation data from rnkCITATIONDATA table 2013-11-12 08:11:49 -- Error: No citations to read! 2013-11-12 08:11:49 -- Error: No citations to read! 2013-11-12 08:11:49 -- Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/bibtask.py, line 984, in _task_run if callable(task_run_fnc) and task_run_fnc(): File /usr/lib64/python2.6/site-packages/invenio/bibrank.py, line 163, in task_run_core func_object(key) File /usr/lib64/python2.6/site-packages/invenio/bibrank_citerank_indexer.py, line 782, in citerank raise Exception Exception FYI, I have not changed the default citerank_citation_t.cfg Is this something known and maybe(?) fixed in a personal branch or should I create a ticket? Cheers, Theodoros Theodoropoulos
Re: bibcatalog_system_email exception in BibEdit
Only for historical reasons (and while waiting for the 'INSPIRE mega-branch' soon-to-be-merged to master), I managed to easily revert this behavior by commenting out createReq({recID: gRecID, requestType: 'getTickets'}, onGetTicketsSuccess); in bibedit_engine.js Ok, it's not really a 'fix', but it's the best one can do while the relevant function is not implemented in bibcatalog_system_email.py :) Cheers, Theodoros On 4/11/2013 11:26 πμ, Samuele Kaplun wrote: Hi Theodoros, In data lunedì 4 novembre 2013 11:23:20, Theodoros Theodoropoulos ha scritto: It seems that in the master branch (at least a few commits back) if one tries to edit any record (even from the demo site), the following exception is thrown: uri: /record/edit/? ** Traceback details Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py, line 507, in application ret = invenio_handler(req) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 362, in _profiler return _handler(req) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 424, in _handler return root._traverse(req, path, False, guest_p) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 239, in _traverse return obj._traverse(req, path, do_head, guest_p) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 252, in _traverse result = _check_result(req, obj(req, form)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_webinterface.py, line 144, in index json_data)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_engine.py, line 406, in perform_request_ajax response.update(perform_request_bibcatalog(request_type, recid, uid)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_engine.py, line 1289, in perform_request_bibcatalog status=['new', 'open'], recordid=recid) File /usr/lib64/python2.6/site-packages/invenio/bibcatalog_system_email.py, line 58, in ticket_search raise 
NotImplementedError NotImplementedError btw, BibCatalog configuration in invenio-local.conf has the default values and I'm not using RT (so CFG_BIBCATALOG_SYSTEM = EMAIL and so on). I'm not using any new/fancy stuff, so I wonder why the NotImplementedError... Should I open a ticket, did I misconfigure something, or should I simply wait for this feature to be implemented? I'm running v1.1.2.513-53f4 (a few commits back, but I will try to compile and use the latest one). I know that the master branch is a bit experimental (and at some points not very advisable to put into production), but I will be needing authorities in this system, so maint-1.1 was not an option and I cannot wait for 1.2... Would you consider the current master a bit too risky/unstable to use in production? We discovered this bug also when applying the master branch to the INSPIRE project. A fix is coming soon... Cheers! Sam
bibcatalog_system_email exception in BibEdit
Good morning, It seems that in the master branch (at least a few commits back) if one tries to edit any record (even from the demo site), the following exception is thrown: uri: /record/edit/? ** Traceback details Traceback (most recent call last): File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py, line 507, in application ret = invenio_handler(req) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 362, in _profiler return _handler(req) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 424, in _handler return root._traverse(req, path, False, guest_p) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 239, in _traverse return obj._traverse(req, path, do_head, guest_p) File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py, line 252, in _traverse result = _check_result(req, obj(req, form)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_webinterface.py, line 144, in index json_data)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_engine.py, line 406, in perform_request_ajax response.update(perform_request_bibcatalog(request_type, recid, uid)) File /usr/lib64/python2.6/site-packages/invenio/bibedit_engine.py, line 1289, in perform_request_bibcatalog status=['new', 'open'], recordid=recid) File /usr/lib64/python2.6/site-packages/invenio/bibcatalog_system_email.py, line 58, in ticket_search raise NotImplementedError NotImplementedError btw, BibCatalog configuration in invenio-local.conf has the default values and I'm not using RT (so CFG_BIBCATALOG_SYSTEM = EMAIL and so on). I'm not using any new/fancy stuff, so I wonder why the NonImplementedError... Should I open a ticket, did I misconfigure something or should I simply wait for this feature to be implemented? I'm running v1.1.2.513-53f4 (a few commits back, but I will try to compile and use the latest one). 
I know that the master branch is a bit experimental (and at some points not very advisable to put into production), but I will be needing authorities in this system, so maint-1.1 was not an option and I cannot wait for 1.2... Would you consider the current master a bit too risky/unstable to use in production? Best regards, Theodoros Theodoropoulos
invenio_2013_08_22_hstRECORD_affected_fields.py minor issue
invenio_2013_08_22_hstRECORD_affected_fields upgrade recipe executes: ALTER TABLE hstRECORD ADD COLUMN affected_fields text NOT NULL default '' AFTER job_details The upgrade script gives a warning that BLOB and TEXT columns cannot have DEFAULT values. Although probably not important to the table's functionality, you could try removing/changing default '' to suppress the warning... Cheers, Theodoros Theodoropoulos
Re: bibupload: ERROR: Invalid Revision (with 'fake' record replacement procedure)
On 31/10/2013 5:46 μμ, Samuele Kaplun wrote: During a migration phase to a system running the latest 1.1.x and using the usual recipe [copy bibrec IDs along with creation/modification times between systems and then 'replace'(insert) records with bibupload -n -r xxx.xml], I get the following error: -ERROR: Invalid Revision - 'INVALID REVISION : 20130930185930 for Record 250733 not in Archive.' This is kind of strange as the revision should always ends with a .0... Hmmm... I verified that the xml contains a 'proper' 005 (that ends with a .0). Also, the code in bibupload_revisionverifier.py seems correct. Maybe it's just a display issue? If so, this does the trick: --- /opt/invenio/lib/python/invenio/bibupload_revisionverifier.py 2013-10-21 08:01:29.463652013 +0300 +++ /opt/invenio/lib/python/invenio/bibupload_revisionverifier.py.test 2013-11-01 08:59:28.510219744 +0200 @@ -361,7 +361,7 @@ r_date = upload_rev.split('.')[0] if r_date not in [k[1] for k in get_record_revisions(self.rec_id)]: -raise InvenioBibUploadInvalidRevisionError(self.rec_id, r_date) +raise InvenioBibUploadInvalidRevisionError(self.rec_id, upload_rev) else: raise InvenioBibUploadMissing005Error(self.rec_id) This check obviously relates to tag 005 and did not exist in previous releases. Is there a flag to override this behavior and have the records replaced (inserted) anyway? (Or should i put all 005s in some table prior to import?) Nope. And actually it's strange that you have a revision that does not exist in the history of records... Well, In this particular case, its normal. See, that I'm only 'faking' bibrecs by just manually creating entries in bibrec. Nothing more. And the idea is to upload records from another invenio instance, so that timestamps and bibrec IDs are the same across systems (I didn't bother messing around with hstRECORD, as I thought bibupload would update it during import). Indeed dropping the 005 would do the trick... Indeed. Tried it and it works. 
But if you can ticketize this with more details we can work on a real solution :-) I would be happy to, but there might be nothing wrong here... Unless of course we're talking about a feature request for a new flag in bibupload (i.e. --skip-005) to properly handle (skip) 005s in input records... If you're into this, you could implement a whole set of 'skips' (like --skip-001) or, even better, a more generic --skip-tag=XXX,YYY. Sounds interesting? Cheers, Theodoros
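Until something like a --skip-tag option exists, the "drop the 005" workaround can be applied to the input file before upload. This is only a sketch under stated assumptions: the function name is invented, it handles controlfields only, and it has been written against plain MARCXML with or without the MARC21 slim namespace.

```python
import xml.etree.ElementTree as ET


def drop_controlfields(marcxml, tags=("005",)):
    """Return the MARCXML with the given controlfield tags removed."""
    root = ET.fromstring(marcxml)
    # Materialise the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            local_name = child.tag.rsplit("}", 1)[-1]  # strip any namespace
            if local_name == "controlfield" and child.get("tag") in tags:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<record>"
        '<controlfield tag="001">250733</controlfield>'
        '<controlfield tag="005">20130930185930.0</controlfield>'
        "</record>"
    )
    print(drop_controlfields(sample))
```

Passing a different tuple, for example tags=("001", "005"), would mimic the more generic skip list suggested in the message.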
Re: Lost with LibreOffice configuration
On 31/10/2013 11:42 AM, Samuele Kaplun wrote: What you can do as a quick workaround is to add right before the: [...] sys.path.append('/opt/invenio/lib/python') from invenio.websubmit_file_converter import CFG_OPENOFFICE_TMPDIR [...] But of course! (Silly me...) I also had to add: sys.path.append('/usr/lib64/python2.6/site-packages') for the remaining modules that are needed. Hopefully this should be enough. If the LibreOffice Python and the regular Python have two different major versions, they will however keep on recompiling the .pyc files that inveniounoconv needs :-( LibO python is 2.6.1 and system python is 2.6.6. I suppose that this will not pose an extra problem here, right? FYI, /opt/invenio/bin/inveniocfg --check-openoffice now returns Ok :) Thank you Sam for helping me!
Re: bibupload: ERROR: Invalid Revision (with 'fake' record replacement procedure)
Commenting on my own mail: I suppose I could remove 005s completely, but since revisions between the two systems are identical, I was seeking an easy way to keep the 005s as well. If that is not possible/advisable/easy to do, I'll skip them :) T. T.
Lost with LibreOffice configuration
Good evening, I happened to compile invenio with LibreOffice 4.x (--with-openoffice-python=/opt/libreoffice4.0/program/python) but it seems that the 4.x versions contain python 3.x, and thus /opt/invenio/bin/inveniounoconv cannot run properly (there are important changes, such as print becoming print() and the exception syntax). (I now remember hitting that same wall in the past too, but...) The question is: What do I have to do to 'tell' invenio to use the LibreOffice 3.6 python WITHOUT COMPILING EVERYTHING from source again? (don't ask...) What I tried already: I changed CFG_PATH_OPENOFFICE_PYTHON = /opt/libreoffice3.6/program/python inside invenio-autotools.conf and reran inveniocfg --update-all Now inveniocfg --list contains the correct OFFICE_PYTHON path. FYI, I also changed the 1st line in /opt/invenio/bin/inveniounoconv to: #!/opt/libreoffice3.6/program/python But when executing /opt/invenio/bin/inveniounoconv, I STILL get the following error: Traceback (most recent call last): File /opt/invenio/bin/inveniounoconv, line 50, in <module> from invenio.websubmit_file_converter import CFG_OPENOFFICE_TMPDIR ImportError: No module named invenio.websubmit_file_converter (ipython produces no error for the same line) I even tried using CFG_PATH_OPENOFFICE_PYTHON = /usr/bin/python (because invenio/websubmit_file_converter.py exists only in the system's site-packages) but I see the same error. I'm a bit lost here :( Best regards, Theodoropoulos Theodoros
Re: Questions regarding Knowledge Bases [again]
Thanks Jerome, Your comments were indeed very helpful. I was confusing the notions of 'reciprocal' and 'search' but your examples made it all clear [1] I'm still unable to use KBs in websubmit template files (although the same function works in bibconvert) and I was wondering what might be wrong. Seeing an older Tibor presentation, I was left to believe that KBs work with the websubmit module. So, where does the module look for the relevant files? I've tried /opt/invenio/var/www/admin/bibconvert/, /opt/invenio/etc/bibconvert/config/ (the dir where the .tpl files reside), the database (BibKnowledge), but no luck :( I fear the solution is somewhere under my nose, but I cannot find it... Best regards, Theodoros [1] Having said that, I seem to recall that the first element in a KB works like a 'key'. Therefore, looking into your example, the 3rd line could(?) produce some 'strange' results, no? A---B B---A A---C (hmm...) DEF---GHI
Access restrictions in Invenio's OAI-PMH service
Hello everyone, I was recently re-configuring my Invenio OAI repository and got the following idea regarding access management in the oai2d service: since responses to requests sent to the OAI service are formatted as standard HTTP responses (see: http://www.openarchives.org/OAI/openarchivesprotocol.html#HTTPResponseFormat), a firerole(-like) access list could be applied to this service, so that when requests come from certain IP ranges the service works as expected, and otherwise a 403 error is returned. For this, a new webaccess role/action (like 'accessoaiservice') could be created, so that all relevant libraries can refer to it. If one wants to spend more time on it, a web GUI could be created in the oairepository admin page that, in addition to the firerole support, could allow a set of specific user/password pairs to access the service. To take it even further, these access restrictions could(?) be applied per OAI collection, allowing different people/IPs to access different sets of setSpecs. If you think that it might be useful to others and could be implemented in a future release, I would be happy to create a relevant 'enhancement request' ticket, so that it does not get forgotten. Kind regards, Theodoros Theodoropoulos
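To make the proposal a bit more concrete, here is a minimal sketch of the IP-range check it describes. Nothing here is existing Invenio code: the function name, the network list and the idea of returning a bare status code are all assumptions for illustration, and the address ranges are documentation-only examples.

```python
import ipaddress

# Hypothetical allow-list; a real setup would load this from configuration.
ALLOWED_NETWORKS = [
    ipaddress.ip_network(net) for net in ("192.0.2.0/24", "2001:db8::/32")
]


def oai_access_status(client_ip, allowed=ALLOWED_NETWORKS):
    """Return 200 if the client may use the OAI service, otherwise 403."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in allowed):
        return 200
    return 403


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.1", "2001:db8::1"):
        print(ip, oai_access_status(ip))
```

A per-setSpec variant could keep one such list per OAI set and pick the list based on the set requested, which is the extension mentioned at the end of the message.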
Is_Original_Submitter issue with webaccess submit role for MBI
Hi to all, I'm creating some custom websubmit functions to handle local needs regarding the ability to Modify an existing record (similar to Is_Original_Submitter) and I came across the following issue: in order for one to get the Modify existing record button in the submit screen, he/she must have access to MBI for that doctype. But then, when Is_Original_Submitter is executed, the check (auth_code, auth_message) = acc_authorize_action(user_info, 'submit', verbose=0, doctype=doctype, act=act) will always return auth_code=0 (Authorization granted). This means that ALL users who can see the Modify existing record button will be able to make the modification, because they have *explicit* power to do so. So, how can one get this power only because he/she is the original submitter? An idea would be that the user does not have MBI for this doctype, but gets special permission for a specific record from Is_Original_Submitter. But then how can the user hit the MBI button? It's not shown... Even if I put it as a static link with all the required parameters, a big warning will pop up saying You are not authorized to perform this action. Try to login with another account. and Is_Original_Submitter will get no chance to be executed and give them the special permission for this record... Am I missing something obvious? The only solution I see is to give MBI permissions as well to everyone who has SBI, and IN ADDITION run checks to see if a user is the original submitter before allowing him/her to modify the record. But then a new action (or role?) acting as a modify-superadmin will be required, to allow record modifications even if the 'original submitter' checks are not satisfied... Is this the way to go? If so, additions to the default actions (/roles?) and changes to the Is_Original_Submitter websubmit function might be needed for the demo site. Best regards, Theodoros
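The decision logic proposed at the end of the message can be written down in a few lines. This is purely a sketch of that idea: the function, the action names 'submit_MBI' and 'modify_superadmin', and the use of email addresses to identify the submitter are all hypothetical and do not correspond to existing webaccess actions.

```python
def may_modify(user_email, submitter_email, user_actions):
    """Decide whether a user may modify a record.

    Rule sketched in the message: a (hypothetical) modify-superadmin may
    always modify; anyone else needs MBI access AND must be the original
    submitter of the record.
    """
    if "modify_superadmin" in user_actions:
        return True
    has_mbi = "submit_MBI" in user_actions
    is_submitter = user_email.strip().lower() == submitter_email.strip().lower()
    return has_mbi and is_submitter


if __name__ == "__main__":
    print(may_modify("a@example.org", "a@example.org", {"submit_MBI"}))
    print(may_modify("b@example.org", "a@example.org", {"submit_MBI"}))
    print(may_modify("b@example.org", "a@example.org", {"modify_superadmin"}))
```

The point of the sketch is that the MBI permission alone only makes the button visible; the ownership check has to be a separate condition.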
Re: Copyright notice per docfile
Although I vaguely remember hearing something about it in the past, I have a simple question related to this topic[1]: Can we get 'See/See also' related results using KBs? For example, I have: 100 $0 ID:1 $a Lastname, Firstname and in another record: 100 $0 ID:1 $a SimilarLastname, SimilarFirstname (note the SAME ID for the two names!) What I want is to get results for SimilarLastname too when I search for Lastname. [btw, mapping: 1---Lastname and Lastname---SimilarLastname does not seem a proper option, because we lose the 'connection' that SimilarLastname also has ID 1, and other IDs may have SimilarLastname too. Also, even with that, I'm not sure how I can apply both these KBs in a search query] If this is impossible with the current implementation using KBs, then I believe it should be noted as another VERY useful thing that will come along with Authorities in Invenio. Cheers, Theodoros [1] You realize that we keep discussing this very important issue of Authorities in a thread with an irrelevant subject :) Let's hope that we'll manage to dig these replies up in a future search for this topic...
Re: Copyright notice per docfile
On 28/5/2013 8:02 PM, Tibor Simko wrote: Depending on the Invenio version, the BibDoc objects have a `more info' property store that basically serves as storage for any key-value combination. We use it e.g. to store technical metadata about pictures: width, height, position on page. It could be used to store per-docfile license/copyright information. Alternatively, in the past, there was also an approach to enrich BibDoc independently with licensing information and then represent it in MARC accordingly. We did not commit this part, but we may perhaps revive it with Jerome. Licensing information per docfile is a useful thing to have in Invenio. Having said that, using a separate 'field' in bibdocfile (like comment or description) to keep it, although one possible way to do it, will probably require a lot of changes in several files. Did you opt for some solution in the meantime? It would be nice to settle on a technique and offer it out-of-the-box with Invenio demo submissions. I think the former technique would be preferable. I had to come up with a solution really quickly with the invenio code that was available two months ago. So, I decided to add the license in bibdocfile's comment field in the form: License:by-sa. For that, I'm using a heavily modified version of the Upload_Files websubmit function. I have also written/edited the relevant format templates/output formats to 'decode' this data, display the license images (currently only used for CC licenses and only if the comment field begins with license:) and link to the detailed license url. I'm attaching 3 small screenshots in case you want a sneak preview of the output. The filenames are _really_ dummy (I'm now ashamed of them) and the general idea was to include the most peculiar example, with a lot of attachments and file versions/file formats, etc. The only known drawback of my approach is that the 'Files' tab does NOT show any licensing information. 
I know this is important, BUT in order to implement it, I would have to make a lot of changes in several core Invenio functions (in definitions, return values etc) and I wanted to avoid that as much as possible, hoping that a more proper/complete solution would come at some point from you :) Best regards, Theodoros attachment: Invenio-CCLicencesInDetRec.png attachment: Invenio-CCLicencesInMiniPanel.png attachment: Invenio-CCLicencesInBriefRec.png
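The 'decode' step for a comment such as License:by-sa could look roughly like the sketch below. It is not the actual format-element code used on the site: the function name is invented, the set of accepted codes is limited to the six standard CC licenses, and the license version in the URL is a parameter because the message does not say which version was used.

```python
CC_CODES = {"by", "by-sa", "by-nd", "by-nc", "by-nc-sa", "by-nc-nd"}


def license_url_from_comment(comment, version="3.0"):
    """Return the CC license URL encoded in a docfile comment, or None.

    Only comments starting with 'license:' (case-insensitive) are decoded;
    the default version is an assumption made for this example.
    """
    prefix = "license:"
    text = (comment or "").strip()
    if not text.lower().startswith(prefix):
        return None
    code = text[len(prefix):].strip().lower()
    if code not in CC_CODES:
        return None
    return "https://creativecommons.org/licenses/%s/%s/" % (code, version)


if __name__ == "__main__":
    print(license_url_from_comment("License:by-sa"))
    print(license_url_from_comment("Scanned by the library"))
```

Anything that is not a recognised code falls through to None, so ordinary free-text comments are left to be displayed as they are.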
Re: Question about document migration to new server
On 29/5/2013 10:34 AM, Tibor Simko wrote: That's generally a very good way of thinking. In this concrete use case, the trouble is that the Invenio upload API does not allow submitting a file with a given wanted docID. (Unlike submitting a record with a given wanted recID, which is possible.) So the proper answer to Theodoros's need would be to extend Invenio's FFT API support to allow forcing of docIDs, for example. However, this would take time. Hence the need for a low-level DB dump/load solution that we have been discussing in this thread. (BTW another `proper' solution could be simply not to reuse the old docIDs, but load the files anew via FFT, keeping only recIDs the same. The docIDs are never exposed to end users, so this might be acceptable. However, some massaging would have to be done a posteriori anyway, in order to port old logs and stats information. Hence the first solution, more advantageous in my eyes.) True. I also mentioned this more 'proper' alternative for moving bibdocs, and I would be happy with it if and only if the bibdoc STATS were exported along with each docfile. Since this is currently not available, I would have to separately move rnkDOWNLOADS and change the bibdoc ids from old to new. I decided that this is not a good idea (mainly because I would have to alter millions of lines, and also because I would be messing with statistics and it would be difficult for me to find out if something went wrong in the process). So I too believe that keeping the same bibdoc ids is the way to go in this particular case, and it will also make it easier for me to check things between the old and new systems. Having said all that, it would be a nice addition for the 'future' if an admin were able to export the bibdocs along with their stats. Also, why not even be able to import them back into the system... 
I see possible issues with adding/replacing bibdocs (and what will happen to their statistics), but I'm just mentioning an issue that I personally needed once. In case you see a more generic Invenio community need for that, I'm sure you will find a way (or make the necessary changes in DB) to overcome that. Best regards, Theodoros
Re: Unexpected keyword argument errors in invenio.err for doctypeconfiguresubmissionfunctions and doctypeconfiguresubmissionpages
On 28/5/2013 7:50 μμ, Tibor Simko wrote: On Mon, 15 Apr 2013, Theodoros Theodoropoulos wrote: I'm pretty sure it will be reproducible in a stock Invenio site too. Can you verify it so that I submit a ticket? Verified. Please submit a new `known issue' for WebSubmit. Thanks for reporting! Done. Ticketized under http://invenio-software.org/ticket/1521 (btw I cannot select 'known issue' for a ticket... One must probably have higher credentials for that :) ) Best regards, Theodoros
Re: Increase in maximum allowed size in Document Type ID, Element Name
On 29/5/2013 12:53 μμ, Tibor Simko wrote: That said, we are however fully concentrating on developing the new WebDeposit module which brings many new goodies and which will eventually replace WebSubmit. So, if at all possible, it would be simpler if you worked with the current WebSubmit field length limitations until WebDeposit comes? Well, I'm several months past this point of having to choose the Element names :) I ended up using ugly abbreviations, but I'm happy I didn't create any extra potential problems for a future migration from WebSubmit to WebDeposit. Having said that, I will be happily surprised if/when such an 'automatic' migration procedure to WebDeposit will be made available (bearing in mind all the custom elements, checks, javascript-insertion dummy elements, etc WebSubmit users might have introduced). Cheers, Theodoros
Re: Copyright notice per docfile
Hey, I could give you my kidney to have you revive BibAuthority!! I've written many times to some developers and to the list regarding BibAuthority, but I never got a straight answer regarding its status. Nevertheless, I based all my author tags on the very promising 'recipe' found in https://twiki.cern.ch/twiki/bin/view/CDS/BibAuthority hoping that some day Invenio would get some proper authority support. I don't know how useful it would be for licensing, but I can guarantee that it will make the life of MANY Invenio users way better. I remember such a strong request at the last IUGM. Please tell me that you're seriously thinking of reviving BibAuthority! (even if it's a lie...) Best regards, Theodoros
Question about document migration to new server
Hi to all, I want to move a collection (with its documents) to a new server, while retaining the record ids and all the STATS. Things are fairly easy with bibrecs. However, there are issues with the handling of bibdocs. With the FFT tag way of copying bibdocs, I cannot find a way to force a bibdoc to get a specific bibdoc_id (id in the bibdoc table), so the only alternative I see is to manually copy the selected files (with their full path structure in the filesystem) along with the related bibdoc, rnkDOWNLOADS and rnkPAGEVIEWS entries. Do you see any drawbacks with this method? Should I pay special attention to something? Do you have a better idea (e.g. 'translation' of the bibdoc ids in bibdoc, rnkDOWNLOADS and rnkPAGEVIEWS to match the new ones created by FFT)? Your ideas are most welcome! Best regards, Theodoros
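The id 'translation' idea mentioned at the end of this message could be sketched roughly as below. This is only an illustration under stated assumptions: the tuple layout (bibdoc_id, recid, docname), the id_bibdoc column name and both helper functions are hypothetical and are not part of the Invenio API; the real tables would have to be exported and re-imported around it.

```python
def build_id_map(old_docs, new_docs):
    """Map old bibdoc ids to new ones by matching on (recid, docname).

    Both arguments are iterables of (bibdoc_id, recid, docname) tuples.
    This layout is a hypothetical export format, not an Invenio API.
    """
    new_by_key = {(recid, name): doc_id for doc_id, recid, name in new_docs}
    id_map = {}
    for old_id, recid, name in old_docs:
        key = (recid, name)
        if key not in new_by_key:
            raise KeyError("no new bibdoc for record %s, file %r" % key)
        id_map[old_id] = new_by_key[key]
    return id_map


def translate_rows(rows, id_map, column="id_bibdoc"):
    """Return copies of statistics rows with the bibdoc id column rewritten.

    The column name is an assumption; the input rows are left untouched.
    """
    return [dict(row, **{column: id_map[row[column]]}) for row in rows]
```

Matching on (recid, docname) only works if record ids are preserved and document names are unique within a record, which is the situation described above.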
Re: search-guide.webdoc with Greek translation added
On Tuesday, 28 May 2013 4:10:32 PM, Tibor Simko wrote: I'd like to acknowledge the co-authorship of Vaggelis in the commit log. Is this OK, and if so, is it possible to know and publicly publish his email address? Of course! It's Evangelos Savvidis (vsavi...@gmail.com) and you may publish both the name and email. Thank you!
Re: Question about document migration to new server
On 28/5/2013 3:23 PM, Tibor Simko wrote: Yes, I think this is the fastest way to go. Don't forget all bibdoc related tables such as `bibdocmoreinfo'. Whoops! I didn't see that coming! I'm moving data away from an old Invenio (0.99.x series) and there are no such tables as bibdocmoreinfo/bibdocsinfo. Any chance they can be regenerated from the filesystem (with bibdocfile --fix-bibdocfsinfo-cache or --fix-marc or fix-all)? Theodoros
Questions regarding Knowledge Bases
Hello everyone, Just some quick questions this time regarding Knowledge Bases :) First of all, most KBs have now been moved into the DB (and are served via BibKnowledge). That's great, but are these KBs available to the template (.tpl) files used to convert/create record data? Or will I have to use flat files for that purpose? Which tools have access ONLY to flat-file KBs and which ONLY to DB KBs? I'm also having some trouble understanding the difference between search / match / search (in KB) / search (reciprocal) in the following part of the bibconvert admin guide:
{*input_value---output_value*}
KB(file,1) searches the exact value passed.
KB(file,0) searches the KB code inside the value passed.
KB(file,2) as 0 but not case sensitive
KB(file,R) replacements are applied on substrings/characters only.
bibconvert look-up value in KB_file in one of following modes:
===
1 - case sensitive / match (default)
2 - not case sensitive / search
3 - case sensitive / search
4 - not case sensitive / match
5 - case sensitive / search (in KB)
6 - not case sensitive / search (in KB)
7 - case sensitive / search (reciprocal)
8 - not case sensitive / search (reciprocal)
9 - replace by _DEFAULT_ only
R - not case sensitive / search (reciprocal) replace
If I want to convert between A and B (and vice versa) and I have a translate.kb containing A---B, will KB(translate.kb,8) be enough for translation in both directions? Thanks in advance for your time, Theodoros
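To make the both-ways question concrete, here is a small stand-alone sketch of a flat KB in the input---output format with a lookup that can optionally work in both directions. It is not bibconvert's implementation, and it does not claim to reproduce what modes 7 or 8 actually do; parse_kb, lookup and the both_ways flag are hypothetical names used only to show the behaviour being asked about.

```python
def parse_kb(text):
    """Parse lines of the form 'input---output' into a list of pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "---" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("---", 1)
        pairs.append((key, value))
    return pairs


def lookup(pairs, value, case_sensitive=True, both_ways=False):
    """Return the translation of value, or value itself if nothing matches.

    With both_ways=True the right-hand side of each pair is also tried,
    which is one possible reading of a two-directional translation.
    """
    norm = (lambda s: s) if case_sensitive else (lambda s: s.lower())
    wanted = norm(value)
    for key, out in pairs:
        if norm(key) == wanted:
            return out
        if both_ways and norm(out) == wanted:
            return key
    return value
```

With a KB containing only A---B, lookup(pairs, "A") gives "B", while "B" is translated back to "A" only when both_ways=True.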
Re: Integration of scholar into Invenio 1.0Rc0
Hello everyone, I think I just managed to backport it to 0.99.x as well... At least the new META tags have appeared :) The situation is a bit more complicated than in the 1.0.x series, since additions to other files are required (for example htmlutils.py), as well as changes to bfe_meta.py in order to make it work with the old series... If anyone needs this feature in an old 0.99.x installation, I could provide some hints (nothing advanced to be honest), but I fear that with this kind of 'hacking' (especially in htmlutils.py) internal parts might have been affected, so I strongly discourage it for production systems [1]. Best regards, Theodoros [1] Having said that, I personally plan to keep it for a day or two (always keeping my fingers crossed) and at the first glimpse of suspicious behavior, I will revert to the original files. On 15/5/2013 3:59 PM, Jerome Caffaro wrote: Hi Greg, Gregory Favre wrote: I know you have updated a lot of things helping the interaction with this service. Basically what was done? One thing that was done was the integration of some Google Scholar markup in the page header. The following commit shows which files have been touched: http://invenio-software.org/repo/invenio/commit/?id=96ac7c5eefb647de22dd41c4baba0d7e37597abb In two words, a new output format HDM has been added, which is called by the search engine when displaying a record. Note that these files have been later enhanced by further commits. Does it seem feasible to backport these features from invenio 1.1 to 1.0rc0? I don't foresee any major trouble if you try to backport the necessary files to your installation. It is mostly new files, with very few changes to an existing one. Best regards -- Jerome Caffaro
Re: Record Editor showing Checking status... and nothing more
Hello Petr, I too can verify the issue (tried with the latest maint-1.1 branch). The problem lies not in the Invenio sources, but in the latest jquery.hotkeys.js! If one runs make install-jquery-plugins, one gets the latest version of jquery.hotkeys.js (size 3598 bytes), which says: /* * One small change is: now keys are passed by object { keys: '...' } * Might be useful, when you want to pass some other data to your handler */ Probably this is related to the issue that you mention. Luckily, I had a previous version of jquery.hotkeys.js (size 3080 bytes); I copied it into the /opt/invenio/var/www/js/ directory and everything went back to normal :) As a quick-n-dirty solution, you can try the patch included at the end of this email in order to revert to the previous version of jquery.hotkeys.js, but the Invenio devs will probably come up with a better solution... Best regards, Theodoros Theodoropoulos ps. This is the patch to revert to an old version of jquery.hotkeys.js that works (until a more permanent fix is produced)...
--- /opt/invenio/var/www/js/jquery.hotkeys.js 2013-05-13 10:11:23.825329769 +0300
+++ /opt/invenio/var/www/js/jquery.hotkeys.js.orig 2013-05-12 15:58:33.742961433 +0300
@@ -10,6 +10,11 @@
  * Binny V A, http://www.openjs.com/scripts/events/keyboard_shortcuts/
 */
 
+/*
+ * One small change is: now keys are passed by object { keys: '...' }
+ * Might be useful, when you want to pass some other data to your handler
+ */
+
 (function(jQuery){
 
 	jQuery.hotkeys = {
@@ -22,7 +27,8 @@
 			96: "0", 97: "1", 98: "2", 99: "3", 100: "4", 101: "5", 102: "6", 103: "7",
 			104: "8", 105: "9", 106: "*", 107: "+", 109: "-", 110: ".", 111 : "/",
 			112: "f1", 113: "f2", 114: "f3", 115: "f4", 116: "f5", 117: "f6", 118: "f7", 119: "f8",
-			120: "f9", 121: "f10", 122: "f11", 123: "f12", 144: "numlock", 145: "scroll", 191: "/", 224: "meta"
+			120: "f9", 121: "f10", 122: "f11", 123: "f12", 144: "numlock", 145: "scroll", 186: ";", 191: "/",
+			220: "\\", 222: "'", 224: "meta"
 		},
 
 		shiftNums: {
@@ -33,18 +39,23 @@
 	};
 
 	function keyHandler( handleObj ) {
+		if ( typeof handleObj.data === "string" ) {
+			handleObj.data = { keys: handleObj.data };
+		}
+
 		// Only care when a possible input has been specified
-		if ( typeof handleObj.data !== "string" ) {
+		if ( !handleObj.data.keys || typeof handleObj.data.keys !== "string" ) {
 			return;
 		}
-
+
 		var origHandler = handleObj.handler,
-			keys = handleObj.data.toLowerCase().split(" ");
+			keys = handleObj.data.keys.toLowerCase().split(" "),
+			textAcceptingInputTypes = ["text", "password", "number", "email", "url", "range", "date", "month", "week", "time", "datetime", "datetime-local", "search", "color"];
 
 		handleObj.handler = function( event ) {
 			// Don't fire in text-accepting inputs that we didn't directly bind to
 			if ( this !== event.target && (/textarea|select/i.test( event.target.nodeName ) ||
-				 event.target.type === "text") ) {
+				jQuery.inArray(event.target.type, textAcceptingInputTypes) > -1 ) ) {
 				return;
 			}
Re: Access rights for files
So we encourage all our users to upload a full text and only give access at the library if we are allowed to. So this might be very well two distinct tasks. Your scenario is not uncommon i think... And it seems not that difficult to implement! Even with the existing Create_Upload_Files_Interface, you could define a custom 'static' restriction -as Sam said- that allows librarian group and submitter and denies everyone else. IF user chooses to 'restrict' the document this will be applied. The added bonus is that with invenio 1.1 you could CHANGE this restriction afterwards with Document File Management. OR, even with earlier versions you could probably implement the same functionality with Submit Revised Version ('SRV') action.
Re: Access rights for files
On 29/4/2013 9:37 AM, Alexander Wagner wrote: ## Eg: ## [('', 'No restriction'), ('restr', 'Restricted')] CFG_BIBDOCFILE_DOCUMENT_FILE_MANAGER_RESTRICTIONS = [ ('', 'Public'), ('restricted', 'Restricted')] If this could learn something like 6-month embargo and 12-month embargo e.g. by reading a field like Lars' OpenAIRE setting we would already have a 99% solution and the rest could be done easily by hand. It would fall back to some time calculations, but well. Probably one could merge it with the 942-approach of OpenAIRE? This type of embargo is what we also needed in the past. Back in the 0.99.x days I had it partly solved with a custom Upload_Files.py websubmit function, but nowadays you could do even more with a custom WebSubmit Element of type Response. You can check the Upload_Files websubmit element in the demo site, where you may call the create_file_upload_interface python function with all the parameters you need. Since you could calculate the exact date for a 12-months-from-now embargo restriction, you could create a semi-dynamic embargo firerole restriction in the response element! The downside is that you could not easily edit the end-of-embargo date at a later stage using the GUI. But it's a start... Sam could correct me if I'm wrong... Best regards, Theodoros
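The date calculation behind such a semi-dynamic embargo could look roughly like this, using only the standard library. The add_months and embargo_rule helpers are hypothetical names, and the text of the generated rule is an assumption about FireRole syntax that should be checked against the WebAccess documentation before being used as a restriction.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` months after `start`.

    The day is clamped to the last day of the target month, so that
    31 August plus 6 months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def embargo_rule(start, months):
    """Build a rule string for an embargo ending `months` after `start`.

    ASSUMPTION: the 'deny until' / 'allow any' wording is illustrative
    only; verify the exact FireRole syntax before relying on it.
    """
    end = add_months(start, months)
    return "deny until '%s'\nallow any" % end.isoformat()
```

Calling embargo_rule(date.today(), 12) at submission time would freeze the end date into the restriction, which is why it could not easily be edited later, as noted above.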
Re: Access rights for files
Hello Lars and Alexander! I too would be interested in such functionality! Personally, I don't think there is an easy way to do it. As I see it, a new Action (similar to SRV) should be created (along with the relevant websubmit functions and elements) that would read and display the current files along with their restrictions. Important note: an update to the relevant bibdocfile.py functions would be needed, as 'get-restrictions' is not supported! If you feed the current restrictions to an editable textarea in a websubmit form, it's easy to ask the bibdocfile library to 're-set' the permissions for each bibdocfile to the new values when the librarian clicks submit. The current Create_Upload_Files_Interface.py function does not include such functionality, but it could be done... (Having said that, one would probably have to mess with websubmit_templates.py too.) I'm using a custom websubmit_function (based on the 'older' Upload_Files) to perform a similar task, but only during the initial phase of the submission, where only 'set' is needed. If someone needs to 'edit' the relevant data, he/she has to contact me so that I can manually run bibdocfile (on the command line) with the new values. Just my two cents, Theodoros
Re: Access rights for files
Hello Ludmila, On 26/4/2013 5:51 PM, Ludmila Marian wrote: The restrictions are stored as the status of the file, so get_status() should do the trick to get the restriction of the file. True, true! I see that you're already using get_status to merge restrictions when merging bibdocs. My suggestion for a 'get_restriction' function was based on the already implemented get_comment, get_description, get_version, etc. functions. The functionality, as you pointed out, is there; the new name would just be more on a par with the rest of the function names :) In any case, thanks for the hint! Theodoros
Re: Exceptions due to attacks
On 25/4/2013 12:29 PM, Theodoros Theodoropoulos wrote: I tried the same with cds.lib.auth.gr and it also displays a 404 error (I don't know if an error is logged) Correction: I meant cds.cern.ch :)
Re: Exceptions due to attacks
On 25/4/2013 12:37 PM, Ferran Jorba wrote: but try an index.php or any other missing hit at http://cds.cern.ch. It is effectively handled by Invenio. My point exactly. I see that both my installations and CERN's correctly handle those 'attacks'. I even tried with .php and .py files and no exception is raised and sent to the admin, even if you set CFG_SITE_ADMIN_EMAIL_EXCEPTIONS = 2 in invenio(-local).conf. Unless I'm missing something here, I suspect something weird is happening only with your installation... Theodoros
Unexpected keyword argument errors in invenio.err for doctypeconfiguresubmissionfunctions and doctypeconfiguresubmissionpages
Hello all, If you go to WebSubmit Admin-[any demo doctype]-[any action]-view functions (or view interface) you get the following error shown in invenio.err (no exception is shown to the frontend): = Wrong GET parameter set in calling a legacy publisher handler for doctypeconfiguresubmissionfunctions: expected_args=['ln', 'addfunctionscore', 'addfunctionstep', 'addfunctionname', 'configuresubmissionaddfunctioncommit', 'configuresubmissionaddfunction', 'deletefunctionscore', 'deletefunctionstep', 'deletefunctionname', 'movetofunctionscore', 'movetofunctionstep', 'movetofunctionname', 'movefromfunctionscore', 'movefromfunctionstep', 'movefromfunctionname', 'movedownfunctionscore', 'movedownfunctionstep', 'movedownfunctionname', 'moveupfunctionscore', 'moveupfunctionstep', 'moveupfunctionname', 'action', 'doctype', 'req'], found_args=['action', 'doctype', 'viewSubmissionFunctions'] * 2013-04-15 20:57:40 - TypeError: doctypeconfiguresubmissionfunctions() got an unexpected keyword argument 'viewSubmissionFunctions' (webinterface_handler_wsgi.py:677:mp_legacy_publisher) ** User details agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0 email: th...@xxx.yyy.zzz group: [] guest: 0 language: en login_method: Local nickname: admin precached_canseehiddenmarctags: True precached_permitted_restricted_collections: ['Theses', 'Drafts', 'ALEPH Theses', 'ALEPH Internal Notes', 'ISOLDE Internal Notes', 'Atlantis Times Drafts'] precached_sendcomments: True precached_useadmin: True precached_usealerts: True precached_useapprove: True precached_usebaskets: True precached_usegroups: True precached_useloans: True precached_usemessages: True precached_usepaperattribution: True precached_usepaperclaim: True precached_usestats: True precached_viewclaimlink: True precached_viewsubmissions: True referer: remote_host: remote_ip: www.xxx.yyy.zzz session: cc9766caae02814486f3c9048aebd64a uid: 1 uri: /person/S.D.Ellis.1?open_claim=True ** Traceback details Traceback 
(most recent call last): File /usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py, line 677, in mp_legacy_publisher return _check_result(req, module_globals[possible_handler](req, **form)) TypeError: doctypeconfiguresubmissionfunctions() got an unexpected keyword argument 'viewSubmissionFunctions' ** Stack frame details Frame mp_legacy_publisher in /usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py at line 685 --- 682 expected_defaults = list(inspected_args[3]) 683 expected_args.reverse() 684 expected_defaults.reverse() 685 register_exception(req=req, prefix=Wrong GET parameter set in calling a legacy publisher handler for %s: expected_args=%s, found_args=%s % (possible_handler, repr(expected_args), repr(req.form.keys())), alert_admin=CFG_DEVEL_SITE) 686 cleaned_form = {} 687 for index, arg in enumerate(expected_args): 688 if arg == 'req': --- inspected_args = ArgSpec(args=['req', 'doctype', 'action', 'moveupfunctionname', 'moveupfunctionstep', 'moveupfunctionscore', 'movedownfunctionname', 'movedownfunctionstep', 'movedownfunctionscore', 'movefromfunctionname', 'movefromfunctionstep', 'movefromfunctionscore', 'movetofunctionname', 'movetofunctionstep', 'movetofunctionscore', 'deletefunctionname', 'deletefunctionstep', 'deletefunctionscore', 'configuresubmissionaddfunction', 'configuresubmissionaddfunctioncommit', 'addfunctionname', 'addfunctionstep' [...] expected_defaults = ['en', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] err = 'TypeError(doctypeconfiguresubmissionfunctions() got an unexpected keyword argument \'viewSubmissionFunctions\',)' form = {'action': 'SBI', 'doctype': 'DEMOART', 'viewSubmissionFunctions': 'view functions'} the_module = '\'# -*- coding: utf-8
Copyright notice per docfile
Hello everyone, I was wondering what would be the best approach for inserting a small copyright notice or a copyright id (e.g. 'by-nc-sa') for each docfile during file submission. I'm not concerned, for the moment, with how to implement it in a form or how to display it in bfe_fulltext, but with where this should be kept in the docfile-related structure. Currently, bibdocfile supports 'description' and 'comment'. Description could be an option, but it is better kept for other things. Comment looks more promising, but I cannot see where it's currently used/displayed. Do you think it's an appropriate attribute to set and use for copyright, or are you planning to add a new attribute for copyright notices in the future? Thanks for your time, Best regards, Theodoros ps. I've discovered a weird behavior in bibdocfile where --set-description and --set-comment on a record that has many docfiles apply only to the last docfile. Is this an implementation decision or is it a bug? (I thought I should ask before submitting a bug ticket :) )
Re: Copyright notice per docfile [update]
On 10/4/2013 4:35 PM, Theodoros Theodoropoulos wrote: ps. I've discovered a weird behavior in bibdocfile where --set-description and --set-comment on a record that has many docfiles applies only to the last docfile. Is this an implementation decision or is it a bug? (I thought I should ask before submitting a bug ticket :) ) It seems that this has been fixed in the current 1.1-maint branch! I was testing this with a much older revision... :blush:
Increase in maximum allowed size in Document Type ID, Element Name
Hello everyone, Currently, maximum Document Type ID size is 10 chars and maximum Element Name size is 15 chars. I vaguely remember that these limits were even lower in older versions of Invenio (cdsware). Unless there is a design reason (ie big and 'ugly' parameters in URLs or unnecessary waste in DB tables), I believe that these limits are somewhat ...limiting and they should probably be increased. If current Invenio admins/developers think that this could be useful to others and easy to implement, I could submit a relevant enhancement request ticket. If, on the other hand, this is not needed to the vast majority and could be 'fixed' by simply altering the max allowed size in a DB column, I could proceed with the change in my local installation (provided that this won't break how things work). Best regards, Theodoros
Internal server error in invenio-demo-next [only with Greek translation]
Dear devs, While playing around with invenio-demo-next.ch, I found that an Internal Server Error is produced if you select the Greek translation and try to browse/search any collection. This error does not appear for other translations... I know next is under heavy development, but I thought I should mention it just in case it rings any bells. Best regards, Theodoros
Show records locked by users/queues (along with user/queue details) in bibeditcli
Hello everyone, Sometimes I have needed the list of locked bibs (along with user details), and although bibedit_utils.py already has all the necessary functions, there is no such command line switch in bibedit/bibeditcli. So I thought that some useful new options that could be implemented would be:
- show user_details for a specific record_id [probably using record_locked_by_user_details(recid, uid)]
- show all locked record ids (by queues/users), maybe even filtered by dates (so as to see which have been locked for too long, although if CFG_BIBEDIT_TIMEOUT is set properly, this might not be needed).
- 'release' certain record_id(s) (also deleting the related cache)
Although probably easy to implement, all these are of minor importance and might not even make it onto your TODO list, so I'm not submitting an enhancement ticket. Having said that, if you see any real use for Invenio users, I would be happy to see them added in a future release :) Best regards, Theodoros
Exception with bibauthorid --update-personid
Hello everyone, I've downloaded and compiled the latest maint-1.1 from git because I wanted to test the latest improvements in the bibauthorID module that seem very interesting. After running all the required steps, creating the demo site and loading the demo-records, I run: sudo -u apache /opt/invenio/bin/bibauthorid --update-personid and got the following exception: 2013-02-13 13:17:13 -- Entering task_sleep_now_if_required with status=UNKNOWN [...] repeated 100 times [...] 2013-02-13 13:17:13 -- Entering task_sleep_now_if_required with status=UNKNOWN File /opt/invenio/bin/bibauthorid, line 35, in module main() File /opt/invenio/lib/python/invenio/bibauthorid_cli.py, line 35, in main daemon.bibauthorid_daemon() File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 104, in bibauthorid_daemon task_run_fnc=_task_run_core) File /opt/invenio/lib/python/invenio/bibtask.py, line 480, in task_init ret = _task_run(task_run_fnc) File /opt/invenio/lib/python/invenio/bibtask.py, line 962, in _task_run if callable(task_run_fnc) and task_run_fnc(): File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 147, in _task_run_core run_rabbit(record_ids, all_records) File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 286, in run_rabbit rabbit_with_log(None, True, 'bibauthorid_daemon, update_personid on all papers') File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 264, in rabbit_with_log insert_user_log('daemon', '-1', action, 'bibsched', 'status', comment=log_comment, timestamp=starting_time) File /usr/lib64/python2.6/site-packages/invenio/bibauthorid_dbinterface.py, line 863, in insert_user_log action, tag, value, comment)) File /usr/lib64/python2.6/site-packages/invenio/dbquery.py, line 211, in run_sql rc = cur.execute(sql, param) File /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py, line 203, in execute if not self._defer_warnings: self._warning_check() File /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py, line 117, in 
_warning_check warn(w[-1], self.Warning, 3) File /usr/lib64/python2.6/site-packages/invenio/errorlib.py, line 569, in fun traceback.print_stack() /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:203: Warning: Incorrect integer value: '' for column 'userid' at row 1 if not self._defer_warnings: self._warning_check() 2013-02-13 13:17:14 -- Task #15 finished. [DONE] sudo -u apache /opt/invenio/bin/bibauthorid --disambiguate also gives the following exception: 2013-02-13 13:44:15 -- Task #16 submitted. -bash-4.1# sudo -u apache /opt/invenio/bin/bibauthorid 16 2013-02-13 13:44:19 -- Task #16 started. 2013-02-13 13:44:22 -- Traceback (most recent call last): File /opt/invenio/lib/python/invenio/bibtask.py, line 962, in _task_run if callable(task_run_fnc) and task_run_fnc(): File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 152, in _task_run_core run_tortoise(bool(bibtask.task_get_option(from_scratch))) File /opt/invenio/lib/python/invenio/bibauthorid_daemon.py, line 305, in run_tortoise tortoise(modified) File /usr/lib64/python2.6/site-packages/invenio/bibauthorid_tortoise.py, line 124, in tortoise assert all(stat == os.EX_OK for stat in exit_statuses) AssertionError 2013-02-13 13:44:22 -- Task #16 finished. [CERROR] Can you verify/reproduce these exceptions? (There is always the possibility that I might have broken something while testing things). BTW, the aid* tables are filled with the appropriate author data and |aidUSERINPUTLOG| contains only the following row (1, 0, '2013-02-13 13:17:13', 0, 'daemon', -1, 'PID_UPDATE', 'bibsched', 'status', 'bibauthorid_daemon, update_personid on all papers') Thanks for your time, Theodoropoulos Theodoros ps. And one bonus question regarding BibAuthorID module :) Is there any way for a user to search for authors using any of their external IDs?
Bibcheck cli?
Hello everyone, I need to import a few thousand records (that were converted to marcxml from RefWorks format using bibconvert) into my Invenio instance. Because of the complexity of the original records, and mainly because not all users followed the instructions when filling in the RefWorks forms, the final records may lack some fields that are important to us (like 100$a or 260$c). So, I was wondering if there is a way to check the produced marcxml file before importing it into the system, and bibcheck came to my mind... It seems however, that although the bibcheck web admin is there, there is no bibcheck cli... Is this module still in use? I also vaguely remember that there were plans to add checking functions in bibedit (web/cli)... With these in mind, do you have any suggestion as to what is currently the best way to check records that exist either in a marcxml file or in the holding pen? Thanks in advance for your time, Theodoros
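A check of this kind can be done outside Invenio with the Python standard library alone. The following is a minimal sketch, not a replacement for bibcheck: the function names and the REQUIRED list are made up for illustration, and it assumes a MARCXML file made of record, datafield and subfield elements (with or without the MARC21 slim namespace).

```python
import xml.etree.ElementTree as ET

# Subfields that every record is expected to carry (illustrative choice).
REQUIRED = [("100", "a"), ("260", "c")]


def _local(tag):
    """Strip an XML namespace such as '{http://...}record' down to 'record'."""
    return tag.rsplit("}", 1)[-1]


def missing_subfields(record, required=REQUIRED):
    """Return the (tag, code) pairs from `required` absent from one record."""
    present = set()
    for field in record:
        if _local(field.tag) != "datafield":
            continue
        for sub in field:
            if _local(sub.tag) == "subfield" and (sub.text or "").strip():
                present.add((field.get("tag"), sub.get("code")))
    return [pair for pair in required if pair not in present]


def check_marcxml(xml_text):
    """Map the 1-based position of each incomplete record to what it lacks."""
    root = ET.fromstring(xml_text)
    records = [el for el in root.iter() if _local(el.tag) == "record"]
    report = {}
    for position, record in enumerate(records, 1):
        missing = missing_subfields(record)
        if missing:
            report[position] = missing
    return report
```

Records are reported by position in the file, since records that have not been uploaded yet have no record id; an empty subfield is treated as missing.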
Re: Bibcheck cli?
Thank you Sam for the prompt reply! No problem, I'll go with python. I just wanted to double check, in order to see if I was missing something that was already there and did the trick.
(pseudo)authorities: what is supported in Invenio/Inspire?
Good morning to all, In order to be able to plan ahead for my installation, I would like to know if there is any way to deal with authorities in Invenio. The BibAuthority module (described in detail here: https://twiki.cern.ch/twiki/bin/view/CDS/BibAuthority) is EXACTLY what I would like to have, but I cannot find it in my installation or in the invenio/inspire repo. Was the module abandoned/postponed/replaced by something even better? On the other hand, inspirehep.net seems to have such functionality: HepNames, Institutions and Conferences seem to be 'authority' records that are considered when performing a search. Could you please shed some light on what's available now and what's planned for the near future? Best regards, Theodoropoulos Theodoros ps. I apologize for having sent a similar mail in the past too. It's quite urgent that I know at least some general information and the best practices for the current/next Invenio version in order to make the right decisions for my next milestone. I'll do the reading/implementation/testing myself.
Re: Add all search results to a basket?
On 30/11/2012 10:58, Lars Holm Nielsen wrote: Hi, Would it be all records for an entire search or just all records displayed on one result page? In case of the latter, there's a toggle all button in the next-branch (see example on: http://invenio-demo-next.cern.ch/search?p=action_search=). Cheers, Lars Great demo site! It looks beautiful and is definitely user-friendlier (and very google-like) ;) Is that faceted search on the left? Hooray! Can't wait! Is this based on a new webstyle template that can be chosen in invenio(-local).conf or will this be the default look-and-feel of the next version? You never stop surprising us (in a good way)! Keep up the great work, Theodoros
Re: (pseudo)authorities: what is supported in Invenio/Inspire?
Hello Alexander, It's been a while since we talked at the last Invenio Users Meeting, but i'm following the list and I've seen the great work you've done at Juelich. We currently use some sort of authority control in JuSER where we mainly implemented the functionality within websubmit type of things. This however does not (yet?) use bibauthority. Unfortunately, the searchengine does not yet know about authorities. If we can be of some help here, drop me a note. I'm too using a way to correctly SELECT a user (with all proper IDs, affiliations, etc) from a list in websubmit part. However, the values are copied to 100/700__0,a,e,u bib record fields. This ensures consistency across records, but it's really a plan-b solution -but the only way to do things because authorities are not(?) supported. Plus, I think that there is no easy way to use this data in BibEdit: one could select from a list of authors in 100/700__a, but i'm not sure that this selection could update other subfields of 100/700 accordingly. Ideally, with authorities, in the bib record, only the 'reference' to the authority would be used, so that if anything changes in the author, all the 'connected' records will know the change, plus you get see/see also fields, alternate/translated names support etc. That would also involve indexing and searching, and this is described nicely, here http://twiki.cern.ch/twiki/bin/view/CDS/BibAuthority You can check out what we did at http://juser.fz-juelich.de/ in the Authorities-collections. You might also want to check out the Marc we used. E.g. for an institute http://juser.fz-juelich.de/record/98380 You may use the arrow links to go to predecessors, successors or top. You seem to have it nicely ordered. I'm currently interested mostly on authors, but Institutions will come next. I'll check your marc to get an idea of what you've used. Thanks again for sharing your work! We would definitely also want to know how they can be used within search. 
I mean not simple stuff like searching for an ID, we have this already, but searching for an ID x and finding all records for ID x and y, as the authority record of x lists y as an additional identifier. We work around our current limitations e.g. using the index cid (contributing institutes, also gathering all former forms of the institute) but this is not really done in a nice way. Basically a record here just has enough collection entries, but it's no real authority lookup. Exactly! When we started to implement this stuff I checked closely with Annette Holtkamp, so I think we should indeed use the same or at least a very similar authority record layout as Inspire. Hmm.. I haven't yet looked much into Inspire's (pseudo?)authority marc, but will do soon. I also believe that we should always try to use common fields when possible, always keeping in mind the marc standards. Having said that, I am mostly interested in more fundamental issues: - (Is/was/will there be) support for authorities soon, indexing/searching/bibedit support included, and what can I do now to be ready? - Is the final module going to be something similar to what's described in that wiki page? - Is it planned for the 1.2, 1.x or 2.x series? My best regards to all, Theodoros
Case insensitivity in some webaccess related tables vs relevant python functions
Hello everyone, Yesterday I hit the following situation: My aim was to restrict access to the collection 'test' to certain users. Reading the docs, I created a test_viewers role and connected certain users to the viewrestrcoll authorization for the 'test' collection (all using the web interface). But, instead of entering 'test' in the collection argument of viewrestrcoll, I mistakenly entered 'Test'. The issue I found was that, even though the 'collection' parameter is stored in the DB in the case in which you enter it, you cannot add e.g. 'test' if you have previously added 'TEST' or 'Test'. If you try that, you get a warning saying: sorry, authorization could not be added, it probably already exists. This is obviously related to the DB having the utf8_general_ci collation, so the relevant check returns a row regardless of whether you search for Test, test, or TEST... Having said that, some(?) python libraries distinguish arguments passed to authorizations in a case-sensitive way (probably because they work with IDs). For example, search_engine.collection_restricted_p('Test') is true, while search_engine.collection_restricted_p('test') is not. Furthermore, the entries in accARGUMENT are never(?) deleted, even if you delete the corresponding authorization. This means that if one for some reason enters a value for a keyword in this table, that value in all lower/capital letter combinations is reserved forever... I don't know if this is the desired behavior, but if not, you might either consider changing the collation (dangerous?) to utf8_bin for certain tables (accARGUMENT for example), or alternatively change certain functions (like collection_restricted_p) to be case insensitive. Although my problem is now solved by changing the string in the DB, I thought I should raise the issue in case you want to deal with it in the future :) Best regards, Theodoros
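The effect of the collation on the 'already exists' check can be reproduced in miniature with SQLite from the Python standard library. This is only an analogue, under the assumption that SQLite's NOCASE and BINARY collations behave for this ASCII example like MySQL's utf8_general_ci and utf8_bin; the table and column names are made up and are not the real accARGUMENT schema.

```python
import sqlite3


def find_argument(collation, stored="Test", wanted="test"):
    """Store one value, then look it up with different capitalisation.

    `collation` must be 'NOCASE' or 'BINARY'; it is applied to the
    column, so it also governs the equality test in the WHERE clause.
    """
    if collation not in ("NOCASE", "BINARY"):
        raise ValueError("unsupported collation: %r" % collation)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE arg (keyword TEXT, value TEXT COLLATE %s)" % collation
        )
        conn.execute("INSERT INTO arg VALUES ('collection', ?)", (stored,))
        query = "SELECT value FROM arg WHERE keyword = 'collection' AND value = ?"
        return conn.execute(query, (wanted,)).fetchall()
    finally:
        conn.close()
```

With the case-insensitive collation the lookup for 'test' finds the row holding 'Test', so a check of the 'does it exist already?' kind would refuse the new value; with the binary collation it finds nothing.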
Re: Record Editor Suggest values: ctrl-shift-a brings up Addon Manager in Firefox (windows)
It seems that the key combination is hardcoded in bibedit_keys.js, which might require a few changes in the way it's called inside bibedit_engine in order for one to be able to pass this parameter... (It's a pity because the code is so beautiful and concise :( ). ps. For now, I will change the value into something else, but i think it's an issue that eventually should be dealt with, since it affects a large percentage of non-linux users. (Btw, it seems that the same shortcut exists in Firefox for linux, but -at least in my installation- does NOT bring up the addon manager...)
MathJax/LATEX support for titles in hb,hd
Hello everyone, I'm trying to parse LaTeX strings in titles for the first time, so I changed the appropriate entries in the invenio-local configuration file (the MathJax source was already installed), ran inveniocfg --update-all, added latex_to_html=yes in the appropriate format template, and... nothing. LaTeX is not parsed. I even tried importing bibformat_utils in ipython and running e.g. print bibformat_utils.latex_to_html("monadic $\forall^{1}_{1}$ positive theory over large sets"), but I only get the original string back. What am I doing wrong? I KNOW I'm missing something, because this feature works properly on the CERN demo site (for the same entries that are not parsed on my local demo site). Any ideas would be welcome! Best regards, Theodoros
How to accessing attached files/temporary marcxml record before approving a record?
Hello everyone, When a document is in the pending directory (i.e. is waiting for approval), how can a librarian access the attached files and/or the dummy MARCXML that is produced in the form-submission phase, in order to verify that everything is in order and proceed with the approval? The record is not yet integrated, so it seems logical that /record/xxx and /record/xxx/files are empty... (although the Send_Approval_Request websubmit function sends an approval email that points to the always-empty /record/xxx/files dir). I know I can access the files in the related pending dir for the record and include their contents in the approval mail, so I suppose I could also attach the whole (parsed?) MARCXML to the mail, but I was looking for a better solution... Hmmm... What mechanism are you using at CERN for your librarians who handle approval requests? Any ideas/comments would be welcome! Best regards, Theodoros
Re: MathJax/LATEX support for titles in hb,hd
What is the result of not using this latex_to_html functionality? The same unparsed formula. I even tried importing bibformat_utils in ipython and running e.g. print bibformat_utils.latex_to_html("monadic $\forall^{1}_{1}$ positive theory over large sets"), but I only get the original string. That's good :-) In fact, MathJax expects to find the original LaTeX formulas and renders them on the fly using JavaScript in the client's browser. Check, e.g., the source HTML of: http://invenio-demo.cern.ch/record/80?ln=en As you will see, the formulas are nicely there. I know! That's why I'm puzzled! How have you installed the MathJax sources? Have you correctly run $ sudo -u www-data make install-mathjax-plugin Yes! I just ran it again just in case. No difference. Hmm, maybe a matter of permissions? Apart from /opt/invenio/var/www/MathJax/config/MathJax.js (which is apache:apache), the rest are root:root. Having said that, there are no related warnings/errors in the Apache log file... Do you have any errors in the JavaScript console of your browser? No. I already checked that before sending the first email. Any other ideas? I don't know what else to check :(
Re: How to accessing attached files/temporary marcxml record before approving a record?
On Monday, 26 November 2012 2:43:49 PM, Samuele Kaplun wrote: [...] So we basically don't use at all the pattern given by WebSubmit, Move_to_Pending, Move_from_Pending. The drawback of this approach is that a recid is allocated even in case the document is then rejected but this is a small price to pay. Thank you Sam for all these details! They are indeed very interesting and I'm so sorry to learn them only now[1]... Over the last few weeks I had to create 14 new doctypes with many pages/new custom elements/websubmit_functions that all use the good old APP action workflow. I'm afraid to touch the process again for fear that I may break things (and you know how 'fragile' things are in the websubmit module!) Having said that, the workflow you describe is way better and more flexible than the old one! Losing a recid number is nothing much compared to what you get in return! Cataloguers/referees can then approve/amend the document using a dedicated submission workflow that will read the document from the system, and at the end will simply update the record to reflect the final collection tag in 980. Hmmm... Are these websubmit functions publicly available, or are they 'tailored' only for the CERN library? I'm eager to test them out (when I find some spare time...) Keep up the good work, Theodoros -- [1] I had an indication of this because when I tried to access a newly created but not yet approved record, I got a nice big RESTRICTED sign
Re: How to accessing attached files/temporary marcxml record before approving a record?
In principle it could be nice to have one websubmit function that take one parameter which would be the new value to be put in 980, and having such submission function taking care of preparing marcxml and triggering bibupload, etc. I don't know the internals of your functions, but as I understand it, there could be a workflow where SBI creates the MARCXML as usual and puts the correct collection in your 980 (at least my implementation takes doctype and categformat and puts them into 980). From there, in order to approve it you only have to remove the restriction bit (which, I suppose, is easily done either with direct access to the appropriate table or, even better, with a temp .xml file and bibupload) and make a change in the sbmAPPROVAL table... Even if your workflow initially puts all pending docs in a 'hidden' collection, you could put the correct doctype in another local field (let's say 989), and then, when you approve it, just create a temp .xml file and use bibupload in 'correct' mode to replace the old 980 with the value(s) from 989. Another similar approach would be for the approve function to remove a 'magic' string (in the already complete record) similar to DELETED. While that string is present, the record is not displayed/counted in the respective collections and is accessible only to the librarians who have the 'approve' right for that collection... This way the approve function can be very generic and no parameter passing (for the collection code) is needed. Just my two cents :) Cheers, Theodoros
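As a rough illustration of the 'temp .xml file' idea in this message, the sketch below builds a minimal MARCXML record that carries only the record ID (001) and a new 980 $a value. The function name build_correction_marcxml and the sample values are hypothetical; how the resulting file is handed to bibupload is left out on purpose.

```python
import xml.etree.ElementTree as ET


def build_correction_marcxml(recid, collection):
    """Return a minimal MARCXML record with 001 and a single 980 $a."""
    record = ET.Element("record")

    # Controlfield 001 identifies the record to be corrected.
    controlfield = ET.SubElement(record, "controlfield", {"tag": "001"})
    controlfield.text = str(recid)

    # Datafield 980 $a carries the final collection value.
    datafield = ET.SubElement(
        record, "datafield", {"tag": "980", "ind1": " ", "ind2": " "}
    )
    subfield = ET.SubElement(datafield, "subfield", {"code": "a"})
    subfield.text = collection

    return ET.tostring(record, encoding="unicode")
```

In the workflow suggested above, the collection value would be read from the local field (989 in the example) of the pending record rather than passed in by hand.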
Re: MathJax/LATEX support for titles in hb,hd
Do you have a public URL where we can check? Is there any reference for MathJax in the HEAD part of your pages? Could it be that you are working with a custom template which is not including the %(metaheaderadd)s placemark (where MathJax would be usually included)... Indeed, there are meta entries [1] in my source that are similar to what you have on your demo site, BUT in my case the <script src='/MathJax/MathJax.js' type='text/javascript'></script> tag is missing! I'm now debugging to see why. I'll update the thread when I find the reason. Best regards, Theodoros [1] BTW, the Default HTML meta.bft is 'Not OK' because format elements 'META_OPENGRAPH_VIDEO' and 'META_OPENGRAPH_IMAGE' use unknown parameter protocol
Re: MathJax/LATEX support for titles in hb,hd
Found it! I had
CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = hd, hb
instead of
CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS = hd,hb
The extra space after the comma messed things up, so the following code in search_engine.py was never executed:
    if of.lower() in CFG_WEBSEARCH_USE_MATHJAX_FOR_FORMATS:
        metaheaderadd = get_mathjax_header(req.is_https())
Next time I'll stick to the example! :) Best regards, Theodoros
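The effect of the stray space can be reproduced without Invenio. This is only a sketch of the general pitfall, not of how inveniocfg actually parses the option; parse_formats is a hypothetical helper showing one defensive way to read such a comma-separated value.

```python
def parse_formats(raw_value):
    """Split a comma-separated option and strip whitespace around each item."""
    return [item.strip().lower() for item in raw_value.split(",") if item.strip()]


# A plain split keeps the leading space, so 'hb' is not found in the list.
naive = "hd, hb".split(",")    # the second item is ' hb', not 'hb'
tolerant = parse_formats("hd, hb")
```

Here 'hb' in naive is False while 'hb' in tolerant is True, which matches the symptom: the membership test silently fails and the MathJax header is never added for that format.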
ReferenceError: edToolbar is not defined
Hello everyone, This error (seen in the Firefox error console), raised from bibformat_templates.py, appears when editing Format Templates and is present in several 1.0 and 1.1 versions, although the rest of the interface works as expected. It's only of minor importance, but if you can verify it and happen to clean up that part of the code, you could also take care of it... Best regards, Theodoros Theodoropoulos
Format template exception (SERVER_RETURN: 404)
Hello everyone, I keep getting (140 times and counting) the following exception every time I try to edit a format template, although the rest of the interface works fine... I cannot figure out why it appears... Any ideas? It happens for built-in templates as well as custom ones... BTW, this specific box runs a 3-week-old, pre-1.1 version of the master branch, but I tried with a latest maint-1.1 box and it appears there too.
* 2012-10-31 20:10:55 - SERVER_RETURN: 404 (webinterface_handler.py:427:_handler)
** User details
agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0
...
guest: 0
language: en
login_method: AUTH
nickname: 001000107412
precached_canseehiddenmarctags: True
precached_permitted_restricted_collections: []
precached_useadmin: True
precached_usealerts: True
precached_useapprove: True
precached_usebaskets: True
precached_usegroups: True
precached_useloans: True
precached_usemessages: True
precached_usepaperattribution: True
precached_usepaperclaim: True
precached_usestats: True
precached_viewclaimlink: False
precached_viewsubmissions: True
referer: https://snoopy.lib.auth.gr/admin/bibformat/bibformatadmin.py/format_template_show?bft=IKEEBOOKRVDetailed.bft&ln=en
remote_host:
remote_ip: 155.207.48.160
session: 7b8dade1060076d9856df1736fa8b8a0
uid: 1
uri: /admin/bibformat/js_quicktags.js?
** Traceback details
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py", line 462, in application
    ret = invenio_handler(req)
  File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 362, in _profiler
    return _handler(req)
  File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 427, in _handler
    raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
SERVER_RETURN: 404
** Stack frame details
Frame application in /usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py at line 472
---
469 admin_to_be_alerted = alert_admin_for_server_status_p(status,
470 req.headers_in.get('referer'))
471 if admin_to_be_alerted:
472 register_exception(req=req, alert_admin=True)
473 if not req.response_sent_p:
474 start_response(req.get_wsgi_status(), req.get_low_level_headers(), sys.exc_info())
475 return generate_error_page(req, admin_to_be_alerted)
---
status = '404'
start_response = '<built-in method start_response of mod_wsgi.Adapter object at 0x7f46b455e918>'
req = '<invenio.webinterface_handler_wsgi.SimulatedModPythonRequest object at 0x7f46b4127cd0>'
admin_to_be_alerted = 'True'
possible_handler = 'None'
possible_module = 'None'
...
Best regards, Theodoros Theodoropoulos
Wrong error msg in Submit tab when no submit role is connected to a user
Hello everyone, When a logged-in user has NO submit permissions (for any doctype) and clicks the submit button (on top), he gets: Account 'x...@yyy.zzz' is not yet activated. Try to login https://snoopy.lib.auth.gr/youraccount/login?referer=../submit with another account. This is misleading... He should get an error that says You are not authorized to perform this action. Probably an extra check should be performed for registered but not-yet-activated accounts... Can you verify it, or did I break something? Best regards, Theodoros
PS. The call is initiated in websubmit_webinterface.py:
[...]
if not at_least_one_submission_authorized and submission_exists:
    if isGuestUser(uid):
        return redirect_to_url(req, "%s/youraccount/login%s" % (
            CFG_SITE_SECURE_URL,
            make_canonical_urlargd({'referer' : CFG_SITE_SECURE_URL + req.unparsed_uri, 'ln' : args['ln']}, {})),
            norobot=True)
    else:
        return page_not_authorized(req, "../submit",  # this is executed
                                   uid=uid, navmenuid='submit')
return home(req,catalogues_text, c,ln)
and then page_not_authorized is called:
[...]
if res and res[0][0]:
    if text:
        body = text
    else:
        body = "%s %s" % (CFG_WEBACCESS_WARNING_MSGS[9] % cgi.escape(res[0][0]),  # this is executed
                          ("%s %s" % (CFG_WEBACCESS_MSGS[0] % urllib.quote(referer), CFG_WEBACCESS_MSGS[1])))
[...]
but from access_control_config:
CFG_WEBACCESS_WARNING_MSGS[9] = "Account '%s' is not yet activated."
Re: Exception handling when one changes CFG_SITE_NAME to a string that already exists in collections table
That's what I came up with (you should probably change it to match your way of doing things, but I tested it and it works for me):
--- /opt/invenio/lib/python/invenio/inveniocfg.py.orig 2012-10-28 12:54:58.007229495 +0200
+++ /opt/invenio/lib/python/invenio/inveniocfg.py 2012-10-28 13:05:24.055227141 +0200
@@ -412,7 +412,12 @@
         run_sql("INSERT INTO collection (id, name, dbquery, reclist) VALUES (1,%s,NULL,NULL)", (sitename,))
     except IntegrityError:
-        run_sql("UPDATE collection SET name=%s WHERE id=1", (sitename,))
+        try:
+            run_sql("UPDATE collection SET name=%s WHERE id=1", (sitename,))
+        except IntegrityError, errorcode:
+            if errorcode[0] == 1062: # Duplicate sitename
+                print "ERROR: %s already exists in collections! CFG_SITE_NAME NOT updated." %sitename
+                sys.exit(1)
     # reset CFG_SITE_NAME_INTL:
     for lang in conf.get("Invenio", "CFG_SITE_LANGS").split(","):
         sitename_lang = conf.get("Invenio", "CFG_SITE_NAME_INTL_" + lang)
I think it should be enough to put the check in inveniocfg... Unless someone messes with config.py directly (in which case, he should know what he's doing :) ). I see no other 'documented' way to change the sitename, so the uncaught exception in websearchadminlib.py should never appear (at least not because of this). Best regards, Theodoros Theodoropoulos
BibAuthority module??
Hello everyone, I've been reading very eagerly the -excellent- wiki page regarding authorities at https://twiki.cern.ch/twiki/bin/view/CDS/BibAuthority and I even created an xml file with personal details/affiliations etc for all our staff members, only to find out later that this module does not exist in my installation! I cannot find it in the repo either, so I was wondering if it is available to everyone, or if it was meant to be used only by INSPIRE... I remember from the workshop that INSPIRE uses some sort of Authorities for HEP names, institutions, conferences, and there are some screenshots in the aforementioned wiki regarding bibedit-using-authorities, so I suppose that this functionality has been implemented already :) The big question is: Is it safe to use in a production server, and if so, how can i merge it in my invenio (maint 1.1) branch? Thanks in advance, Theodoros Theodoropoulos
Exception handling when one changes CFG_SITE_NAME to a string that already exists in collections table
Hello everyone, Although it is quite rare, when one changes the SITE_NAME from inveniocfg and then tries to run, for example, WebSearch admin, he gets an exception saying:
2012-10-27 21:36:21 - IntegrityError: (1062, "Duplicate entry 'X' for key 'name'") (connections.py:36:defaulterrorhandler)
Indeed, websearchadminlib.py gets executed and tries to 'rebuild' the root collection:
if CFG_SITE_NAME != run_sql("SELECT name from collection WHERE id=1")[0][0]:
    res = run_sql("update collection set name=%s where id=1", (CFG_SITE_NAME, ))
But the update fails because 'name' is unique and the requested name already exists as a collection/subcollection. So the admin gets an exception. Ideally, one should either check the proposed name upon inveniocfg --reset-sitename (and give an error that the name already exists in another collection, and so it can't be used), or at least give a proper error message in websearchadminlib.py, or better, BOTH! :) If I find some time tomorrow, I'll propose a patch (it should be a matter of just 4-5 lines of code). Best regards, Theodoros
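The 'check the proposed name first' idea can be sketched as follows. To stay runnable on its own, the sketch uses an in-memory SQLite table instead of Invenio's run_sql and MySQL; the table layout, the sample names and both function names are assumptions made for illustration only.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with a UNIQUE name column and two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.executemany(
        "INSERT INTO collection (id, name) VALUES (?, ?)",
        [(1, "Atlantis"), (2, "Books")],
    )
    return conn


def rename_root_collection(conn, new_name):
    """Rename collection id=1; return False if another collection has the name."""
    taken = conn.execute(
        "SELECT id FROM collection WHERE name = ? AND id <> 1", (new_name,)
    ).fetchone()
    if taken is not None:
        return False  # the caller can print a clear error instead of crashing
    conn.execute("UPDATE collection SET name = ? WHERE id = 1", (new_name,))
    return True
```

Checking before the UPDATE lets the caller report 'name already in use' in plain words, instead of surfacing the raw duplicate-key exception to the admin.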
Translation status
Hello everyone, I was wondering how complete the various translation files for Invenio are. Is the http://cdsware.cern.ch/invenio/i18n/ page still used, or has another one replaced it? It's been a while since it was last updated... I suppose I could use 'msgfmt -v --statistics (po file)' to see for myself, but an update to the i18n page (at some point in the future, when other, more important issues have been resolved) would probably 'motivate' us to update our translations ;) Just my two cents, Theodoros
Re: Issue with websubmit_engine (when mbpages>1)
Hello Sam, We constantly use submission forms with multiple pages, and I can assure you that apart from this problem (and some other issues with custom JavaScript elements that don't get initialized with the previous value, which is more or less an expected consequence of my poor programming skills) moving back and forth has been working flawlessly for us for several years! So please do ticketize it and try to find a solution... :) Come to think of it, a quick workaround might require that the submission form writes the values of ALL elements, even empty ones... (although I'm confident that you'll come up with a more elegant solution!) Best regards, Theodoros
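The workaround suggested here (write every element, even empty ones) can be sketched in isolation. This is not WebSubmit code: save_form_values, load_form_value and the TITLE element are hypothetical, and a temporary directory stands in for the submission directory.

```python
import os
import tempfile


def save_form_values(submission_dir, form_values):
    """Write every element to its own file, including empty values."""
    for name, value in form_values.items():
        with open(os.path.join(submission_dir, name), "w") as handle:
            handle.write(value)  # an empty string truncates the old content


def load_form_value(submission_dir, name):
    """Read an element's stored value; a missing file counts as empty."""
    path = os.path.join(submission_dir, name)
    if not os.path.exists(path):
        return ""
    with open(path) as handle:
        return handle.read()


def demo():
    """Enter a value on page 1, clear it, then come back to page 1."""
    with tempfile.TemporaryDirectory() as submission_dir:
        save_form_values(submission_dir, {"TITLE": "Old title"})
        save_form_values(submission_dir, {"TITLE": ""})
        return load_form_value(submission_dir, "TITLE")
```

Because the empty value is written like any other, demo() returns an empty string rather than the stale 'Old title'.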
Typo in bibconvert help file
Hello everyone, I just found a typo in the bibconvert help text (-h) that seems important... One reads:
[...]
-Cs, alternative to -c when config split to several files, target
-Ct, alternative to -c when config split to several files, source
whereas I believe these last two entries should probably be:
-Cs, alternative to -c when config split to several files, source
-Ct, alternative to -c when config split to several files, target
because that description seems to fit the switches better... It exists in all branches (master, maint, etc.) and probably even in old releases. Best regards, Theodoros PS. Apologies if this was the intended use of the switches/descriptions. It just seemed odd :)
Issue with websubmit_engine (when mbpages>1)
Hello everyone, When submitting using the web forms, and only if a document has more than 1 page, the following happens when you navigate from one page to another: If an element already has a value[*] in pg1 and you edit it, then go to pg2, everything is OK :) If an element already has a value[*] in pg1 and you DELETE THE VALUE COMPLETELY, then go to pg2, and then BACK to pg1, you see the old value! This means that the FILE on disk that corresponds to this element is NOT updated IF the new value is empty (although the system checks of the page see the 'new', empty value). As a result, users are not able to 'empty' old values when moving from one page to another. They can only change them into something else! I can verify that it appears in a snapshot of the master branch taken from git 3 weeks ago, but I believe it appears in earlier/later releases too. Can you verify it? If yes, I think it's fairly important and needs fixing... Best regards, Theodoros Theodoropoulos [*] either by continuing a previously incomplete submission, or by entering a value, going to another page and then back
Re: Upload_Files response element possible issue
Nope :( I followed the instructions and I get the same(?) error:
===
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py", line 462, in application
[Fri Sep 07 13:24:19 2012] [error]     ret = invenio_handler(req)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 362, in _profiler
[Fri Sep 07 13:24:19 2012] [error]     return _handler(req)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 425, in _handler
[Fri Sep 07 13:24:19 2012] [error]     return root._traverse(req, path, False, guest_p)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 252, in _traverse
[Fri Sep 07 13:24:19 2012] [error]     result = _check_result(req, obj(req, form))
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1390, in index
[Fri Sep 07 13:24:19 2012] [error]     return _index(req, **args)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1386, in _index
[Fri Sep 07 13:24:19 2012] [error]     return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_engine.py", line 533, in interface
[Fri Sep 07 13:24:19 2012] [error]     exec co in the_globals
[Fri Sep 07 13:24:19 2012] [error]   File "<string>", line 71, in <module>
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_managedocfiles.py", line 514, in create_file_upload_interface
[Fri Sep 07 13:24:19 2012] [error]     bibrecdocs = BibRecDocs(recid)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/bibdocfile.py", line 570, in __init__
[Fri Sep 07 13:24:19 2012] [error]     self.build_bibdoc_list()
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/bibdocfile.py", line 685, in build_bibdoc_list
[Fri Sep 07 13:24:19 2012] [error]     status<>'DELETED' ORDER BY docname ASC, (self.id,))
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/dbquery.py", line 206, in run_sql
[Fri Sep 07 13:24:19 2012] [error]     rc = cur.execute(sql, param)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 175, in execute
[Fri Sep 07 13:24:19 2012] [error]     if not self._defer_warnings: self._warning_check()
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 89, in _warning_check
[Fri Sep 07 13:24:19 2012] [error]     warn(w[-1], self.Warning, 3)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/errorlib.py", line 511, in fun
[Fri Sep 07 13:24:19 2012] [error]     traceback.print_stack()
[Fri Sep 07 13:24:19 2012] [error] /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:175: Warning: Out of range value for column 'id_bibrec' at row 1
[Fri Sep 07 13:24:19 2012] [error]   if not self._defer_warnings: self._warning_check()
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py", line 462, in application
[Fri Sep 07 13:24:19 2012] [error]     ret = invenio_handler(req)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 362, in _profiler
[Fri Sep 07 13:24:19 2012] [error]     return _handler(req)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 425, in _handler
[Fri Sep 07 13:24:19 2012] [error]     return root._traverse(req, path, False, guest_p)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 252, in _traverse
[Fri Sep 07 13:24:19 2012] [error]     result = _check_result(req, obj(req, form))
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1390, in index
[Fri Sep 07 13:24:19 2012] [error]     return _index(req, **args)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1386, in _index
[Fri Sep 07 13:24:19 2012] [error]     return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage)
[Fri Sep 07 13:24:19 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_engine.py", line 533, in interface
[Fri Sep 07 13:24:19 2012] [error]     exec co in the_globals
[Fri Sep 07 13:24:19 2012] [error]   File "<string>", line 71, in <module>
[Fri Sep 07 13:24:19 2012] [error]   File
Upload_Files response element possible issue
Hello everyone, While experimenting with the Upload_Files response element (following help/admin/websubmit-admin-guide#5.4), I realised that while the default response element displays properly in a form (i.e. DEMOART), the following errors are produced in the Apache log, and thus the 'Upload' button is not functional:
=
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py", line 462, in application
[Thu Sep 06 13:21:39 2012] [error]     ret = invenio_handler(req)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 362, in _profiler
[Thu Sep 06 13:21:39 2012] [error]     return _handler(req)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 425, in _handler
[Thu Sep 06 13:21:39 2012] [error]     return root._traverse(req, path, False, guest_p)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 252, in _traverse
[Thu Sep 06 13:21:39 2012] [error]     result = _check_result(req, obj(req, form))
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1390, in index
[Thu Sep 06 13:21:39 2012] [error]     return _index(req, **args)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1386, in _index
[Thu Sep 06 13:21:39 2012] [error]     return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_engine.py", line 533, in interface
[Thu Sep 06 13:21:39 2012] [error]     exec co in the_globals
[Thu Sep 06 13:21:39 2012] [error]   File "<string>", line 71, in <module>
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_managedocfiles.py", line 514, in create_file_upload_interface
[Thu Sep 06 13:21:39 2012] [error]     bibrecdocs = BibRecDocs(recid)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/bibdocfile.py", line 570, in __init__
[Thu Sep 06 13:21:39 2012] [error]     self.build_bibdoc_list()
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/bibdocfile.py", line 685, in build_bibdoc_list
[Thu Sep 06 13:21:39 2012] [error]     status<>'DELETED' ORDER BY docname ASC, (self.id,))
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/dbquery.py", line 206, in run_sql
[Thu Sep 06 13:21:39 2012] [error]     rc = cur.execute(sql, param)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 175, in execute
[Thu Sep 06 13:21:39 2012] [error]     if not self._defer_warnings: self._warning_check()
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 89, in _warning_check
[Thu Sep 06 13:21:39 2012] [error]     warn(w[-1], self.Warning, 3)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/errorlib.py", line 511, in fun
[Thu Sep 06 13:21:39 2012] [error]     traceback.print_stack()
[Thu Sep 06 13:21:39 2012] [error] /usr/lib64/python2.6/site-packages/MySQLdb/cursors.py:175: Warning: Out of range value for column 'id_bibrec' at row 1
[Thu Sep 06 13:21:39 2012] [error]   if not self._defer_warnings: self._warning_check()
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler_wsgi.py", line 462, in application
[Thu Sep 06 13:21:39 2012] [error]     ret = invenio_handler(req)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 362, in _profiler
[Thu Sep 06 13:21:39 2012] [error]     return _handler(req)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 425, in _handler
[Thu Sep 06 13:21:39 2012] [error]     return root._traverse(req, path, False, guest_p)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/webinterface_handler.py", line 252, in _traverse
[Thu Sep 06 13:21:39 2012] [error]     result = _check_result(req, obj(req, form))
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1390, in index
[Thu Sep 06 13:21:39 2012] [error]     return _index(req, **args)
[Thu Sep 06 13:21:39 2012] [error]   File "/usr/lib64/python2.6/site-packages/invenio/websubmit_webinterface.py", line 1386, in _index
[Thu Sep 06 13:21:39 2012] [error]     return interface(req, c, ln, doctype, act, startPg, access, mainmenu, fromdir, nextPg, nbPg, curpage)
[Thu Sep 06 13:21:39 2012] [error]   File
TypeError: 'int' object is unsubscriptable exception
Hello Invenio team, I decided to fetch the latest master from git (there were a few changes during the last few days regarding author handling and I wanted to check them out), and followed the usual procedure. I noted that there had been a few changes/additions in the Makefile regarding the database since my last fetch/build, so I applied these line by line. So far, so good. But when I tried to run update-config-py, I got a warning about CFG_BATCHUPLOADER_WEB_ROBOT_AGENT, which had to be renamed to CFG_BATCHUPLOADER_WEB_ROBOT_AGENTS (fixed it), and then the following error:
# inveniocfg --update-config-py
Going to update config.py...
Traceback (most recent call last):
  File "/opt/invenio/bin/inveniocfg", line 33, in <module>
    main()
  File "/usr/lib64/python2.6/site-packages/invenio/inveniocfg.py", line 1320, in main
    cli_cmd_update_config_py(conf)
  File "/usr/lib64/python2.6/site-packages/invenio/inveniocfg.py", line 259, in cli_cmd_update_config_py
    line_out = convert_conf_option(option, conf.get(section, option))
  File "/usr/lib64/python2.6/site-packages/invenio/inveniocfg.py", line 146, in convert_conf_option
    option_value = option_value[1:-1]
TypeError: 'int' object is unsubscriptable
Is this a bug, or is it related to my variables in invenio-local.conf? Best regards, Theodoros
Re: TypeError: 'int' object is unsubscriptable exception
FYI, printing option_value gives me the following output:
{ '127.0.0.1': ['*'], # useful for testing
  '127.0.1.1': ['*'], # useful for testing
  '10.0.0.1': ['BOOK', 'REPORT'], # Example 1
  '10.0.0.2': ['POETRY', 'PREPRINT'], # Example 2
}
('txt', 'html', 'xml', 'odt', 'doc', 'docx', 'djvu', 'pdf', 'ps', 'ps.gz')
{ 'Poetry' : 'poem'}
{ 'global': ['INDEX-SYNONYM-TITLE', 'exact'], 'title': ['INDEX-SYNONYM-TITLE', 'exact'], }
{ '100__a': 2, '245__a': 4 }
{ 'title' : '[title]', 'title-author' : '[title] [author]', 'reportnumber' : 'reportnumber:[reportnumber]' }
{}
[ ('http(s)?://.*', {}), ]
{ 'marcxml': ('XOAIMARC', 'http://www.openarchives.org/OAI/1.1/dc.xsd', 'http://purl.org/dc/elements/1.1/'), 'oai_dc': ('XOAIDC', 'http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd', 'http://www.loc.gov/MARC21/slim'), }
{}
[]
{ 'Articles': ['506__d', '506__m'], }
{ 'Articles': '5061_a', 'Pictures': '5061_a', 'Theses': '5061_a', }
{ 'Articles': '562__c', 'Pictures': '562__c', }
{}
4
What seems suspicious is the last number (4)... How could this be generated? When I rename invenio-local.conf, everything works as expected... So I suppose it's something in this file (which was working as expected until now!). Any ideas how to narrow down the section/parameter that causes the error?
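One way to narrow down the offending parameter is to run the conversion option by option and record which ones fail. The sketch below is written for a current Python and does not import Invenio: find_failing_options is a hypothetical helper, and demo_convert is only a stand-in that mimics the failing slice from the traceback, not the real convert_conf_option.

```python
import configparser


def find_failing_options(conf_text, convert):
    """Return (section, option, error) for every option that convert() rejects."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names in their original case
    parser.read_string(conf_text)
    failures = []
    for section in parser.sections():
        for option, value in parser.items(section):
            try:
                convert(option, value)
            except Exception as err:  # report the culprit instead of aborting
                failures.append((section, option, repr(err)))
    return failures


def demo_convert(option, value):
    """Stand-in converter: turns digit strings into ints, then slices."""
    if value.isdigit():
        value = int(value)
    return value[1:-1]  # raises TypeError when value is an int
```

Feeding it a small configuration text in which one option holds a bare number returns a single entry naming that option, which is the kind of pointer asked for at the end of this message.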
Re: TypeError: 'int' object is unsubscriptable exception [fixed]
It seems that the error was produced by this parameter in invenio-local.conf: CFG_WEBSEARCH_FULLTEXT_SNIPPETS = 4 Reading the invenio.conf, i see that the usage is quite different... :blush: I removed it from invenio-local, and everything works... Apologies for the false alarm! Theodoros
Re: (sitename)/help/admin (and other cached help pages) display old CFG_SITE_URL entries after inveniocfg --reset-sitename
(note that you refer to sitename for CFG_SITE_URL, while sitename would rather apply to CFG_SITE_NAME and CFG_SITE_NAME_INTL*, Just wanted to point that out, as 'inveniocfg --reset-sitename' will account for site name change, not URL change) You're right. Indeed, in my description I mixed those two parameters (btw, I changed them both), but my main problem was with the links (related to CFG_SITE_URL). Thanks Jerome for your time, Theodoros
Re: websubmit_dump script?
Thank you Jerome for the file and the warnings/recommendations. I'm playing with the various parameters, and it will definitely save me a lot of work! Cheers, Theodoros ps. BTW, I'm thinking of transferring ALL the custom elements (sbmFIELDDESC table) and custom functions (sbmFUNDESC table) from the old system to the test one BEFORE copying the doctype details. This will probably require stripping the relevant lines from the output of the dump script, but since elements and functions are not directly connected to a particular doctype, I fear that I will get errors if several doctypes try to insert them several times. I will try and see how it behaves. For the other doctype-related tables you have very cleverly implemented a "delete from XXX where docname='ZZZ'" kind of functionality that will work when importing the same doctype over an existing one.
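To illustrate the concern, here is a small sketch using an in-memory SQLite database. It is not the websubmit_dump script and does not use the real WebSubmit schema: the tables doctype_rows and shared_elements and the function import_doctype are hypothetical stand-ins. It shows why delete-then-insert makes re-importing one doctype repeatable, and one possible way (skipping rows that already exist) to avoid duplicate-key errors for elements shared by several doctypes. Note that INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins: one per-doctype table, one shared table.
conn.execute("CREATE TABLE doctype_rows (docname TEXT, detail TEXT)")
conn.execute("CREATE TABLE shared_elements (name TEXT PRIMARY KEY, detail TEXT)")


def import_doctype(conn, docname, rows, elements):
    """Import one doctype so that repeating the import changes nothing."""
    with conn:  # one transaction: commit on success, roll back on error
        # Per-doctype rows: remove the old copy first, then insert.
        conn.execute("DELETE FROM doctype_rows WHERE docname = ?", (docname,))
        conn.executemany(
            "INSERT INTO doctype_rows VALUES (?, ?)",
            [(docname, detail) for detail in rows],
        )
        # Shared rows: skip those already present instead of failing.
        conn.executemany(
            "INSERT OR IGNORE INTO shared_elements VALUES (?, ?)", elements
        )


title_element = [("TITLE", "text field")]
import_doctype(conn, "ZZZ", ["page 1", "page 2"], title_element)
import_doctype(conn, "ZZZ", ["page 1", "page 2"], title_element)  # re-import
import_doctype(conn, "YYY", ["page 1"], title_element)  # shares TITLE

row_count = conn.execute("SELECT COUNT(*) FROM doctype_rows").fetchone()[0]
element_count = conn.execute("SELECT COUNT(*) FROM shared_elements").fetchone()[0]
print(row_count, element_count)
```

Loading all elements and functions once, before any doctype, as proposed above, sidesteps the same problem without touching the INSERT statements.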
websubmit_dump script?
Hello everyone, Looking at my notes from the Invenio Users Group workshop, I see a comment I made about a websubmit_dump script that dumps the submission tables for a specific doctype. Where could I find this script? (I don't think it is available in the latest master...) Do you believe that it could work with old (v0.99.x) installations as well? Oh, and is there an equivalent websubmit_restore script (one that takes into account the existing doctypes/elements/webaccess details/actions etc. and applies what's needed according to the dump file), or do I just import the output into the new db? Best regards, Theodoros
Re: Small bfe_fulltext.py error in parsing filenames when focus_on_main_file=no [Solved]
If so, we can amend it in this direction. Please tell me. Thanks Tibor for the info! In our production system, bfe_fulltext_mini.py is older than the one found in master (hence, there is no focus_on_main_files etc). Using a test 1.x system, with a subset of our full record set, everything looked OK with the 'mini' version. So, I believe that there is no reason for any of us to spend more time on that. My only comment would be that, for both fulltext bfes, when focus_on_main_files is off, the first file(s) in the list should be the Main one(s), and then the additional ones should follow, sorted separately by filename. But that is only a personal preference :) Generally speaking, it's good to know that the 'mini' version works even better with regard to versions etc. However, it seems that we will eventually have to create a custom bfe for our Institution, because we were asked to somehow display the licensing info (i.e. Creative Commons, etc) that the user has selected during the file upload step (a bit more customization here), next to each(!) file... Hmmm... Best regards, Theodoros
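The ordering preference described above can be sketched in a few lines. This is not code from bfe_fulltext.py; the function order_files and the (filename, is_main) tuple shape are hypothetical, chosen only to show the sort key.

```python
def order_files(files):
    """Sort (filename, is_main) tuples: main files first, then additional
    files, each group ordered by filename (case-insensitive)."""
    return sorted(files, key=lambda item: (not item[1], item[0].lower()))


files = [("b_appendix.pdf", False), ("thesis.pdf", True), ("a_slides.pdf", False)]
for filename, is_main in order_files(files):
    print(filename, "(main)" if is_main else "(additional)")
```

The key relies on False sorting before True, so negating is_main puts the main files at the front of the list.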
Virus scan during file submission
Dear Invenio devs, I was recently asked to think of a way to implement a virus scan procedure during the file submission step. Come to think of it, it's not a bad idea... Of course, it does not apply to PDF and GIF files, but it could come in useful for ZIP/RAR/EXE/MSOffice files... With that in mind, I have the following comments/questions:
- There are some free antivirus packages for Linux. AVG (http://free.avg.com/us-en/download.prd-alf) is one of them, but I have never tried it. Do you have any better suggestions?
- The chosen antivirus program should have a CLI that takes the file(s) in question and replies with a code that determines whether the file is infected, suspicious, clean etc.
- Some websubmit function (probably Create_Upload_Files_Interface.py ??) must be modified to check the files-to-be-uploaded and reject them, or warn the user accordingly.
- Maybe, in addition to this, there could be a scheduled task that would periodically scan /opt/invenio/var/data/files/ (say once per day), and run a bibdocfile --delete for the definitely infected files (probably also sending a warning email to the admin and/or original submitter), and just issue a warning for the suspicious ones.
What do you think? Do you have a similar procedure at CERN? If not, do you know of any Invenio installation that incorporates it? If not, would you be interested in implementing it? I think I could contribute (with ideas, tests and maybe some very basic code) :) Best regards, Theodoros Theodoropoulos
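As a starting point for the CLI idea, here is a rough sketch of a wrapper that runs a command-line scanner on one file and maps its exit code to a verdict. It assumes ClamAV's clamscan, which is not mentioned above and is only one possible choice; clamscan documents exit code 0 for no virus found, 1 for virus(es) found and 2 for errors. The names scan_file and classify_exit_code are hypothetical, nothing here is wired into WebSubmit, and a separate "suspicious" verdict would depend on what the chosen scanner reports.

```python
import subprocess
import sys

# Exit codes documented by ClamAV's clamscan (assumed scanner, see above):
# 0 = no virus found, 1 = virus(es) found, 2 = some error occurred.
EXIT_CODE_MEANING = {0: "clean", 1: "infected"}


def classify_exit_code(code):
    """Map a scanner exit code to 'clean', 'infected' or 'error'."""
    return EXIT_CODE_MEANING.get(code, "error")


def scan_file(path, command=("clamscan", "--no-summary")):
    """Run the scanner on one file and return its verdict as a string.

    'command' is a parameter so that another scanner (or a fake one, for
    testing) can be substituted without changing the function."""
    try:
        result = subprocess.run(
            list(command) + [path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Scanner not installed or not executable: treat as an error,
        # never as a clean result.
        return "error"
    return classify_exit_code(result.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    for filename in sys.argv[1:]:
        print(filename, scan_file(filename))
```

An upload function could then reject the file when the verdict is "infected", and warn or retry when it is "error", rather than silently accepting files it was unable to scan.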